Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: faers
Title: R interface for FDA Adverse Event Reporting System
Version: 1.5.4
Version: 1.5.5
Authors@R:
c(
person("Yun", "Peng", , "yunyunp96@163.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-2801-3332")),
Expand Down
252 changes: 56 additions & 196 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -28,21 +28,27 @@ cat("<!-- README.md is generated from README.Rmd. Please edit that file -->", se
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/WangLabCSU/faers)
<!-- badges: end -->

The FDA Adverse Event Reporting System (FAERS) stands as a database dedicated to
the monitoring of post-marketing drug safety and exercises a notable influence
over FDA safety guidance documents, including the modification of drug labels.
The quantity of cases stored within FAERS has experienced an exponential surge
due to the refinement of submission techniques and adherence to standardized
data protocols, making it a pivotal asset for the realm of regulatory science.
While FAERS has predominantly focused on safety signal detection, the faers
package acts as the intermediary, seamlessly bridging the gap between the FAERS
database and the programming language R. Moreover, the faers package provides a
unified methodology for the seamless execution of pharmacovigilance analysis,
facilitating the integration of genetic tools in R. With an ultimate ambition
towards precision medicine, it aspires to scrutinize the vast expanse of the
human genome, revealing drug pathways that may be intricately tied to
potentially functional, population-differentiated polymorphisms.
Modern biologics, such as immune checkpoint inhibitors, exhibit complex toxicity profiles that are often underrepresented in pre-market clinical trials. While the FAERS database serves as a critical resource for real-world safety surveillance, its intricate relational structure and data inconsistencies pose significant barriers to large-scale epidemiological analyses.

To address these challenges, we developed `faers`, an end-to-end, reproducible framework for precision pharmacovigilance. The package streamlines the entire workflow—from raw data acquisition and rigorous preprocessing to signal detection—empowering researchers to transform vast spontaneous reporting data into actionable clinical insights.

<p align="center">
<img src="man/figures/workflow.png" width="80%" alt="faers Analysis Workflow">
</p>

## Key Features

- 📥 **Data Acquisition**: Automated downloading and parsing of FAERS quarterly data (supporting both ASCII and XML formats).

- 🛠️ **Rigorous Preprocessing**:Multi-quarter data merging and deduplication logic for data quality control.

- 🔍 **Terminology Standardization**: Integration with MedDRA and RxNorm for precise mapping of drugs and adverse events.

- 📊 **Advanced Signal Detection**: Comprehensive disproportionality analysis, including ROR, PRR, BCPNN, and EBGM.

- ⚡ **High-Performance Computing**: BiocParallel integration for memory-efficient, parallelized processing.

- 🌐 **Knowledge Integration**: Support for Athena drug vocabularies and Standardized MedDRA Queries (SMQ) for mechanism-driven research.

## Installation
To install from Bioconductor, use the following code:
Expand All @@ -68,204 +74,58 @@ if (!requireNamespace("pak")) {
pak::pkg_install("WangLabCSU/faers")
```

## Pharmacovigilance Analysis using FAERS

FAERS is a database for the spontaneous reporting of adverse events and
medication errors involving human drugs and therapeutic biological products.
This package accelarate the process of Pharmacovigilance Analysis using FAERS.
## Quick Start

```{r setup}
library(faers)
```

### Check metadata of FAERS
This will return a data.table reporting years, period, quarter, and file urls
and file sizes. By default, this will use the cached file in
`` tools::R_user_dir("`r faers:::pkg_nm()`", "cache") ``.
If it doesn't exist, the internal will parse metadata in
<`r sprintf("%s/extensions/FPD-QDE-FAERS/FPD-QDE-FAERS.html",
faers:::fda_host("fis"))`>

```{r eval=FALSE}
faers_meta()
```
The faers package provides a standardized pipeline that unifies complex pharmacovigilance workflows. For a comprehensive, step-by-step demonstration—including data acquisition and a complete Insulin case study—please refer to our detailed documentation:

An metadata copy was associated with the package, just set `internal = TRUE`.
```{r}
faers_meta(internal = TRUE)
```
The following workflow demonstrates how to perform a basic pharmacovigilance analysis for **Aspirin** using `faers`.

### Download and Parse quarterly data files from FAERS
The FAERS Quarterly Data files contain raw data extracted from the AERS database
for the indicated time ranges. The quarterly data files, which are available in ASCII or SGML formats, include:

- `demo`: demographic and administrative information
- `drug`: drug information from the case reports
- `reac`: reaction information from the reports
- `outc`: patient outcome information from the reports
- `rpsr`: information on the source of the reports
- `ther`: drug therapy start dates and end dates for the reported drugs
- `indi`: contains all "Medical Dictionary for Regulatory Activities" (MedDRA)
terms coded for the indications for use (diagnoses) for the reported drugs

Generally, we can use `faers()` function to download and parse all quarterly
data files from FAERS. Internally, the `faers()` function seamlessly utilizes
`faers_download()` and `faers_parse()` to preprocess each quarterly data file
from the FAERS repository. The default `format` was `ascii` and will return a
`FAERSascii` object. (xml format would also be okay , but presently, the XML
file receives only minimal support in the following process.)

Some variables has been added into specific field. See `?faers_parse` for
details.

```{r}
# Please make sure to replace dir with your own directory path, as the file
# included in the package is a sampled version.
data1 <- faers(2004, "q1",
dir = system.file("extdata", package = "faers"),
compress_dir = tempdir()
)
data1
```
```r
library(faers)

Furthermore, in cases where multiple quarterly data files are requisite, the
`faers_combine()` function is judiciously employed.
```{r}
data2 <- faers(c(2004, 2017), c("q1", "q2"),
dir = system.file("extdata", package = "faers"),
compress_dir = tempdir()
)
data2
```
# 1. Download and Parse Data (2023 Q1-Q2)
# Note: Ensure you have enough disk space in the target directory
data <- faers(2023, c("q1", "q2"), dir = "./faers_data")

You can use `faers_get()` to get specific field data, a data.table will be
returned.
```{r}
faers_get(data2, "demo")
```
# 2. Standardization (Requires MedDRA dictionary)
data_stand <- faers_standardize(data, meddra_path = "path/to/MedDRA")

### Standardize and De-duplication
The `reac` file provides the adverse drug reactions, where it includes the
“P.T.” field or the “Preferred Term” level terminology from the Medical
Dictionary for Regulatory Activities (MedDRA). The `indi` file contains the drug
indications, which also uses the “P.T.” level of MedDRA as a descriptor for the
drug indication. In this way, `MedDRA` was necessary to standardize this field
and add additional informations, such as `System Organ Classes`.
# 3. Deduplication (Requires Standardized data)
data_dedup <- faers_dedup(data_stand)

```{r, eval=FALSE}
# you must replace `meddra_path` with the path of uncompressed meddra data
data <- faers_standardize(data2, meddra_path)
# 4.Signal Detection (Data screening for items of interest is needed, such as "aspirin".)
results <- faers_phv_signal(
faers_filter(data_dedup, .fn = ~ drugname == "aspirin"),
.full = data_dedup
)
```

To proceed following steps, we just read a standardized data.
```{r}
data <- readRDS(system.file("extdata", "standardized_data.rds", package = "faers"))
data
```
## Documentation
The official documentation provides comprehensive guides for both clinical researchers and bioinformaticians.

The internal will save the complete MedDRA data in the `@meddra` slot, MedDRA
consists of two components: hierarchy and SMQ data. We can specify these
components using the use argument.
```{r}
faers_meddra(data)
faers_meddra(data, use = "hierarchy")
```
- 🌐 **[Official Website](https://MadDERt.github.io/faers/)**: The central portal for package overview and function references.

The internal will include a `meddra_hierarchy_idx` column that represents the
index of the MedDRA hierarchy data in the `indi` and `reac` field when
standardized. Additionally, the columns `meddra_hierarchy_from`, `meddra_code`,
and `meddra_pt` will also be added which provide standardized names of the
original PT (indi: indi_pt; reac: pt) (refer to `ASC_NTS.pdf` or `ASC_NTS.docx`
in the FAERS quarterly file for the meanings of the original names, most
original names will remain unchanged except for some names different between
FAERS quarterly files, see `?faers_parse` for details). We can retrieve this
data using the `faers_meddra()` function. When we use `faers_get()` to retrieve
`indi` or `reac` data from the standardized `FAERSascii` object, the meddra
hierarchy columns are automatically added to the returned data.table.
```{r}
faers_get(data, "indi")
```
- 🚀 **[Full Workflow Tutorial](https://MadDERt.github.io/faers/articles/full-workflow.html)**: A complete end-to-end analysis case study (e.g., Insulin-related adverse events).

```{r}
faers_get(data, "reac")
```
- 🧭 **[Getting Started: Portal](https://MadDERt.github.io/faers/articles/faers.html)**: Quick-start guide and roadmap for using the `faers` package.

One limitation of FAERS database is duplicate and incomplete reports. There are
many instances of duplicative reports and some reports do not contain all the
necessary information. We deemed two cases to be identical if they exhibited a
full concordance across drugs administered, and adverse reactions and but showed
discrepancies in one or none of the following fields: gender, age, reporting
country, event date, start date, and drug indications.
```{r}
data <- faers_dedup(data)
data
```
## Contributing

### Pharmacovigilance analysis
Pharmacovigilance is the science and activities relating to the detection,
assessment, understanding and prevention of adverse effects or any other
medicine/vaccine related problem.

To mine the signals of "insulin", we start by using the `faers_filter()`
function. In this function, the `.fn` argument should be a function that accepts
data specified in `.field`. It is important to note that `.fn` should always
return the `primaryid` that you want to keep.

To enhance our analysis, it would be advantageous to include all drug synonym names for `insulin`. These synonyms can be obtained by querying sources such as https://go.drugbank.com/ or alternative databases. Furthermore, we extract the brand names of insulin from the [Drugs@FDA](https://www.fda.gov/drugs/drug-approvals-and-databases/drugsfda-data-files) dataset, which can be easily obtained using the `fda_drugs()` function.
```{r}
insulin_names <- "insulin"
insulin_pattern <- paste(insulin_names, collapse = "|")
fda_insulin <- fda_drugs()[
grepl(insulin_pattern, ActiveIngredient, ignore.case = TRUE)
]
insulin_pattern <- paste0(
unique(tolower(c(insulin_names, fda_insulin$DrugName))),
collapse = "|"
)
insulin_data <- faers_filter(data, .fn = function(x) {
idx <- grepl(insulin_pattern, x$drugname, ignore.case = TRUE) |
grepl(insulin_pattern, x$prod_ai, ignore.case = TRUE)
x[idx, primaryid]
}, .field = "drug")
insulin_data
```
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.

Then, signal can be easily obtained with `faers_phv_signal()` which internally
use `faers_phv_table()` to create a contingency table and use `phv_signal()` to
do signal analysis specified in `.methods` argument. By default, all supported
signal analysis methods will be run, including "ror", "prr", "chisq",
"bcpnn_norm", "bcpnn_mcmc", "obsexp_shrink", "fisher", and "ebgm".

The most important argument for this function is `.object`, which should be a
de-duplicated FAERSascii object containing the data for the drugs or traits of
interest. Additionally, you must specify either `.full`, which represents the
background distributions data (usually the entire FAERS data), or you can
specify `.object2`, which should be the control data or another drug of interest
for comparison.

```{r, warning=FALSE}
insulin_signals <- faers_phv_signal(insulin_data,
.full = data,
BPPARAM = BiocParallel::SerialParam(RNGseed = 1L)
)
insulin_signals
```
1. **Report Bugs**: Submit an [issue](https://github.com/WangLabCSU/faers/issues) if you find any calculation errors or data parsing failures.

The column containing the events of interest can be specified using an atomic
character in the `.events` (default: "soc_name") argument. The combination of
all specified columns will define the unique event. Additionally, we can control
which field data to find the columns in the `.field` (default: "reac") argument.
2. **Feature Requests**: Have an idea for a new signal detection algorithm? Open an issue to discuss it.

```{r, warning=FALSE}
insulin_signals_hlgt <- faers_phv_signal(
insulin_data,
.events = "hlgt_name", .full = data,
BPPARAM = BiocParallel::SerialParam(RNGseed = 1L)
)
insulin_signals_hlgt
```
3. **Pull Requests**:

## sessionInfo
```{r}
sessionInfo()
```
- Fork the project.

- Create your Feature Branch (`git checkout -b feature/AmazingFeature`).

- Commit your changes (`git commit -m 'Add some AmazingFeature'`).

- Push to the Branch (`git push origin feature/AmazingFeature`).

- Open a Pull Request.
Loading
Loading