R/bootstrap_enrichment_test.R
bootstrap_enrichment_test.Rd
bootstrap_enrichment_test
bootstrap_enrichment_test takes a gene list and a single-cell transcriptome
dataset and determines the probability of enrichment and the fold change
for each cell type.
bootstrap_enrichment_test(
  sct_data = NULL,
  hits = NULL,
  bg = NULL,
  genelistSpecies = NULL,
  sctSpecies = NULL,
  output_species = "human",
  reps = 100,
  annotLevel = 1,
  geneSizeControl = FALSE,
  controlledCT = NULL,
  mtc_method = "BH",
  sort_results = TRUE,
  verbose = TRUE
)
Argument | Description |
---|---|
sct_data | List generated using generate_celltype_data. |
hits | List of gene symbols containing the target gene list. Will automatically be converted to human gene symbols if |
bg | List of gene symbols containing the background gene list (including hit genes). If |
genelistSpecies | Species that |
sctSpecies | Species that |
output_species | Species to convert |
reps | Number of random gene lists to generate (Default: 100, but should be >=10,000 for publication-quality results). |
annotLevel | An integer indicating which level of |
geneSizeControl | Whether to control for GC content and transcript length. Recommended if the gene list originates from genetic studies (Default: FALSE). If set to |
controlledCT | [Optional] If set to the name of a cell type rather than NULL, the bootstrapping controls for expression within that cell type. |
mtc_method | Multiple-testing correction method (passed to p.adjust). |
sort_results | Sort enrichment results from smallest to largest p-value. |
verbose | Print messages. |
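Since mtc_method is passed to p.adjust, the effect of the default "BH" setting can be previewed with base R alone. The following is a minimal sketch, not EWCE output: the cell type names and raw p-values are made up for illustration.

```r
# Hypothetical raw p-values, one per cell type (illustrative values only)
raw_p <- c(
  astrocytes = 0.001,
  microglia = 0.02,
  neurons = 0.04,
  oligodendrocytes = 0.5
)

# Benjamini-Hochberg adjustment, the method named by mtc_method = "BH"
adj_p <- stats::p.adjust(raw_p, method = "BH")

print(data.frame(raw_p = raw_p, adj_p = adj_p))
```

Any other method name accepted by p.adjust (for example "bonferroni" or "holm") can be substituted in the same way.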
A list containing three elements:
results: data frame in which each row gives the statistics (p-value, fold change and number of standard deviations from the mean) associated with the enrichment of the stated cell type in the gene list.
hit.cells: vector containing the summed proportion of expression in each cell type for the target list.
bootstrap_data: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists.
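To make the relationship between the three returned elements concrete, here is a minimal base R sketch of how statistics like those in results could be derived from a hit.cells vector and a bootstrap_data matrix. It is an illustration under stated assumptions, not EWCE's internal code: the data are simulated, the names hit_cells and summary_df are hypothetical, and the p-value is assumed to be the fraction of random lists scoring at least as high as the target list.

```r
set.seed(1)

cell_types <- c("astrocytes", "microglia", "neurons")
n_reps <- 1000

# Simulated stand-in for bootstrap_data: one row per random gene list,
# one column per cell type
bootstrap_data <- matrix(
  rnorm(n_reps * length(cell_types), mean = 10, sd = 1),
  nrow = n_reps,
  dimnames = list(NULL, cell_types)
)

# Simulated stand-in for hit.cells: summed expression proportion per cell type
hit_cells <- c(astrocytes = 10.1, microglia = 14, neurons = 9.5)

boot_mean <- colMeans(bootstrap_data)
boot_sd <- apply(bootstrap_data, 2, stats::sd)

# Assumed empirical p-value: share of random lists with a score >= the target
p <- colMeans(sweep(bootstrap_data, 2, hit_cells, FUN = ">="))

# Fold change relative to the bootstrap mean, and distance from it in SDs
fold_change <- hit_cells / boot_mean
sd_from_mean <- (hit_cells - boot_mean) / boot_sd

summary_df <- data.frame(
  CellType = cell_types,
  p = p,
  fold_change = fold_change,
  sd_from_mean = sd_from_mean,
  q = stats::p.adjust(p, method = "BH")
)
print(summary_df)
```

In this toy setup only the microglia score sits far above its bootstrap distribution, so it is the only cell type with a small p-value and a fold change well above 1.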
# Load the cell type dataset
# (this step is missing from the extracted example; ewceData::ctd() is assumed here)
ctd <- ewceData::ctd()

# Set the parameters for the analysis
# Use 3 bootstrap lists for speed; for publishable analysis use >= 10,000
reps <- 3

# Load gene list from Alzheimer's disease GWAS
example_genelist <- ewceData::example_genelist()

# Bootstrap significance test, no control for transcript length or GC content
full_results <- EWCE::bootstrap_enrichment_test(
  sct_data = ctd,
  hits = example_genelist,
  reps = reps,
  annotLevel = 1,
  sctSpecies = "mouse",
  genelistSpecies = "human"
)
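The advice to use at least 10,000 random lists can be motivated with a small base R sketch. It assumes, as an illustration rather than a statement about EWCE internals, that the p-value is estimated as the fraction of random lists scoring at least as high as the target list, so the smallest non-zero value it can take is 1 / reps. The helper name min_nonzero_p is hypothetical.

```r
# Smallest non-zero empirical p-value obtainable from a given number of
# random lists (assumes p = fraction of random lists >= target score)
min_nonzero_p <- function(reps) 1 / reps

reps_grid <- c(3, 100, 10000)
resolution <- data.frame(
  reps = reps_grid,
  smallest_nonzero_p = min_nonzero_p(reps_grid)
)
print(resolution)

# With reps = 3, these are the only p-values that can occur
possible_p <- (0:3) / 3
print(possible_p)
```

Under this assumption, a run with reps = 3 is only suitable for checking that the code executes, while 10,000 random lists resolve p-values down to 1e-04.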