Source: R/generate_bootstrap_plots_for_transcriptome.r
generate_bootstrap_plots_for_transcriptome takes a gene list and a single cell type transcriptome dataset and generates plots which show how the expression of the genes in the list compares to that of genes in randomly generated gene lists.
generate_bootstrap_plots_for_transcriptome(
  sct_data,
  tt,
  thresh = 250,
  annotLevel = 1,
  reps,
  full_results = NA,
  listFileName = "",
  showGNameThresh = 25,
  ttSpecies = "mouse",
  sctSpecies = "mouse",
  sortBy = "t",
  onlySignif = TRUE,
  savePath = tempdir()
)
Argument | Description |
---|---|
sct_data | List generated using generate_celltype_data. |
tt | Differential expression table. Can be the output of the topTable function. The minimum requirement is that one column stores a metric of increased/decreased expression (e.g. log fold change or the t-statistic for differential expression) and another contains gene symbols. |
thresh | The number of up- and down-regulated genes to be included in each analysis (Default: 250). |
annotLevel | An integer indicating which level of the annotation to analyse (Default: 1). |
reps | Number of random gene lists to generate (Default: 100, but should be >=10,000 for publication-quality results). |
full_results | The full output of ewce_expression_data for the same gene list. |
listFileName | String used as the root for files saved using this function. |
showGNameThresh | Integer. If a gene has more than this percentage of its expression proportion in a cell type, the gene name is shown on the plot (Default: 25). |
ttSpecies | Either 'mouse' or 'human', depending on which species the differential expression table was generated from. |
sctSpecies | Either 'mouse' or 'human', depending on which species the single cell data was generated from. |
sortBy | Column name of the metric in tt which should be used to sort up- from down-regulated genes (Default: "t"). |
onlySignif | Should plots only be generated for cells which have significant changes? (Default: TRUE) |
savePath | Directory where the BootstrapPlots folder should be saved. The default is a temp directory. |
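To make the roles of tt, sortBy and thresh concrete, here is a minimal sketch in base R. It is an illustration under stated assumptions, not the package's internal code: the toy table, the gene symbol column name (HGNC.symbol) and the gene names are hypothetical.

```r
# Hypothetical toy differential expression table.
# ASSUMPTION: the column names "HGNC.symbol" and "t" are illustrative only.
set.seed(1)
tt <- data.frame(
    HGNC.symbol = paste0("GENE", 1:20),
    t = rnorm(20),
    stringsAsFactors = FALSE
)

sortBy <- "t" # metric used to separate up- from down-regulated genes
thresh <- 5   # number of genes kept in each direction

# Order from most up-regulated to most down-regulated
tt_sorted <- tt[order(tt[[sortBy]], decreasing = TRUE), ]

# Take the extremes of the ordering as the two gene lists
up_genes <- head(tt_sorted$HGNC.symbol, thresh)
down_genes <- tail(tt_sorted$HGNC.symbol, thresh)
```

With thresh much smaller than the number of rows, the two lists do not overlap. Each list would then be compared against randomly generated lists of the same size.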
Saves a set of PDF files containing graphs and returns the location where they are saved. The filenames are adjusted using the value of listFileName. The files are saved into the BootstrapPlots folder inside savePath.
Files start with one of the following:
qqplot_noText: sorts the gene list according to how enriched it is in the relevant cell type. Plots the value in the target list against the mean value in the bootstrapped lists.
qqplot_wtGSym: as above, but labels the gene symbols for the highest expressed genes.
bootDists: rather than just showing the mean of the bootstrapped lists, a boxplot shows the distribution of values.
bootDists_LOG: shows the bootstrapped distributions with the y-axis shown on a log scale.
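The quantities behind the qqplot and bootDists plots can be sketched with synthetic data in base R. This is a rough illustration of the idea described above, not the function's actual implementation: the values are random numbers standing in for per-gene expression proportions, and the output filename is hypothetical.

```r
set.seed(42)
n_genes <- 5 # size of the target gene list
reps <- 100  # number of bootstrapped gene lists

# ASSUMPTION: random values stand in for the proportion of each gene's
# expression found in the cell type of interest.
target <- runif(n_genes)
boot <- matrix(runif(reps * n_genes), nrow = reps) # one row per replicate

# Sort the target list and each bootstrapped list by enrichment
target_sorted <- sort(target)
boot_sorted <- t(apply(boot, 1, sort)) # reps x n_genes, each row sorted

# Mean value at each rank across the bootstrapped lists
boot_mean <- colMeans(boot_sorted)

# Hypothetical output file in a temp directory
pdf(file.path(tempdir(), "bootstrap_sketch.pdf"))
# qqplot-style: target value against mean bootstrapped value at each rank
plot(boot_mean, target_sorted,
    xlab = "Mean bootstrapped value", ylab = "Target list value"
)
abline(0, 1, lty = 2)
# bootDists-style: full bootstrapped distribution at each rank
boxplot(boot_sorted, xlab = "Rank", ylab = "Bootstrapped value")
points(seq_len(n_genes), target_sorted, pch = 19, col = "red")
dev.off()
```

Points lying above the dashed identity line correspond to ranks where the target list is more enriched than the bootstrapped lists are on average.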
if (FALSE) {
## Load the single cell data
ctd <- ewceData::ctd()

## Set the parameters for the analysis
## Use 3 bootstrap lists for speed, for publishable analysis use >10,000
reps <- 3
annotLevel <- 1 # <- Use cell level annotations (i.e. Interneurons)
## Use 5 up/down regulated genes (thresh) for speed, default is 250
thresh <- 5

## Load the top table
tt_alzh <- ewceData::tt_alzh()

tt_results <- EWCE::ewce_expression_data(
    sct_data = ctd,
    tt = tt_alzh,
    annotLevel = 1,
    thresh = thresh,
    reps = reps,
    ttSpecies = "human",
    sctSpecies = "mouse"
)

## Bootstrap significance test,
## no control for transcript length or GC content
full_results <- EWCE::generate_bootstrap_plots_for_transcriptome(
    sct_data = ctd,
    tt = tt_alzh,
    thresh = thresh,
    annotLevel = 1,
    full_results = tt_results,
    listFileName = "examples",
    reps = reps,
    ttSpecies = "human",
    sctSpecies = "mouse",
    savePath = tempdir()
)
}
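After a run like the one above, the saved PDFs could be inspected with base R. The sketch below assumes the plots sit in a BootstrapPlots folder under savePath, as the savePath description suggests. The helper classify_plot is hypothetical and simply matches each filename against the prefixes listed earlier.

```r
# Filename prefixes taken from the list of output files described above
prefixes <- c("qqplot_noText", "qqplot_wtGSym", "bootDists", "bootDists_LOG")

# Hypothetical helper: return the longest matching prefix, so that
# "bootDists_LOG" files are not mistaken for "bootDists" files.
classify_plot <- function(path) {
    hit <- prefixes[startsWith(basename(path), prefixes)]
    if (length(hit) == 0) {
        return(NA_character_)
    }
    hit[which.max(nchar(hit))]
}

# ASSUMPTION: plots are written to <savePath>/BootstrapPlots
plot_dir <- file.path(tempdir(), "BootstrapPlots")
pdfs <- list.files(plot_dir,
    pattern = "\\.pdf$", recursive = TRUE, full.names = TRUE
)

# Count how many files of each plot type were found
plot_types <- vapply(pdfs, classify_plot, character(1))
table(plot_types, useNA = "ifany")
```

If the folder does not exist yet, list.files returns an empty vector and the table is empty.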