generate_celltype_data takes gene expression data and cell type annotations and creates CellTypeData (CTD) files, which contain matrices of mean expression and specificity per cell type.
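To make "mean expression and specificity per cell type" concrete, here is a minimal base R sketch on a toy matrix. It assumes that specificity is a gene's mean expression in one cell type divided by its summed mean expression across all cell types; the objects toy_exp and cell_types are hypothetical, and any normalization the function itself applies is not reproduced.

```r
# Toy expression matrix: 3 genes x 4 cells (values are made up for illustration)
toy_exp <- matrix(
  c(4, 6, 0, 0,
    1, 1, 1, 1,
    0, 0, 2, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), paste0("cell", 1:4))
)
# Hypothetical cell type label for each column of toy_exp
cell_types <- c("neuron", "neuron", "glia", "glia")

# Mean expression per cell type: average the columns belonging to each type
mean_exp <- sapply(unique(cell_types), function(ct) {
  rowMeans(toy_exp[, cell_types == ct, drop = FALSE])
})

# Assumed specificity: each gene's share of its total mean expression,
# so every row sums to 1
specificity <- mean_exp / rowSums(mean_exp)

print(mean_exp)
print(specificity)
```

In this toy case GeneA is expressed only in neurons, GeneC only in glia, and GeneB equally in both, so their specificity rows differ even though GeneB's absolute expression is low.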

generate_celltype_data(
  exp,
  annotLevels,
  groupName,
  no_cores = 1,
  savePath = tempdir(),
  file_prefix = "ctd",
  as_sparse = TRUE,
  as_DelayedArray = FALSE,
  normSpec = FALSE,
  convert_orths = FALSE,
  input_species = "mouse",
  output_species = "human",
  non121_strategy = "drop_both_species",
  force_new_file = TRUE,
  specificity_quantiles = TRUE,
  numberOfBins = 40,
  dendrograms = TRUE,
  return_ctd = FALSE,
  verbose = TRUE,
  ...
)

Arguments

exp

Numeric matrix with one row per gene and one column per cell. Row names are gene symbols. Column names are cell IDs, which can be cross-referenced against the annot data frame.

annotLevels

List of character vectors, each containing the cell type names associated with the columns of exp (one name per column).
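As an illustration of how exp and annotLevels relate, the sketch below builds a small toy input in base R and checks that each annotation level supplies exactly one label per column of the matrix. All names and values here (toy_exp, level1class, level2class, the cell type labels) are hypothetical, and the check is only a sanity test a user might run, not something the function is documented to do.

```r
set.seed(1)

# Toy count matrix: 5 genes (rows) x 6 cells (columns)
toy_exp <- matrix(
  rpois(5 * 6, lambda = 3),
  nrow = 5,
  dimnames = list(paste0("Gene", 1:5), paste0("cell", 1:6))
)

# One character vector per annotation level, ordered like the columns of toy_exp
annotLevels <- list(
  level1class = c("neuron", "neuron", "neuron", "glia", "glia", "glia"),
  level2class = c("excitatory", "excitatory", "inhibitory",
                  "astrocyte", "astrocyte", "microglia")
)

# Each level must label every cell exactly once
lengths_ok <- vapply(
  annotLevels,
  function(labels) length(labels) == ncol(toy_exp),
  logical(1)
)
stopifnot(all(lengths_ok))
```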

groupName

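A human-readable name for referring to the dataset. It is included in the saved file name (for example, groupName = "allKImouse" with the default file_prefix gives ctd_allKImouse.rda).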

no_cores

Number of cores to use to speed up the computation. NOTE: Use no_cores=1 when using this package on Windows.

savePath

Directory where the CTD file should be saved.

file_prefix

Prefix to add to saved CTD file name.

as_sparse

Convert exp to a sparse Matrix.

as_DelayedArray

Convert exp to DelayedArray.

normSpec

Boolean indicating whether specificity data should be transformed to a normal distribution by cell type, giving equivalent scores across all cell types.
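One common way to map values onto a normal distribution is a rank-based inverse normal transform. The base R sketch below applies it to made-up specificity values for a single cell type. Whether normSpec uses this exact transform (and this rank offset) is an assumption; the sketch only illustrates the general idea of putting each cell type's scores on a comparable scale.

```r
set.seed(7)

# Made-up, right-skewed specificity values for one cell type
spec_col <- rexp(50)
n <- length(spec_col)

# Convert ranks to probabilities strictly inside (0, 1);
# the 0.375 / 0.25 offsets are the Blom convention (an assumption here)
p <- (rank(spec_col) - 0.375) / (n + 0.25)

# Map the probabilities onto standard normal quantiles
spec_norm <- qnorm(p)

summary(spec_norm)
```

The transform preserves the ordering of genes within the cell type and changes only the shape of the distribution.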

convert_orths

If input_species!=output_species and convert_orths=TRUE, will drop genes without 1:1 output_species orthologs and then convert exp gene names to those of output_species.

input_species

The species that the exp dataset comes from.

output_species

Species to convert exp to (Default: "human").

non121_strategy

How to handle genes that don't have 1:1 mappings between input_species:output_species. Options include:

  • "drop_both_species" or "dbs" or 1 :
    Drop genes that have duplicate mappings in either the input_species or output_species
    (DEFAULT).

  • "drop_input_species" or "dis" or 2 :
    Only drop genes that have duplicate mappings in the input_species.

  • "drop_output_species" or "dos" or 3 :
    Only drop genes that have duplicate mappings in the output_species.

  • "keep_both_species" or "kbs" or 4 :
    Keep all genes regardless of whether they have duplicate mappings in either species.

  • "keep_popular" or "kp" or 5 :
    Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.

  • "sum","mean","median","min" or "max" :
    When gene_df is a matrix and gene_output="rownames", these options will aggregate many-to-one gene mappings (input_species-to-output_species) after dropping any duplicate genes in the output_species.
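The first three strategies can be sketched on a toy ortholog table in base R. The table, its column names, and the gene identifiers below are all hypothetical and do not reflect real orthology; the code only mimics the filtering rules as they are described above and is not the package's implementation.

```r
# Hypothetical ortholog map: one row per input gene -> output gene pairing
orth_map <- data.frame(
  input_gene  = c("in1", "in2", "in2", "in3", "in4", "in5"),
  output_gene = c("OUT1", "OUT2", "OUT3", "OUT4", "OUT4", "OUT5"),
  stringsAsFactors = FALSE
)

# Flag every row whose gene occurs more than once within a column
dup_in <- orth_map$input_gene %in%
  orth_map$input_gene[duplicated(orth_map$input_gene)]
dup_out <- orth_map$output_gene %in%
  orth_map$output_gene[duplicated(orth_map$output_gene)]

drop_both   <- orth_map[!(dup_in | dup_out), ]  # mimics "drop_both_species"
drop_input  <- orth_map[!dup_in, ]              # mimics "drop_input_species"
drop_output <- orth_map[!dup_out, ]             # mimics "drop_output_species"

print(drop_both)
```

Here in2 maps to two output genes and OUT4 is reached from two input genes, so only in1 and in5 survive the strictest strategy.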

force_new_file

If a file of the same name as the one being created already exists, overwrite it.

specificity_quantiles

Compute specificity quantiles. Recommended to set to TRUE.

numberOfBins

Number of quantile 'bins' to use (40 is recommended).
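As a rough illustration of quantile binning, the base R sketch below assigns made-up specificity values for one cell type to 40 quantile bins. The variable names are hypothetical, and the package's own binning may differ in detail (for example, in how zeros or tied values are handled).

```r
set.seed(42)

# Made-up specificity values for one cell type (200 genes)
spec <- runif(200)
numberOfBins <- 40

# Bin edges at evenly spaced quantiles: 0%, 2.5%, ..., 100%
breaks <- quantile(spec, probs = seq(0, 1, length.out = numberOfBins + 1))

# Integer bin index per gene; include.lowest keeps the minimum value in bin 1
bins <- cut(spec, breaks = breaks, include.lowest = TRUE, labels = FALSE)

table(bins)
```

Genes in bin 40 are the most specific to this cell type and genes in bin 1 the least, with a similar number of genes in each bin.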

dendrograms

Add dendrogram plots.

return_ctd

Return the CTD object in a list along with the file name, instead of just the file name.

verbose

Print messages.

...

Additional arguments passed to convert_orthologs.

Value

File names for the saved CellTypeData (CTD) files. If return_ctd=TRUE, a list containing the CTD object along with the file name.

Examples

# Load the single cell data
cortex_mrna <- ewceData::cortex_mrna()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
# Use only a subset to keep the example quick
expData <- cortex_mrna$exp[1:100, ]
l1 <- cortex_mrna$annot$level1class
l2 <- cortex_mrna$annot$level2class
annotLevels <- list(l1 = l1, l2 = l2)
fNames_ALLCELLS <- EWCE::generate_celltype_data(
  exp = expData,
  annotLevels = annotLevels,
  groupName = "allKImouse"
)
#> + 1 core(s) assigned as workers ( 63 reserved).
#> Converting to sparse matrix.
#> + Calculating normalized mean expression.
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> + Calculating normalized specificity.
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Loading required namespace: ggdendro
#> + Saving results ==> /tmp/Rtmp5DLuZU/ctd_allKImouse.rda
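Because the result is saved as an .rda file, it can be read back with base R's load(). The sketch below uses a stand-in object saved to a temporary file so that it runs without the package; the object name toy_ctd and its list structure (one element per annotation level, holding mean_exp and specificity matrices) are assumptions for illustration. In practice the path would be the file name returned by the function.

```r
# Stand-in for a saved CTD file (structure is assumed, values are made up)
toy_ctd <- list(
  list(mean_exp = matrix(1:4, nrow = 2), specificity = matrix(0.5, 2, 2))
)
toy_path <- file.path(tempdir(), "ctd_toy.rda")
save(toy_ctd, file = toy_path)

# Load into a fresh environment so nothing in the workspace is overwritten;
# load() returns the names of the objects it restored
ctd_env <- new.env()
loaded_names <- load(toy_path, envir = ctd_env)
loaded <- get(loaded_names[1], envir = ctd_env)

str(loaded, max.level = 2)
```

Retrieving the object by the name that load() reports avoids relying on what the saved object happens to be called inside the file.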