Takes a CellTypeDataset (CTD), drops all genes without 1:1 orthologs in the output_species ("human" by default), converts the remaining genes to gene symbols, assigns names to each level, and converts all matrices to sparse matrices and/or DelayedArray.

standardise_ctd(
  ctd,
  dataset,
  input_species = NULL,
  output_species = "human",
  non121_strategy = "drop_both_species",
  force_new_quantiles = TRUE,
  remove_unlabeled_clusters = FALSE,
  numberOfBins = 40,
  keep_annot = TRUE,
  keep_plots = TRUE,
  as_sparse = TRUE,
  as_DelayedArray = FALSE,
  verbose = TRUE
)

Arguments

ctd

Input CellTypeData.

dataset

CellTypeData name.

input_species

Which species the gene names in exp come from.

output_species

Which species' gene names to convert exp to.

non121_strategy

How to handle genes that don't have 1:1 mappings between input_species:output_species. Options include:

  • "drop_both_species" or "dbs" or 1 :
    Drop genes that have duplicate mappings in either the input_species or output_species
    (DEFAULT).

  • "drop_input_species" or "dis" or 2 :
    Only drop genes that have duplicate mappings in the input_species.

  • "drop_output_species" or "dos" or 3 :
    Only drop genes that have duplicate mappings in the output_species.

  • "keep_both_species" or "kbs" or 4 :
    Keep all genes regardless of whether they have duplicate mappings in either species.

  • "keep_popular" or "kp" or 5 :
    Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.

  • "sum","mean","median","min" or "max" :
    When gene_df is a matrix and gene_output="rownames", these options will aggregate many-to-one gene mappings (input_species-to-output_species) after dropping any duplicate genes in the output_species.
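
The first four strategies can be illustrated on a toy ortholog table. This is only a sketch of the filtering idea in base R, not the package's internal code: the gene names, the column names (input_gene, ortholog_gene) and the helper is_dup are hypothetical, and it assumes "duplicate mappings in a species" means a gene from that species appearing in more than one row.

```r
# Toy ortholog map (hypothetical gene names, for illustration only).
map <- data.frame(
  input_gene    = c("A1", "A2", "B1", "C1", "C1"),
  ortholog_gene = c("A",  "A",  "B",  "C",  "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag every value that occurs more than once.
is_dup <- function(x) x %in% x[duplicated(x)]

dup_in  <- is_dup(map$input_gene)     # one input gene maps to several output genes
dup_out <- is_dup(map$ortholog_gene)  # several input genes map to one output gene

drop_both   <- map[!(dup_in | dup_out), ]  # analogous to "drop_both_species"
drop_input  <- map[!dup_in, ]              # analogous to "drop_input_species"
drop_output <- map[!dup_out, ]             # analogous to "drop_output_species"
keep_both   <- map                         # analogous to "keep_both_species"
```

Under these assumptions only the B1/B pair survives the strictest strategy, which is why "drop_both_species" returns the fewest genes.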

force_new_quantiles

If FALSE, quantile computation is skipped when quantiles have already been computed. Set to TRUE (the default shown in the usage above) to override this and generate new quantiles.

remove_unlabeled_clusters

Remove any samples that have numeric column names.
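
As a rough illustration of what "numeric column names" means, the base R sketch below drops the columns of a toy matrix whose names parse as numbers. The matrix and its column names are made up, and the package may detect unlabeled clusters differently.

```r
# Toy expression matrix with two labeled and two unlabeled (numeric) columns.
m <- matrix(
  1:8, nrow = 2,
  dimnames = list(c("g1", "g2"), c("Neuron", "1", "Astro", "23"))
)

# A column name counts as numeric here if as.numeric() can parse it.
numeric_name <- !is.na(suppressWarnings(as.numeric(colnames(m))))

# Keep only the labeled columns; drop = FALSE preserves the matrix shape.
m_labeled <- m[, !numeric_name, drop = FALSE]
colnames(m_labeled)
```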

numberOfBins

Number of non-zero quantile bins.
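
The idea of non-zero quantile bins can be sketched as follows: zeros are kept in bin 0 and only the positive values are split into numberOfBins quantile bins. The function bin_nonzero and its argument n_bins are hypothetical names used for illustration; this is a simplified sketch, not the package's implementation.

```r
# Hypothetical helper: bin the non-zero entries of x into n_bins quantile bins.
bin_nonzero <- function(x, n_bins = 40) {
  bins <- integer(length(x))  # zeros stay in bin 0
  nz <- x > 0
  breaks <- unique(stats::quantile(
    x[nz], probs = seq(0, 1, length.out = n_bins + 1)
  ))
  # With fewer than two distinct breaks, put all non-zero values in one bin
  # (a simplification chosen for this sketch).
  bins[nz] <- if (length(breaks) > 1) {
    as.integer(cut(x[nz], breaks = breaks, include.lowest = TRUE))
  } else {
    1L
  }
  bins
}

set.seed(1)
spec <- c(rep(0, 20), runif(80))  # toy specificity values
q <- bin_nonzero(spec, n_bins = 4)
table(q)
```

A small n_bins is used here only to keep the table readable; the documented default is 40.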

keep_annot

Keep the column annotation data if provided.

keep_plots

Keep the dendrograms if provided.

as_sparse

Convert to sparse matrix.

as_DelayedArray

Convert to DelayedArray.
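
For orientation, the sketch below shows what a dense-to-sparse conversion looks like with the Matrix package on a toy matrix. It assumes Matrix is installed and is only an illustration of the as_sparse idea; the function may perform the conversion differently, and the DelayedArray conversion is not shown.

```r
library(Matrix)  # assumed to be installed

# Toy dense matrix (genes x cell types) that is mostly zeros.
dense <- matrix(
  c(0, 0, 3,
    0, 5, 0),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("typeA", "typeB"))
)

# Store only the non-zero entries; row and column names are preserved.
sparse <- Matrix::Matrix(dense, sparse = TRUE)
class(sparse)
```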

verbose

Print messages.

Value

Standardised CellTypeDataset.

Examples

ctd <- ewceData::ctd()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
ctd_std <- standardise_ctd(
  ctd = ctd,
  input_species = "mouse",
  dataset = "Zeisel2016"
)
#> Standardising CellTypeDataset
#> Level: 1
#> Extracting mean_exp
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 7
#> Extracting specificity
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 7
#> Extracting specificity_quantiles
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 7
#> Level: 2
#> Extracting mean_exp
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 48
#> Extracting specificity
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 48
#> Extracting specificity_quantiles
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 48