Convert a CellTypeDataset into a standardized format

This function takes a CellTypeDataset (CTD), drops all genes without 1:1 orthologs in the `output_species` ("human" by default), converts the remaining genes to gene symbols, assigns names to each level, and converts all matrices to sparse matrices and/or DelayedArray.
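
To make the sparse conversion step concrete, here is a minimal sketch using the Matrix package on a toy expression matrix. The matrix contents and names (`dense`, `sparse`, the gene and cell-type labels) are invented for illustration, and the call shown is a generic way to build a sparse matrix, not necessarily the one this function uses internally.

```r
library(Matrix)  # recommended package shipped with standard R installations

# Toy dense expression matrix: 4 genes x 3 cell types, mostly zeros.
dense <- matrix(
    c(0, 0, 5,
      0, 2, 0,
      0, 0, 0,
      1, 0, 0),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("GENE", 1:4), c("astro", "micro", "neuron"))
)

# Convert to a compressed sparse representation: only non-zero entries are stored.
sparse <- Matrix::Matrix(dense, sparse = TRUE)

# Dimensions and dimnames are preserved; the number of stored non-zeros is small.
dim(sparse)
Matrix::nnzero(sparse)
```

Sparse storage pays off for expression data because most gene-by-cell-type entries are zero.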

Usage

standardise_ctd(
  ctd,
  dataset,
  input_species = NULL,
  output_species = "human",
  non121_strategy = "drop_both_species",
  force_new_quantiles = TRUE,
  remove_unlabeled_clusters = FALSE,
  numberOfBins = 40,
  keep_annot = TRUE,
  keep_plots = TRUE,
  as_sparse = TRUE,
  as_DelayedArray = FALSE,
  verbose = TRUE
)

Arguments

ctd	Input CellTypeData.
dataset	CellTypeData name.
input_species	Which species the gene names in the input `ctd` come from.
output_species	Which species' gene names to convert the input `ctd` to.
non121_strategy	How to handle genes that don't have 1:1 mappings between `input_species`:`output_species`. Options include:
	`"drop_both_species" or "dbs" or 1` : Drop genes that have duplicate mappings in either the `input_species` or `output_species` (DEFAULT).
	`"drop_input_species" or "dis" or 2` : Only drop genes that have duplicate mappings in the `input_species`.
	`"drop_output_species" or "dos" or 3` : Only drop genes that have duplicate mappings in the `output_species`.
	`"keep_both_species" or "kbs" or 4` : Keep all genes regardless of whether they have duplicate mappings in either species.
	`"keep_popular" or "kp" or 5` : Return only the most "popular" interspecies ortholog mappings. This procedure tends to yield a greater number of returned genes but at the cost of many of them not being true biological 1:1 orthologs.
	`"sum","mean","median","min" or "max"` : When `gene_df` is a matrix and `gene_output="rownames"`, these options will aggregate many-to-one gene mappings (`input_species`-to-`output_species`) after dropping any duplicate genes in the `output_species`.
force_new_quantiles	Quantile computation is normally skipped if quantiles have already been computed. Set to `TRUE` (the default shown in the usage above) to override this and generate new quantiles.
remove_unlabeled_clusters	Remove any samples that have numeric column names.
numberOfBins	Number of non-zero quantile bins.
keep_annot	Keep the column annotation data if provided.
keep_plots	Keep the dendrograms if provided.
as_sparse	Convert to sparse matrix.
as_DelayedArray	Convert to `DelayedArray`.
verbose	Print messages.
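
The first four `non121_strategy` options can be illustrated with a small base-R sketch. The ortholog table, the gene names, and the helper `filter_orthologs()` are hypothetical and exist only for this illustration; the sketch follows the option descriptions above and is not the package's own implementation. The `"keep_popular"` and aggregation options are not covered.

```r
# Hypothetical ortholog map: one row per (input gene, output gene) pair.
ortho <- data.frame(
    input_gene  = c("Aaa", "Bbb", "Bbb", "Ccc", "Ddd", "Eee"),
    output_gene = c("AAA", "BBB1", "BBB2", "CCC", "CCC", "EEE"),
    stringsAsFactors = FALSE
)

# Flag rows whose gene appears in more than one mapping on each side.
dup_in  <- ortho$input_gene  %in% ortho$input_gene[duplicated(ortho$input_gene)]
dup_out <- ortho$output_gene %in% ortho$output_gene[duplicated(ortho$output_gene)]

# Illustrative helper (an assumption, not part of the package).
filter_orthologs <- function(strategy) {
    keep <- switch(strategy,
        drop_both_species   = !dup_in & !dup_out,
        drop_input_species  = !dup_in,
        drop_output_species = !dup_out,
        keep_both_species   = rep(TRUE, nrow(ortho)),
        stop("strategy not covered by this sketch: ", strategy)
    )
    ortho[keep, , drop = FALSE]
}

# Only the strict 1:1 pairs (Aaa/AAA and Eee/EEE) survive the default strategy.
filter_orthologs("drop_both_species")
```

In this toy table, `Bbb` maps to two output genes (a duplicate on the input side) and `CCC` is the target of two input genes (a duplicate on the output side), so the stricter the strategy, the fewer rows remain.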

Value

Standardised CellTypeDataset.
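
As a rough guide to working with the returned object, the sketch below builds a mock CTD-like list and inspects it level by level. It assumes that each level is a list element holding `mean_exp`, `specificity`, and `specificity_quantiles` matrices (the names reported in the example log); the helper `mock_level()` and all data are invented for illustration.

```r
library(Matrix)

# Hypothetical helper: one CTD-like level with identical toy matrices in each slot.
mock_level <- function(n_celltypes, genes = paste0("GENE", 1:5)) {
    m <- matrix(
        as.numeric(seq_len(length(genes) * n_celltypes) %% 3),
        nrow = length(genes),
        dimnames = list(genes, paste0("celltype", seq_len(n_celltypes)))
    )
    sp <- Matrix::Matrix(m, sparse = TRUE)
    list(mean_exp = sp, specificity = sp, specificity_quantiles = sp)
}

# Two levels with the same genes but different numbers of cell types.
mock_ctd <- list(mock_level(2), mock_level(4))

# One row per level: number of genes, number of cell types.
level_dims <- t(vapply(mock_ctd, function(lvl) dim(lvl$mean_exp), integer(2)))
colnames(level_dims) <- c("genes", "celltypes")
level_dims

# After standardisation, every level is expected to share the same gene set.
same_genes <- identical(
    rownames(mock_ctd[[1]]$mean_exp),
    rownames(mock_ctd[[2]]$mean_exp)
)
```

The same pattern, applied to a real standardised CTD, is a quick way to confirm that all levels agree on the gene set and differ only in the number of cell types.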

Examples

ctd <- ewceData::ctd()
#> see ?ewceData and browseVignettes('ewceData') for documentation
#> loading from cache
ctd_std <- standardise_ctd(
    ctd = ctd,
    input_species = "mouse",
    dataset = "Zeisel2016"
)
#> Standardising CellTypeDataset
#> Level: 1
#> Extracting  mean_exp
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 7
#> Extracting  specificity
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 7
#> Extracting  specificity_quantiles
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 7
#> Level: 2
#> Extracting  mean_exp
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 48
#> Extracting  specificity
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 48
#> Extracting  specificity_quantiles
#> Converting to sparse matrix.
#> Converting to sparse matrix.
#> Matrix dimensions: 13243 x 48
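
The `specificity_quantiles` matrices reported in the log are controlled by `numberOfBins`, described as the number of non-zero quantile bins. One plausible reading is sketched below in base R: zeros stay in bin 0 and the remaining values are split into equally populated quantile bins. The function `bin_nonzero_quantiles()` and the toy vector are assumptions made for illustration and may differ from what the package computes.

```r
# Hypothetical helper: assign bin 0 to zeros and 1..n_bins to non-zero values.
bin_nonzero_quantiles <- function(x, n_bins = 40) {
    bins <- integer(length(x))   # zeros remain in bin 0
    nz <- x > 0
    if (!any(nz)) {
        return(bins)
    }
    # Quantile cut points computed on the non-zero values only.
    breaks <- unique(stats::quantile(
        x[nz],
        probs = seq(0, 1, length.out = n_bins + 1)
    ))
    if (length(breaks) < 2) {
        bins[nz] <- 1L           # all non-zero values are identical
        return(bins)
    }
    bins[nz] <- as.integer(cut(x[nz], breaks = breaks, include.lowest = TRUE))
    bins
}

# Toy specificity vector: 5 zeros followed by 20 random non-zero values.
set.seed(1)
spec <- c(rep(0, 5), runif(20))
bins <- bin_nonzero_quantiles(spec, n_bins = 4)
table(bins)
```

Keeping zeros in their own bin means that genes not expressed in a cell type never compete with expressed genes for quantile rank.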