Given an expression matrix whose rows are expected to be named by HGNC symbols, find the symbols that are not official HGNC symbols and correct them where possible. Return the expression matrix with the corrected symbols.
fix_bad_hgnc_symbols(exp, dropNonHGNC = FALSE, as_sparse = TRUE, verbose = TRUE)
exp | An expression matrix whose row names are HGNC symbols, or a SingleCellExperiment (SCE) or other Ranged Summarized Experiment (SE) type object. |
---|---|
dropNonHGNC | Boolean. Should symbols not recognised as HGNC symbols be dropped? |
as_sparse | Boolean. Convert the expression matrix to a sparse matrix. |
verbose | Print messages. |
Returns the expression matrix with the row names corrected and rows representing the same gene merged. If a SingleCellExperiment (SCE) or other Ranged Summarized Experiment (SE) type object is supplied, that object is returned with the corrected expression matrix stored under counts.
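To make the merging of rows that represent the same gene concrete, here is a minimal base R sketch. It is not the package's implementation: the toy matrix is hypothetical, and combining duplicates by summation is an assumption, since the documentation does not say how merged rows are aggregated.

```r
# Hypothetical toy matrix in which two rows carry the same (corrected) symbol.
exp <- matrix(
  1:12,
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MT-ND1", "MARCH8", "MARCH8"), paste0("cell", 1:4))
)

# Assumption: rows sharing a symbol are combined by summing their values.
# rowsum() is base R; it returns one row per unique group label.
merged <- rowsum(exp, group = rownames(exp))

print(merged)
```

After this step each symbol appears once, so downstream code can index the matrix by gene name without ambiguity.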
# Create an example expression matrix (could be part of an exp, annot list object)
exp <- matrix(data = runif(70), ncol = 10)
# Add HGNC gene names, but include one error:
# MARCH8 is an HGNC symbol which, if opened in Excel, is converted to Mar-08
rownames(exp) <- c("MT-TF", "MT-RNR1", "MT-TV", "MT-RNR2", "MT-TL1", "MT-ND1", "Mar-08")
exp <- fix_bad_hgnc_symbols(exp)
#> Warning: Possible corruption of gene names by excel: Mar-08
#> Warning: Human gene symbols should be all upper-case except for the 'orf' in open reading frames. The case of some letters was corrected.
#> Warning: x contains non-approved gene symbols
#> Warning: Human gene symbols should be all upper-case except for the 'orf' in open reading frames. The case of some letters was corrected.
#> Warning: x contains non-approved gene symbols
# fix_bad_hgnc_symbols warns the user of this possible issue
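The Excel warning above concerns symbols that have been turned into date-like strings. The following base R sketch shows one way such names could be flagged before calling the function. The helper `looks_like_excel_date` is hypothetical and not part of the package, and the pattern is an assumption about what Excel-mangled symbols look like; the package's own check may differ.

```r
# Hypothetical helper (not part of the package): flag row names that look like
# Excel-converted dates, such as "Mar-08" or "1-Sep".
looks_like_excel_date <- function(symbols) {
  # month.abb is the base R constant of abbreviated English month names.
  months <- paste(month.abb, collapse = "|")
  pattern <- paste0(
    "^((", months, ")-[0-9]{1,2}|[0-9]{1,2}-(", months, "))$"
  )
  grepl(pattern, symbols, ignore.case = TRUE)
}

symbols <- c("MT-TF", "MT-RNR1", "Mar-08", "1-Sep", "DEC1")
suspect <- symbols[looks_like_excel_date(symbols)]
print(suspect)
```

A check like this only identifies suspicious names; mapping them back to the intended symbols is left to fix_bad_hgnc_symbols.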