Building a pseudo-bulk expression matrix to speed up SingleR

Published 2022-10-04


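Some single-cell datasets from The Human Cell Atlas and CZ CELLxGENE carry cell type annotations in their metadata. Here we use the dataset downloaded earlier to build a reference set for SingleR. Averaging the cells of each annotated type into one pseudo-bulk profile shrinks the reference from one column per cell to one column per label, which is what speeds SingleR up.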
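Before the real data, the idea of a pseudo-bulk reference can be sketched on a toy matrix in base R. This is only a minimal illustration under stated assumptions: the matrix values, the cell labels, and the helper name `pseudo_bulk` are made up, and the final `log1p()` is our own choice for getting back to log scale, not something the pipeline below does.

```r
# Toy log-normalised matrix: 4 genes x 6 cells (values are made up).
set.seed(1)
logexpr <- matrix(log1p(rpois(24, lambda = 5)), nrow = 4,
                  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6)))
cell_type <- factor(c("T", "T", "B", "B", "B", "Mono"))

# Hypothetical helper: average each cell type in non-log space,
# then return to log scale.
pseudo_bulk <- function(logmat, groups) {
  groups <- droplevels(as.factor(groups))   # ignore labels with no cells
  lin <- expm1(logmat)                      # undo the log1p transform
  avg <- vapply(levels(groups), function(g) {
    rowMeans(lin[, groups == g, drop = FALSE])
  }, FUN.VALUE = numeric(nrow(logmat)))
  log1p(avg)                                # genes x cell types
}

pb <- pseudo_bulk(logexpr, cell_type)
dim(pb)       # one column per cell type instead of one per cell
colnames(pb)
```

The six cells collapse into three columns, so any method that compares a query against every reference column has far less work to do.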

sce <- readRDS('~/HumanCellAtlas/ProstateCellAtlas/cellxgene_Human_prostate.rds')
table(sce$tissue)
sce <- SeuratObject::UpdateSeuratObject(sce)
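# Save the original metadata (the cell type annotations live here)
saveRDS(sce@meta.data, 'meta.rds')
# Assumption: the raw UMI counts are taken from the counts slot of the RNA assay
umi <- sce@assays$RNA@counts
sce <- Seurat::CreateSeuratObject(counts = umi, project = 'prostate', min.cells = 3, min.features = 200)
# min.features may drop cells, so keep only the metadata rows that survive
meta <- readRDS('meta.rds')
sce@meta.data <- meta[colnames(sce), ]
all(colnames(sce) == rownames(sce@meta.data))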
sce <- subset(sce, tissue == 'prostate gland')
gc()
table(sce$`Broad cell type`)
table(sce$`Granular cell type`)
table(sce$`Tissue composition`)
table(sce$`Cell types level 2`)
table(sce$`Cell types level 3`)
sce$cell_t_1 <- droplevels(sce$`Tissue composition`)
all(as.character(sce$cell_t_1) == as.character(sce$`Tissue composition`))
sce$cell_t_2 <- droplevels(sce$`Cell types level 2`)
sce$cell_t_3 <- droplevels(sce$`Cell types level 3`)
# Log-normalise first; AverageExpression reads the normalised data by default
sce <- Seurat::NormalizeData(sce)
# One pseudo-bulk matrix (genes x cell types) per annotation level
ref <- list()
ref$cell_t_1 <- Seurat::AverageExpression(sce,
                                          group.by = "cell_t_1",
                                          assays = "RNA")$RNA
ref$cell_t_2 <- Seurat::AverageExpression(sce,
                                          group.by = "cell_t_2",
                                          assays = "RNA")$RNA
ref$cell_t_3 <- Seurat::AverageExpression(sce,
                                          group.by = "cell_t_3",
                                          assays = "RNA")$RNA
saveRDS(ref, 'singleR_prostate.rds')
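The saved list holds one averaged matrix per annotation level. As far as we know, `Seurat::AverageExpression` returns averages in non-log space when it works on normalised data, while SingleR usually expects log-transformed reference values, so a small preparation step may be useful before annotation. The sketch below is an assumption-laden illustration: `prepare_ref` is a hypothetical helper, and the toy matrix and its labels are made up to stand in for something like `ref$cell_t_1`.

```r
# Hypothetical helper: turn one averaged matrix into a log-scale reference
# plus the label vector that goes with its columns.
prepare_ref <- function(avg) {
  avg <- as.matrix(avg)
  avg <- avg[rowSums(avg) > 0, , drop = FALSE]  # drop genes with no signal
  list(ref = log1p(avg), labels = colnames(avg))
}

# Toy stand-in for one element of the saved list (made-up numbers and labels).
toy <- matrix(c(10, 0, 3,
                 0, 0, 8,
                 2, 0, 5), nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"),
                              c("typeA", "typeB", "typeC")))
prep <- prepare_ref(toy)
prep$labels

# With SingleR installed and a log-normalised query matrix `query` (assumed),
# the annotation call would look roughly like this:
# pred <- SingleR::SingleR(test = query, ref = prep$ref, labels = prep$labels)
```

On the real data one would start from `readRDS('singleR_prostate.rds')` and pass, for example, its `cell_t_1` element to the helper.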

A medical student who loves to explore