oligo:GEO/ArrayExpress芯片数据处理
conda配置环境
- conda clean -y -all
- conda create -n geo -c conda-forge r-base=4.1.3
- conda activate geo
- conda install -c conda-forge r-tidyverse=1.3.1 -y
- conda install -c conda-forge r-irkernel -y
- conda install -c conda-forge r-r.utils -y
- conda install -c conda-forge r-rcolorbrewer -y
- Rscript -e “IRkernel::installspec(name=’geo’, displayname=’r-geo’)”
- conda install -c bioconda bioconductor-geoquery -y
- conda install -c bioconda bioconductor-oligo -y
- conda install -c bioconda bioconductor-affy -y
- conda install -c bioconda bioconductor-pd.hugene.1.0.st.v1 -y # read.celfiles 需要啥安装啥
- conda install -c bioconda bioconductor-pd.hg.u133a.2 -y
- conda install -c bioconda bioconductor-arrayexpress -y
- conda install -c bioconda bioconductor-biomart -y
- conda install -c bioconda bioconductor-affycoretools -y
下载芯片数据
GEO
1 |
|
ArrayExpress
- ~/dev/xray/xray -c ~/etc/xui2.json &
- wget -e “https_proxy=http://127.0.0.1:20809“ https://www.ebi.ac.uk/arrayexpress/files/E-MEXP-1422/E-MEXP-1422.raw.1.zip -O E-MEXP-1422.raw.1.zip
- unzip E-MEXP-1422.raw.1.zip
1 |
|
RMA标准化数据
1 |
|
探针转Symbol
- f_dedup_IQR 函数来源
- data.raw@annotation 可以得到注释的包信息,如pd.hg.u133a.2
- 在bioconductor上搜索pd.hg.u133a.2,可以知道其Platform为HG-U133A_2
- 谷歌搜索 HG-U133A_2 + affymetrix,可以找到affymetrix官网对应的页面
- 谷歌搜索 affymetrix “HG-U133A_2” site:www.ncbi.nlm.nih.gov,随便进一个GSE页面,找到对应的Platforms,如 GPL571 [HG-U133A_2] Affymetrix Human Genome U133A 2.0 Array,说明HG-U133A_2对应GPL571
1 |
|
GPL与Platform对应表
gpl
bioc_package
title
GPL32
mgu74a
[MG_U74A] Affymetrix Murine Genome U74A Array
GPL33
mgu74b
[MG_U74B] Affymetrix Murine Genome U74B Array
GPL34
mgu74c
[MG_U74C] Affymetrix Murine Genome U74C Array
GPL71
ag
[AG] Affymetrix Arabidopsis Genome Array
GPL72
drosgenome1
[DrosGenome1] Affymetrix Drosophila Genome Array
GPL74
hcg110
[HC_G110] Affymetrix Human Cancer Array
GPL75
mu11ksuba
[Mu11KsubA] Affymetrix Murine 11K SubA Array
GPL76
mu11ksubb
[Mu11KsubB] Affymetrix Murine 11K SubB Array
GPL77
mu19ksuba
[Mu19KsubA] Affymetrix Murine 19K SubA Array
GPL78
mu19ksubb
[Mu19KsubB] Affymetrix Murine 19K SubB Array
GPL79
mu19ksubc
[Mu19KsubC] Affymetrix Murine 19K SubC Array
GPL80
hu6800
[Hu6800] Affymetrix Human Full Length HuGeneFL Array
GPL81
mgu74av2
[MG_U74Av2] Affymetrix Murine Genome U74A Version 2 Array
GPL82
mgu74bv2
[MG_U74Bv2] Affymetrix Murine Genome U74B Version 2 Array
GPL83
mgu74cv2
[MG_U74Cv2] Affymetrix Murine Genome U74 Version 2 Array
GPL85
rgu34a
[RG_U34A] Affymetrix Rat Genome U34 Array
GPL86
rgu34b
[RG_U34B] Affymetrix Rat Genome U34 Array
GPL87
rgu34c
[RG_U34C] Affymetrix Rat Genome U34 Array
GPL88
rnu34
[RN_U34] Affymetrix Rat Neurobiology U34 Array
GPL89
rtu34
[RT_U34] Affymetrix Rat Toxicology U34 Array
GPL90
ygs98
[YG_S98] Affymetrix Yeast Genome S98 Array
GPL91
hgu95av2
[HG_U95A] Affymetrix Human Genome U95A Array
GPL92
hgu95b
[HG_U95B] Affymetrix Human Genome U95B Array
GPL93
hgu95c
[HG_U95C] Affymetrix Human Genome U95C Array
GPL94
hgu95d
[HG_U95D] Affymetrix Human Genome U95D Array
GPL95
hgu95e
[HG_U95E] Affymetrix Human Genome U95E Array
GPL96
hgu133a
[HG-U133A] Affymetrix Human Genome U133A Array
GPL97
hgu133b
[HG-U133B] Affymetrix Human Genome U133B Array
GPL98
hu35ksuba
[Hu35KsubA] Affymetrix Human 35K SubA Array
GPL99
hu35ksubb
[Hu35KsubB] Affymetrix Human 35K SubB Array
GPL100
hu35ksubc
[Hu35KsubC] Affymetrix Human 35K SubC Array
GPL101
hu35ksubd
[Hu35KsubD] Affymetrix Human 35K SubD Array
GPL198
ath1121501
[ATH1-121501] Affymetrix Arabidopsis ATH1 Genome Array
GPL199
ecoli2
[Ecoli_ASv2] Affymetrix E. coli Antisense Genome Array
GPL200
celegans
[Celegans] Affymetrix C. elegans Genome Array
GPL201
hgfocus
[HG-Focus] Affymetrix Human HG-Focus Target Array
GPL339
moe430a
[MOE430A] Affymetrix Mouse Expression 430A Array
GPL340
mouse4302
[MOE430B] Affymetrix Mouse Expression 430B Array
GPL341
rae230a
[RAE230A] Affymetrix Rat Expression 230A Array
GPL342
rae230b
[RAE230B] Affymetrix Rat Expression 230B Array
GPL570
hgu133plus2
[HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array
GPL571
hgu133a2
[HG-U133A_2] Affymetrix Human Genome U133A 2.0 Array
GPL886
hgug4111a
Agilent-011871 Human 1B Microarray G4111A (Feature Number version)
GPL887
hgug4110b
Agilent-012097 Human 1A Microarray (V2) G4110B (Feature Number version)
GPL1261
mouse430a2
[Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array
GPL1318
xenopuslaevis
[Xenopus_laevis] Affymetrix Xenopus laevis Genome Array
GPL1319
zebrafish
[Zebrafish] Affymetrix Zebrafish Genome Array
GPL1322
drosophila2
[Drosophila_2] Affymetrix Drosophila Genome 2.0 Array
GPL1352
u133x3p
[U133_X3P] Affymetrix Human X3P Array
GPL1355
rat2302
[Rat230_2] Affymetrix Rat Genome 230 2.0 Array
GPL1708
hgug4112a
Agilent-012391 Whole Human Genome Oligo Microarray G4112A (Feature Number version)
GPL2112
bovine
[Bovine] Affymetrix Bovine Genome Array
GPL2529
yeast2
[Yeast_2] Affymetrix Yeast Genome 2.0 Array
GPL2891
h20kcod
GE Healthcare/Amersham Biosciences CodeLink™ UniSet Human 20K I Bioarray
GPL2898
adme16cod
GE Healthcare/Amersham Biosciences CodeLink™ ADME Rat 16-Assay Bioarray
GPL3154
ecoli2
[E_coli_2] Affymetrix E. coli Genome 2.0 Array
GPL3213
chicken
[Chicken] Affymetrix Chicken Genome Array
GPL3533
porcine
[Porcine] Affymetrix Porcine Genome Array
GPL3738
canine2
[Canine_2] Affymetrix Canine Genome 2.0 Array
GPL3921
hthgu133a
[HT_HG-U133A] Affymetrix HT Human Genome U133A Array
GPL3979
canine
[Canine] Affymetrix Canine Genome 1.0 Array
GPL4032
[Maize] Affymetrix Maize Genome Array
GPL4191
h10kcod
CodeLink UniSet Human I Bioarray
GPL5188
huex10sttranscriptcluster
[HuEx-1_0-st] Affymetrix Human Exon 1.0 ST Array [probe set (exon) version]
GPL5689
hgug4100a
Agilent Human 1 cDNA Microarray (G4100A) [layout C]
GPL6097
illuminaHumanv1
Illumina human-6 v1.0 expression beadchip
GPL6102
illuminaHumanv2
Illumina human-6 v2.0 expression beadchip
GPL6244
hugene10sttranscriptcluster
[HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version]
GPL6246
mogene10sttranscriptcluster
[MoGene-1_0-st] Affymetrix Mouse Gene 1.0 ST Array [transcript (gene) version]
GPL6885
illuminaMousev2
Illumina MouseRef-8 v2.0 expression beadchip
GPL6947
illuminaHumanv3
Illumina HumanHT-12 V3.0 expression beadchip
GPL8300
hgu95av2
[HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array
GPL8321
mouse430a2
[Mouse430A_2] Affymetrix Mouse Genome 430A 2.0 Array
GPL8490
IlluminaHumanMethylation27k
Illumina HumanMethylation27 BeadChip (HumanMethylation27_270596_v.1.2)
GPL10558
illuminaHumanv4
Illumina HumanHT-12 V4.0 expression beadchip
GPL11532
hugene11sttranscriptcluster
[HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array [transcript (gene) version]
GPL13497
HsAgilentDesign026652
Agilent-026652 Whole Human Genome Microarray 4x44K v2 (Probe Name version)
GPL13534
IlluminaHumanMethylation450k
Illumina HumanMethylation450 BeadChip (HumanMethylation450_15017482)
GPL13667
hgu219
[HG-U219] Affymetrix Human Genome U219 Array
GPL14877
hgu133plus2
Affymetrix Human Genome U133 Plus 2.0 Array [Brainarray Version 13, HGU133Plus2_Hs_ENTREZG]
GPL15380
GGHumanMethCancerPanelv1
Illumina Sentrix Array Matrix (SAM) - GoldenGate Methylation Cancer Panel I
GPL15396
hthgu133b
[HT_HG-U133B] Affymetrix HT Human Genome U133B Array [custom CDF: ENTREZ brainarray v. 14]
GPL17556
hugene10sttranscriptcluster
[HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [HuGene10stv1_Hs_ENTREZG_17.0.0]
GPL17897
hthgu133a
[HT_HG-U133A] Affymetrix Human Genome U133A Array (custom CDF: HTHGU133A_Hs_ENTREZG.cdf version 17.0.0)
GPL18190
hugene11sttranscriptcluster
[HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array [CDF: Brainarray HuGene11stv1_Hs_ENTREZG_15.1.0]
来自:爱代码爱编程