[Jiaxue Gene Testing (佳学基因检测)] Starting-Data Standards for RNA-Seq Result Analysis
Introduction to RNA-seq data analysis:
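To promote genetic information technology, every data-analysis and operating standard that Jiaxue Gene (佳学基因) publishes here can be carried out with shared or open-source programs. Most open-source programs for RNA-seq analysis are available through Bioconductor, which supports end-to-end, gene-level differential expression analysis of RNA-seq data. Jiaxue Gene starts from FASTQ files and shows how they are aligned to the human reference genome and how a count matrix is generated from the alignments. This matrix records, for each gene in each sample, the number of sequencing reads or fragments assigned to that gene. Jiaxue Gene then guides readers through exploratory data analysis (EDA) to assess data quality and explore the relationships between samples, performs differential gene expression analysis, and produces figures suitable for publication in high-impact journals.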
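To make the idea of a count matrix concrete, here is a minimal Python sketch. It assumes that alignment and gene assignment have already been done and that each read or fragment has been reduced to a (sample, gene) pair. The function name `build_count_matrix` and the toy input are hypothetical illustrations, not part of any Bioconductor package.

```python
from collections import Counter

def build_count_matrix(assignments, samples, genes):
    """Return a genes x samples matrix of read/fragment counts.

    assignments: iterable of (sample, gene) pairs, one per read or
    fragment that was assigned to a gene (hypothetical input format).
    """
    counts = Counter(assignments)
    # One row per gene, one column per sample; pairs never seen count as 0.
    return [[counts[(sample, gene)] for sample in samples] for gene in genes]

# Toy data, invented purely for illustration.
assignments = [
    ("sample1", "GAPDH"), ("sample1", "GAPDH"), ("sample1", "B2M"),
    ("sample2", "GAPDH"), ("sample2", "B2M"), ("sample2", "B2M"),
]
samples = ["sample1", "sample2"]
genes = ["B2M", "GAPDH"]

matrix = build_count_matrix(assignments, samples, genes)
for gene, row in zip(genes, matrix):
    print(gene, row)
```

Rows are genes and columns are samples, which is the orientation that count-based differential expression tools typically expect.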
Introduction to open-source software for RNA-seq data analysis
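Jiaxue Gene (佳学基因) is a member of the international open-source software alliance. The member repository Bioconductor has many packages that support the analysis of high-throughput sequence data, including RNA sequencing (RNA-seq). The packages Jiaxue Gene uses in its demonstrations include core packages maintained by the Bioconductor core team for importing and processing raw sequencing data and for annotating RNA-seq data with gene information. Some of these packages can also perform statistical analysis and generate plots of sequence data. Bioconductor follows a scheduled release cycle of one update every six months, which ensures that all packages in the project work together consistently. The packages used in this workflow are loaded with the library function and can be installed by following the Bioconductor package installation instructions.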
Starting data for RNA-seq data analysis
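The data used in this workflow come from a real experiment in which airway smooth muscle (ASM) cells were treated with dexamethasone, a synthetic glucocorticoid steroid with anti-inflammatory effects. In clinical practice, people with asthma use glucocorticoids to reduce airway inflammation. In the experiment, four primary human ASM cell lines were treated with 1 µM dexamethasone for 18 hours. For each of the four cell lines there is one treated sample and one untreated control sample. Primary ASM cells were isolated from four aborted lung transplant donors with no chronic illness. ASM cells at passages 4 to 7, maintained in Ham's F12 medium supplemented with 10% FBS, were used in all experiments. For the RNA-Seq and qRT-PCR validation experiments, cells from each donor were treated with 1 µM DEX or control vehicle for 18 hours.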
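The design described above (four donors, each with one treated and one control sample) can be written down as a sample table before any analysis begins. The sketch below is a minimal illustration. The donor numbering 1 to 4 and the `Treatment.donor` naming pattern are assumptions, extrapolated from sample names such as Dex.2 and Control.4 that appear in the study's methods.

```python
from itertools import product

# Assumption: donors are numbered 1-4 and samples are named "<Treatment>.<donor>".
DONORS = [1, 2, 3, 4]
TREATMENTS = ["Control", "Dex"]

def build_sample_table(donors, treatments):
    """Return one row per (donor, treatment) combination."""
    return [
        {"sample": f"{treatment}.{donor}", "donor": donor, "treatment": treatment}
        for donor, treatment in product(donors, treatments)
    ]

sample_table = build_sample_table(DONORS, TREATMENTS)
for row in sample_table:
    print(row["sample"], row["donor"], row["treatment"], sep="\t")
```

Keeping the donor as its own column matters because each treated sample has a matched control from the same donor, so a downstream comparison can account for donor-to-donor differences.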
Preliminary processing of raw reads was performed using Casava 1.8 (Illumina, Inc., San Diego, CA). Subsequently, Taffeta scripts (https://github.com/blancahimes/taffeta) were used to analyze RNA-Seq data, which included use of FastQC [54] (v.0.10.0) to obtain overall QC metrics. Because of sequence bias in the initial 12 bases on the 5′ end of reads, the first 12 bases of all reads were trimmed with the FASTX Toolkit (v.0.0.13) [55]. FastQC reports for each sample revealed that each was successfully sequenced. Trimmed reads for each sample were aligned to the reference hg19 genome and known ERCC transcripts using TopHat [56] (v.2.0.4), while constraining mapped reads to be within reference hg19 or ERCC transcripts. Additional QC parameters were obtained to assess whether reads were appropriately mapped. Bamtools [57] was used to count the number of mapped reads, including junction-spanning reads. The Picard Tools (http://picard.sourceforge.net) RnaSeqMetrics function was used to compute the number of bases assigned to various classes of RNA, according to the hg19 refFlat file available as a UCSC Genome Table. For each sample, Cufflinks [21] (v.2.0.2) was used to quantify ERCC Spike-In and hg19 transcripts based on reads that mapped to the provided hg19 and ERCC reference files. For three samples that contained ERCC Spike-Ins, we created dose response curves (i.e. plots of ERCC transcript FPKM vs. ERCC transcript molecules) following the manufacturer's protocol [58]. Ideally, the slope and R² would equal 1.0. For our samples (Dex.2, Control.4, Dex.4), the slope (R²) values were 0.90 (0.90), 0.92 (0.84), 0.82 (0.86), respectively. Raw read plots were created by displaying bigwig files for each sample in the UCSC Genome Browser.
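Two steps in this paragraph lend themselves to a short illustration: trimming the first 12 bases of every read, and summarizing an ERCC dose response curve by its slope and R². The Python sketch below is not the FASTX Toolkit or the manufacturer's protocol. It assumes well-formed four-line FASTQ records and assumes that the dose response is fitted by ordinary least squares on log2-transformed values, which the text does not state. The function names and toy values are hypothetical.

```python
import math

def trim_fastq(lines, n_trim=12):
    """Drop the first n_trim bases, and their quality scores, from each read.

    lines: list of FASTQ lines, assumed to be complete 4-line records.
    """
    trimmed = []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        trimmed.extend([header, seq[n_trim:], plus, qual[n_trim:]])
    return trimmed

def dose_response_fit(molecules, fpkm):
    """Least-squares slope and R^2 of log2(FPKM) against log2(molecules)."""
    x = [math.log2(m) for m in molecules]
    y = [math.log2(f) for f in fpkm]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

# Toy read: 12 biased leading bases followed by 8 retained bases.
record = ["@read1", "ACGT" * 3 + "TTGGCCAA", "+", "I" * 12 + "F" * 8]
trimmed = trim_fastq(record)
print(trimmed[1], trimmed[3])

# Toy spike-in data in which FPKM is exactly proportional to input molecules.
slope, r_squared = dose_response_fit([1, 2, 4, 8], [3, 6, 12, 24])
print(round(slope, 3), round(r_squared, 3))
```

With perfectly proportional toy data both the slope and R² come out at 1 (up to floating-point rounding), which is the ideal case the paragraph describes. The reported values of 0.82 to 0.92 indicate measurements that track input amounts closely but not perfectly.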
Differential expression of genes and transcripts in samples treated with DEX vs. untreated samples was obtained using Cuffdiff [21] (v.2.0.2) with the quantified transcripts computed by Cufflinks (v.2.0.2), while applying bias correction. The CummeRbund [59] R package (v.0.1.3) was used to measure significance of differentially expressed genes and create plots of the results. As a positive control of gene expression, the FPKM values for four housekeeping genes (i.e., B2M, GABARAP, GAPDH, RPL19) were obtained. Each had high FPKM values that did not differ significantly by treatment status [Figure S11]. The NIH Database for Annotation, Visualization and Integrated Discovery (DAVID) was used to perform gene functional annotation clustering using Homo sapiens as background, and default options and annotation categories (Disease: OMIM_DISEASE; Functional Categories: COG_ONTOLOGY, SP_PIR_KEYWORDS, UP_SEQ_FEATURE; Gene_Ontology: GOTERM_BP_FAT, GOTERM_CC_FAT, GOTERM_MF_FAT; Pathway: BBID, BIOCARTA, KEGG_PATHWAY; Protein_Domains: INTERPRO, PIR_SUPERFAMILY, SMART) [28]. The RNA-Seq data are available at the Gene Expression Omnibus Web site (http://www.ncbi.nlm.nih.gov/geo/) under accession GSE52778.
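The FPKM values referred to above normalize a fragment count by both transcript length and sequencing depth. The sketch below shows the textbook definition only, as an aid to reading the numbers. Cufflinks estimates FPKM with a statistical model and bias correction, so this is not its actual computation, and the input figures are invented.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    Textbook definition: fragments / (length in kb) / (total fragments in millions).
    """
    return fragments * 1e9 / (length_bp * total_mapped_fragments)

# Invented example: 500 fragments on a 2,000 bp transcript,
# in a library with 10 million mapped fragments.
value = fpkm(500, 2_000, 10_000_000)
print(value)
```

Because the count is divided by both length and library size, a long transcript and a short one with the same FPKM are interpreted as similarly abundant, which is what makes a comparison across housekeeping genes and across samples meaningful.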