Cited Software and References

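The resequencing analysis modules are all built on widely accepted algorithms and open-source programs described in the published literature of the field.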

For details, see the help documentation built into each module of the software, or open the online help documentation.

A brief overview follows:


1. Base distribution and quality distribution plots of raw data

FASTX-Toolkit v0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/)


2. Raw data quality control

Trimmomatic v0.32 (http://www.usadellab.org/cms/index.php?page=trimmomatic)

Bolger, A.M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170.


3. Sequence alignment

BWA v0.7.12 (http://bio-bwa.sourceforge.net/)

Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics, 25:1754-60. [PMID: 19451168]

Li H. and Durbin R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler Transform. Bioinformatics, Epub. [PMID: 20080505]


4. SAM/BAM file processing

SAMtools v0.1.19 (http://samtools.sourceforge.net/)

Li H.*, Handsaker B.*, Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9. [PMID: 19505943]

Li H. (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics, 27(21), 2987-93. Epub 2011 Sep 8. [PMID: 21903627]


5. BED interval file processing

bedtools v2-2.20.1 (http://bedtools.readthedocs.io/en/latest/)

Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 6, pp. 841–842.


6. Variant calling

GATK v3.5-g36282e4 (https://software.broadinstitute.org/gatk/)

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20:1297-303.

DePristo M, Banks E, Poplin R, Garimella K, Maguire J, Hartl C, Philippakis A, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell T, Kernytsky A, Sivachenko A, Cibulskis K, Gabriel S, Altshuler D, Daly M. (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43:491-498.

Van der Auwera GA, Carneiro M, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, Banks E, Garimella K, Altshuler D, Gabriel S, DePristo M. (2013) From FastQ Data to High-Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline. Current Protocols in Bioinformatics, 43:11.10.1-11.10.33.


7. Annotation of variant calling results

ANNOVAR v2016Feb01 (http://annovar.openbioinformatics.org/en/latest/)

Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from next-generation sequencing data. Nucleic Acids Research, 38:e164, 2010.

Chang X, Wang K. wANNOVAR: annotating genetic variants for personal genomes via the web. Journal of Medical Genetics, 49:433-436, 2012.

Yang H, Wang K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nature Protocols, 10:1556-1566, 2015.


8. SNP-Index calculation and analysis

Data processing is performed in R (v3.2.1); the method follows these publications:

Takagi H, Abe A, Yoshida K, et al. QTL-seq: rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations[J]. The Plant Journal, 2013, 74(1): 174-183.

Abe A, Kosugi S, Yoshida K, et al. Genome sequencing reveals agronomically important loci in rice using MutMap[J]. Nature Biotechnology, 2012, 30(2): 174-178.
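
As a rough illustration of the quantity these two papers describe: the SNP-index at a site is the fraction of reads carrying the non-reference allele, and QTL-seq compares two bulks through the difference in SNP-index, typically averaged over sliding windows. The sketch below is written in Python rather than R and is not the software's own implementation. The function names, the order of subtraction between bulks, the window and step sizes, and the toy read counts are all assumptions made for the example.

```python
from statistics import mean


def snp_index(alt_reads, depth):
    """Fraction of reads supporting the non-reference allele at one site."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def delta_snp_index(bulk_a, bulk_b):
    """Per-site SNP-index difference (bulk_b minus bulk_a).

    Each bulk is a list of (position, alt_reads, depth) tuples covering
    the same sites in the same order. The subtraction order is an
    arbitrary choice for this sketch.
    """
    deltas = []
    for (pos_a, alt_a, dp_a), (pos_b, alt_b, dp_b) in zip(bulk_a, bulk_b):
        if pos_a != pos_b:
            raise ValueError("bulks must list the same sites in the same order")
        deltas.append((pos_a, snp_index(alt_b, dp_b) - snp_index(alt_a, dp_a)))
    return deltas


def sliding_window_mean(sites, window, step, chrom_len):
    """Average per-site values in windows of `window` bp moved by `step` bp.

    `sites` is a list of (position, value). Windows containing no site
    are skipped. Returns a list of (window_start, mean_value).
    """
    averages = []
    start = 0
    while start < chrom_len:
        values = [v for pos, v in sites if start <= pos < start + window]
        if values:
            averages.append((start, mean(values)))
        start += step
    return averages


# Hypothetical toy data: (position, alt_reads, depth) for two bulks.
bulk_a = [(100, 2, 10), (250, 5, 10)]
bulk_b = [(100, 8, 10), (250, 5, 10)]

deltas = delta_snp_index(bulk_a, bulk_b)
windows = sliding_window_mean(deltas, window=200, step=100, chrom_len=300)
for window_start, value in windows:
    print(window_start, round(value, 3))
```

In a real analysis the read counts would come from the variant calling results of step 6, and sites with low depth are commonly filtered out before averaging; neither step is shown here.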


9. SNP-Index distribution plot

The SNP-Index distribution plot is drawn with the ggplot2 package (v2.2.1) in R (v3.2.1).