Applied Biosystems, a division of Life Technologies, and scientists from the Baylor College of Medicine Human Genome Sequencing Center (HGSC) have summarised their collective contribution to the first data release of the 1000 Genomes Project. Through Applied Biosystems’ commercial participation and its collaboration with the HGSC, the SOLiD™ System has generated more than 460 gigabases of uniquely mappable sequence data, 65% more than the target for the two organisations when the collaborative project was conceived.
The HGSC and Applied Biosystems are key participants in the 1000 Genomes Project, an international research consortium that aims to sequence the genomes of approximately 1,000 people from around the world to create the most detailed and medically useful compendium of human genetic variation. Organisations that have committed major support to the project include the National Human Genome Research Institute (NHGRI), the Wellcome Trust Sanger Institute, the Beijing Genomics Institute in Shenzhen, and the German Federal Ministry of Education and Research. The project will produce a DNA sequence-based genetic map that will enable researchers to better identify disease-related genomic variation, and accelerate efforts to use this information for the development of new strategies for diagnosing, treating and preventing common diseases.
In April 2008, the HGSC established a collaboration agreement with Applied Biosystems to use six SOLiD Systems to expand its contribution to the pilot phase of the project, and help researchers to determine the best approach for accomplishing its goals. The HGSC’s target for contribution to the project using the SOLiD System was 200 gigabases of sequence data. In June 2008, Applied Biosystems joined the 1000 Genomes Project as a commercial participant, and committed to contribute a minimum of 75 gigabases of sequence data using the SOLiD System.
During the pilot phase of the project, the collaboration between the HGSC and Applied Biosystems used the SOLiD System to produce the following results:
- The HGSC collected sequence data from 25 genomes, generating 256 gigabases of uniquely mappable sequence data;
- The HGSC sequenced 24 individuals at approximately 2.6-fold coverage, and sequenced one individual at 26-fold coverage;
- By the end of 2008, the HGSC was generating an average of 15 gigabases per mate pair sequencing run using the SOLiD 2.0 System.
“One of the reasons we chose the SOLiD System for this research was the ability to generate mate-pair reads with inserts from one to three kilobases, which provides very good placement of the reads and allows us to determine genetic structural variation such as inversions, translocations, insertions, and deletions in complex genomes,” said Donna Muzny, Director of Operations at the HGSC.
Applied Biosystems’ commercial participation in the 1000 Genomes Project has enabled scientists to assess technology performance on a diverse set of biological samples. To date, the SOLiD System has contributed the following to the project:
- More than 206 gigabases of uniquely mappable sequence data, more than 2.7 times the original committed target of 75 gigabases.
- These 206 gigabases were generated on the SOLiD 2.0 System at an average of 17 gigabases per sequencing run. This per-run throughput represents approximately five-fold coverage of the human genome.
- In the first data release of the 1000 Genomes Project, the SNP data from an anonymous African sample was supported by data from the SOLiD 2.0 System.
- In the first data release of the 1000 Genomes Project, the small insertions and deletions (indels) data from an anonymous African sample was generated by the SOLiD 2.0 System.
“Applied Biosystems’ participation with the SOLiD System in the 1000 Genomes Project consortium has contributed significantly to the first set of deliverables of the project,” said Francisco M. De La Vega, Ph.D., Applied Biosystems’ Distinguished Scientific Fellow and Vice President for SOLiD Bioinformatics. “The ability to confidently identify single nucleotide polymorphisms and structural variants as we performed for the Project, confirms that the SOLiD System is ideally positioned for studying the role of human genetic variation in health and disease.”
Other institutes supporting the 1000 Genomes Project include the Broad Institute of MIT and Harvard, the Genome Sequencing Center at Washington University School of Medicine, the Beijing Genomics Institute in Shenzhen, and the Max Planck Institute for Molecular Genetics, as well as other commercial entities with next-generation sequencing technologies.
The 1000 Genomes Project data generated on the SOLiD System by the HGSC and Applied Biosystems is currently available in the National Center for Biotechnology Information (NCBI) Short Read Archive via FTP. The first data release, representing the preliminary analysis of four genome sequences, is now available to download through the EBI FTP site and the NCBI FTP site.
The SOLiD System is widely used around the world in research laboratories, genome centres, core and contract service facilities, and biotechnology and pharmaceutical companies. Researchers are utilising the SOLiD technology for a variety of advanced genomics research applications, including re-sequencing for disease studies, transcriptome analysis, de novo sequencing and methylation profiling. The newest release of the platform – the SOLiD 3 System – offers unparalleled throughput of 40 gigabases per sequencing run, the highest data accuracy, at 99.94%, owing to two-base encoding algorithms, and integrated application workflows. The SOLiD 3 System will ultimately enable scientists to sequence a human genome for less than US$10,000 in 2009, with a roadmap that will increasingly drive capabilities toward the $1,000 genome milestone.
Applied Biosystems is a global leader in providing innovative instrument systems to accelerate academic and clinical research, drug discovery and development, pathogen detection and forensic DNA analysis. The technologies it markets include a robust line of DNA sequencing systems and chemistries to meet the increasing demands of the scientific community for higher-throughput, more sophisticated DNA sequencing solutions. Applied Biosystems, together with Invitrogen – a leading provider of platform-independent, essential life science technologies for disease and drug research, bioproduction and diagnostics – is part of Life Technologies Corporation, which markets the life science industry’s most comprehensive portfolio of solutions for molecular and cell biology. Applied Biosystems and Invitrogen products are used in nearly every major laboratory in the world.