If you are looking to invest in Next Generation Sequencing (NGS) Technology, this guide will provide you with the essential information you need to assist your decisions.
Learn about the key platform technologies, considerations for sample preparation, NGS software, key application areas and the future for NGS.
DNA sequencing has advanced significantly since the launch of the Human Genome Project in 1990. A human genome can now be sequenced in under 10 days for less than $1,000, and soon will be sequenced routinely within a day. This remarkable progress is due to significant advances in DNA sequencing, from Sanger sequencing – which has been dominant for almost 30 years – to next generation sequencing (NGS, also termed massively parallel sequencing).
Sanger sequencing, which is often considered first generation sequencing technology, utilizes capillary electrophoresis to separate fragments of DNA by size and then sequences them by detecting the final fluorescent base on each fragment. This widely adopted technology is still extremely important today, but has always been hampered by inherent limitations in throughput, scalability, speed and resolution.
The limitations associated with Sanger sequencing have catalyzed the development of NGS technologies, which can inexpensively and quickly produce large volumes of sequence data. NGS enables rapid sequencing of large stretches of DNA base pairs spanning entire genomes, with some instruments capable of producing hundreds of gigabases of data in a single sequencing run. The read length – the actual number of continuous sequenced bases – is much shorter in NGS than that attained by Sanger sequencing, and at present NGS only provides 50-500 continuous base pair reads. Short reads represent the major limitation currently associated with NGS.
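To see how short reads and high throughput trade off in practice, the sketch below estimates average depth of coverage with the common approximation coverage = number of reads × read length / genome size. The human genome size of 3.2 gigabases and the example read counts are illustrative assumptions, not figures for any particular instrument.

```python
import math

HUMAN_GENOME_BP = 3_200_000_000  # assumed approximate human genome size


def mean_coverage(num_reads, read_length, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Illustrative run: one billion reads of 100 bases each
    print(mean_coverage(1_000_000_000, 100, HUMAN_GENOME_BP))
    # Reads of 100 bases needed for 30-fold average coverage
    print(reads_needed(30, 100, HUMAN_GENOME_BP))
```

Because coverage scales with read length, halving the read length doubles the number of reads needed for the same depth.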
NGS technology is evolving at an unprecedented speed. Scientists can now routinely examine a single genome a large number of times, observe individual changes, study population variations, study the microbiome and metagenomics, differentiate cancer genomes from healthy genomes, study the epigenome, and investigate the possibility of personalized medicine, among other applications.
There are several main suppliers of next generation sequencing instruments and they all share the same fundamental sequencing process, but with varying technologies. Regardless of their method of arrival, next generation sequencers rely on the generation of representative, unbiased sources of nucleic acid templates from the complex genomes being interrogated. Clonally amplified DNA templates, or single DNA molecules, are sequenced in a massively parallel fashion in a flow cell. The sequencing is conducted in either a stepwise iterative process or in a continuous real-time manner. In this way, the instruments allow for the sequencing of up to billions of individual DNA templates in a single reaction.
Ion Torrent™ Technology
Life Technologies’ Ion Torrent™ Technology directly translates chemically encoded information (A, C, G, T) into digital information (0, 1) on a semiconductor chip, similar to the one you might find in your digital camera. The Ion Personal Genome Machine™ (PGM™) sequencer and Ion Proton™ System essentially act as the world’s smallest solid-state pH meter to determine DNA sequences. The DNA is fragmented, attached to beads and deposited in millions of wells across the surface of the chip. The wells are then sequentially flooded with one nucleotide after another. If a nucleotide is incorporated into the strand of bead-bound DNA, a hydrogen ion is given off, a chemical change is measured by an ion sensor beneath the well, and a base is called. This process takes place in millions of wells simultaneously, enabling sequencing in only a few hours.
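The flow-by-flow logic described above can be sketched as a toy simulation for a single well. The flow order "TACG", the function names and the assumption that the signal is proportional to the number of bases incorporated in one flow are illustrative simplifications, not the instrument's actual signal processing.

```python
def simulate_flows(read, flow_order="TACG", max_flows=400):
    """Return the number of bases incorporated in each nucleotide flow.

    `read` is the sequence of the strand being synthesized in one well.
    A value of 0 means nothing was incorporated, so no hydrogen ions
    were released and the sensor records no signal for that flow.
    """
    signals = []
    pos = 0
    for i in range(max_flows):
        if pos >= len(read):
            break
        nucleotide = flow_order[i % len(flow_order)]
        count = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(read) and read[pos] == nucleotide:
            count += 1
            pos += 1
        signals.append(count)
    return signals


def call_bases(signals, flow_order="TACG"):
    """Rebuild the sequence from the per-flow incorporation counts."""
    return "".join(
        flow_order[i % len(flow_order)] * count
        for i, count in enumerate(signals)
    )


if __name__ == "__main__":
    flows = simulate_flows("TTAGC")
    print(flows)
    print(call_bases(flows))
```

In this simplified model a flow with no matching base produces a zero, and a run of identical bases produces a single larger value rather than several separate calls.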
Figure 1: Life Technologies Ion PGM™ Sequencer
Ligation Technology
Life Technologies also manufactures SOLiD® NGS systems, which use a sequencing by ligation technology. SOLiD® stands for Sequencing by Oligonucleotide Ligation and Detection. DNA ligase is used to determine the underlying sequence of the target DNA molecule. A fluorescently labeled di-base probe hybridizes to its complementary sequence adjacent to the primed template. The dye-labeled probe is then joined to the primer following the addition of DNA ligase. Non-ligated probes are washed away, and the ligated probe is identified using fluorescent imaging. Each base is effectively probed by two independent ligation reactions using two different primers. The 5500 W Series Genetic Analysis Systems are NGS platforms that support a wide range of research applications, such as exome sequencing and RNA-Seq on a pay-per-lane basis.
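Because every di-base probe reports on a pair of adjacent bases, each base contributes to two overlapping colour calls. The sketch below illustrates this with a two-base encoding in which the colour is the XOR of two-bit base codes. The numeric colour labels 0 to 3 and the function names are assumptions chosen for illustration, and the first base is assumed to be known (for example from the adapter).

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode_colors(seq):
    """Return one colour (0-3) for each overlapping pair of adjacent bases."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and colour calls."""
    bases = [first_base]
    for color in colors:
        previous = BASE_TO_BITS[bases[-1]]
        bases.append(BITS_TO_BASE[previous ^ color])
    return "".join(bases)


if __name__ == "__main__":
    colors = encode_colors("ATGGC")
    print(colors)
    print(decode_colors("A", colors))
```

In this encoding a true single-base change alters two adjacent colours, whereas an isolated measurement error alters only one, which is why probing each base twice helps to separate real variants from errors.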
The MiSeq and HiSeq Platforms, and the other available Illumina systems, use sequencing by synthesis (SBS) technology. Sequencing templates are immobilized on a proprietary flow cell surface that is designed to present the DNA in a manner that facilitates access to enzymes, while ensuring high stability of surface bound template and low non-specific binding of fluorescently labeled nucleotides. Solid-phase amplification creates up to 1,000 identical copies of each single template molecule in close proximity, generating a cluster, and because this process does not involve positioning of beads into wells or mechanical spotting, much higher densities are achieved.
SBS technology uses four fluorescently-labeled nucleotides to sequence the tens of millions of clusters on the flow cell surface in parallel, using a proprietary reversible terminator-based method. This enables detection of single bases as they are incorporated into growing DNA strands. Since all four reversible terminator-bound dNTPs are present during each sequencing cycle, natural competition minimizes incorporation bias. The result is base-by-base sequencing that enables highly accurate data for a broad range of applications.
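As a rough illustration of base-by-base calling, the sketch below takes one set of four fluorescence intensities per cycle for a single cluster and calls the brightest channel. The intensity values and the simple "highest channel wins" rule are assumptions made for this example; production base callers also correct for effects such as channel cross-talk and loss of synchrony within a cluster.

```python
CHANNELS = ("A", "C", "G", "T")


def call_cluster(intensities):
    """Call one base per cycle as the channel with the highest intensity.

    `intensities` holds one (A, C, G, T) tuple of signal values per cycle.
    """
    read = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


if __name__ == "__main__":
    # Hypothetical intensities for three sequencing cycles of one cluster
    cycles = [
        (0.9, 0.1, 0.0, 0.1),
        (0.1, 0.2, 0.8, 0.1),
        (0.0, 0.1, 0.1, 0.7),
    ]
    print(call_cluster(cycles))
```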
Figure 3: Illumina HiSeq 2500 Sequencing System
Single Molecule, Real-Time (SMRT®) Technology
The PacBio RS II is a Single Molecule, Real-Time (SMRT®) DNA Sequencing System by Pacific Biosciences. In SMRT technology, a DNA polymerase attaches itself to a strand of DNA to be replicated, examines the individual base at the point it is attached, and then determines which of four building blocks, or nucleotides, is required to replicate that individual base. After determining which nucleotide is required, the polymerase incorporates that nucleotide into the growing strand that is being produced. After incorporation, the enzyme advances to the next base to be replicated and the process is then repeated.
Figure 4: Pacific Biosciences PacBio RS II
Roche:
In October 2013, Roche announced that it would be shutting down its 454 Life Sciences sequencing operations. The GS Junior System and the GS FLX+ System utilize 454 Sequencing technologies. The 454 Sequencing™ process uses a sequencing by synthesis (SBS) approach to generate sequence data. Until the business is fully shut down – scheduled for mid-2016 – Roche will continue to provide service and support to 454 instruments, parts, reagents and consumables.
Nanopore Technology
Nanopore technology is an exciting new method currently being developed. The technology records characteristic changes in electric current as nucleic acids pass through a synthetic or protein nanopore and will theoretically allow sequencing of a complete chromosome in one step, without the need to generate a new DNA strand. Read more about nanopore technology in the Future of NGS section.
One of the bottlenecks for NGS is the amount of time and resources required for library preparation; this is true whichever sequencing instrument you choose. While every sequencer uses a slightly different technology, the methods for template construction and library preparation are pretty much the same, except for minor modifications made before the run.
Manual Library Preparation
Success on any NGS platform begins with optimal sample preparation – from sample isolation and purification to library construction and enrichment. As with any scientific methodology, it is well understood that the quality of sequencing data is highly dependent upon the quality of the sequenced material. Reagent kits that simplify and standardize the process of converting a DNA sample into a sequencing library and, if desired, prepare it for multiplexing, can be purchased from both sequencer manufacturers and third-party vendors.
There are many commercially available purification kits, which can be used to extract DNA and RNA from a diverse range of sample types. It is important to choose a kit that will enable you to obtain high yields of pure DNA or RNA for your NGS workflow. Throughout NGS library preparation, you will need to ensure you have established methods to determine the DNA quantity and quality. Typically, DNA is quantified using a UV/VIS spectrophotometer and its purity assessed by visualization on an agarose gel. Denovix recently launched the DS-11FX+ all-in-one Spectrophotometer/Fluorometer for rapid and accurate 1µL UV-Vis quantification. KAPA Library Quantification Kits provide qPCR-based quantification of NGS libraries prior to pooling for capture or flow cell amplification.
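For UV quantification, the arithmetic behind a spectrophotometer reading can be sketched as follows. The conversion factor (an absorbance of 1.0 at 260 nm corresponding to roughly 50 ng/µL of double-stranded DNA at a 1 cm path length) and the A260/A280 ratio of about 1.8 as an indicator of pure DNA are widely used conventions rather than values taken from this guide; the function names are illustrative.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # conventional factor for dsDNA, 1 cm path


def dsdna_concentration(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration in ng/µL from absorbance at 260 nm."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 1.8 are generally taken as pure DNA."""
    return a260 / a280


if __name__ == "__main__":
    # Hypothetical reading from a sample diluted 1:10 before measurement
    print(dsdna_concentration(0.5, dilution_factor=10))
    print(round(purity_ratio(0.9, 0.5), 2))
```

A ratio well below 1.8 usually points to protein or other contamination, in which case the sample may need further purification before library preparation.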
Current methods for NGS library preparation generally consist of a distinct DNA fragmentation step, followed by a fragment ‘cleanup’ step. Following the reaction cleanup, it is necessary to choose library fragment size and separate free adaptors from the desired product. Size selection, which is dependent on your instrument’s requirements and your application, has traditionally been performed using gel extraction protocols. However, there are now gel-free methods available to simplify this process. Learn how the Sage Science SageELF Whole Sample Fractionator can automate DNA fragment size selection for NGS library construction in this poster.
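Conceptually, size selection is a filter on fragment length. The minimal sketch below keeps only fragments inside a chosen window and reports how much of the library survives; the 300-450 bp window and the fragment lengths are hypothetical, and the right window depends on your instrument and application.

```python
def select_fragments(fragment_lengths, min_bp, max_bp):
    """Keep fragments whose length falls within [min_bp, max_bp].

    Very short products, such as free adaptors, fall below the window
    and are discarded along with over-long fragments.
    """
    return [n for n in fragment_lengths if min_bp <= n <= max_bp]


def fraction_retained(fragment_lengths, min_bp, max_bp):
    """Fraction of fragments that survive size selection."""
    if not fragment_lengths:
        return 0.0
    kept = select_fragments(fragment_lengths, min_bp, max_bp)
    return len(kept) / len(fragment_lengths)


if __name__ == "__main__":
    library = [60, 120, 310, 350, 420, 900]  # hypothetical lengths in bp
    print(select_fragments(library, 300, 450))
    print(fraction_retained(library, 300, 450))
```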
Most imaging systems have not been designed to detect single fluorescent events, so the adapter-ligation reaction is typically amplified to produce the final product ready for cluster formation and sequencing. The two most common methods for amplification are emulsion PCR (emPCR) and solid-phase amplification.
Because NGS platforms employ different methods, many of the commercially available NGS kits are designed for a specific platform or application. A large number of standard library preparation kits offer protocols for sequencing whole genomes, mRNA, targeted regions such as whole exomes, custom-selected regions, protein-binding regions, and more. Some protocols are designed to require low sample input for library generation, and can be used for single cell analysis. This poster demonstrates the use of the SMARTer Ultra Low Input RNA Kit from Takara Clontech for the production of accurate, full-length, unbiased cDNA. Agilent Technologies has a range of kits for library preparation, including the new SureSelect Focused Exome, which enables analysis of only the disease-associated targets.
The analysis of protein-nucleic acid interactions via Chromatin Immunoprecipitation Sequencing (ChIP-seq) can be used to interpret epigenetic events involved in biological processes. ChIP-seq kits such as the Chromatrap® ChIP SEQ Kits, from Porvair Sciences, provide a streamlined protocol for obtaining high-quality DNA for NGS library preparation.
Automated Library Preparation
It is also possible to automate the whole library preparation workflow. Automation of sample and library preparation increases sample throughput and reproducibility, while eliminating labor-intensive manual steps and costly user errors. A range of automated and semi-automated products, including microfluidics-based instruments and liquid handling robots, are available to assist your library construction. Learn how the Labcyte Echo® 500 Series Liquid Handlers, which use acoustic energy to transfer precise volumes of samples and reagents without touching them, provide a fast method for Illumina library preparations in this poster. The Tecan Freedom EVO NGS workstation incorporates a TouchTools operator interface to guide users through each step of protocol selection and worktable set-up, as shown in this video.
It is important to choose your system and kits based on both your instrument and your application. Consider batch size, the type of applications that will be performed, walk-away time, quantity of sample material, reproducibility, turn-around time, as well as budget and running costs.
Figure 6: QIAGEN GeneRead Library Prep Kits can be used in conjunction with the QIAcube for complete automation of the library preparation workflow.
Figure 7: See how the Biomek 4000 Laboratory Handling Workstation from Beckman Coulter can also be used to automate NGS sample preparation workflows, including nucleic acid extraction, library construction, size selection, PCR/qPCR setup, PCR cleanup, normalization and pooling.
Figure 9: PerkinElmer Sciclone NGSx Workstation
Another major bottleneck of whole-genome and whole-exome sequencing projects is not the sequencing of the DNA itself, but is in the structured way of data management and the sophisticated computational analysis of the experimental data. Biologists are rarely trained in the computational and statistical techniques necessary to make sense of the large data sets generated by NGS.
The complete NGS data analysis process is complex, includes multiple analysis steps, is dependent on a multitude of programs and databases, and involves handling large amounts of heterogeneous data. Data is produced at a rate faster than most computers can handle, and this has forced researchers to not just rethink software solutions, but also to consider data storage, processing power and data output.
Commercially available NGS software solutions might be delivered via desktop software or by the use of web-based interfaces.
Desktop Software
Typically, commercially available solutions for NGS aim to simplify analysis by providing easy-to-use graphical user interfaces (GUI). Such software tools may be a suitable entry point for small-scale laboratories, especially for analysis of simple datasets, but are generally limited in their flexibility and scalability, and often do not adequately resolve issues around data handling and management. It is also important to remember that many challenges around NGS analysis are still being resolved; commercial software packages are not exempt from common issues experienced with analyzing NGS data and may not be as advanced as the open-source tools being developed by large genome centers.
Figure 10: CLC bio CLC Genomics Workbench
When looking at software options, it is important to consider data management and data storage. Volumes ranging from 120 to 600 gigabytes will need to be managed and stored. The initial investment in the NGS platform is often accompanied by an almost equal investment in upgrading the informatics infrastructure of the institution, hiring staff to analyze the data produced by the instrument, and storing the data for future use. This cost is often not anticipated by the researcher.
It is advantageous to have a centralized Bioinformatics Core to put in place platforms that acquire, store and analyze the very large datasets created by NGS instruments. A Bioinformatics Core, already familiar with data of this type and complexity, dedicated to investigators, and jointly working with IT personnel, can span multiple domains rather effortlessly. If this is not a possible solution, you may wish to consider cloud computing. In cloud computing, a user can utilize a virtual operating system (or ‘cloud’) to process data on a computer cluster for highly parallel tasks.
Web-Based Interfaces and Cloud Computing
Several commercial players, such as GenomeQuest and DNAnexus, offer web-based browsers that manage all of the data coming from an NGS machine. This enables the researcher to work without the need for local computer infrastructure. The browser facilitates the management, analysis and delivery of genomic data through a secure cloud platform, which supports unlimited storage and computational resources. Cloud computing removes the hardware required for complex projects, allowing a faster set-up time and the ability to run multiple large projects in parallel. Large-scale data generated by NGS technology can be analyzed and stored alongside completed projects in the cloud. Cloud services can be selected to meet the user's computational and storage requirements.
Figure 11: DNAnexus DNAnexus Platform
NGS technology is moving at an extremely fast pace, so much so that some researchers are unwilling to invest heavily in technology that might soon be outdated. Others may not have the time to complete sequencing projects. For these researchers, the use of a service provider might be an attractive option.
The expertise offered by service providers can enable a rapid turnaround time for efficient completion of projects, with sequencing performed under accredited conditions to ensure reliable results. Service providers offer a range of sequencing applications, including whole genome sequencing, de novo genome sequencing, exome sequencing, targeted resequencing, de novo transcriptome sequencing, RNA-seq, small RNA sequencing, microbiome sequencing, metagenomic sequencing, and metatranscriptome sequencing, and may offer multiplatform sequencing strategies. Experience in these sequencing applications has also helped service providers to offer specific and efficient protocols, such as those requiring low DNA input.
Using a service provider, researchers can submit their samples, which will be analyzed by the provider, who will then return the data. Researchers then only require suitable software to enable them to analyze and store the results. Some service providers now offer pre-sequencing and bioinformatics services in addition to the above. For example, GATC’s INVIEW™ portfolio, such as its INVIEW™ Human Exome in figure 12, offers services including DNA isolation, library preparation, amplification of target region, sequencing and data analysis through to delivery in common files or providing access to web-based analysis software like QIAGEN’s Ingenuity® Variant Analysis™. Service providers are also starting to offer secure cloud computing of customer data.
Figure 12: The INVIEW™ Human Exome all-in-one service from GATC Biotech provides fast exome sequencing results for identification of relevant disease-causing mutations.
NGS technology is evolving at an unprecedented speed. Scientists can now routinely examine a single genome a large number of times, observe individual changes, study population variations and metagenomics, differentiate cancer genomes from healthy genomes, and study the epigenome. NGS technology has the potential to revolutionize the field of companion diagnostics and personalized medicine, specifically in the area of oncology and cancer diagnostics, where customized treatment and therapy decisions are based on individual genomic data, targeting molecular changes in cancer cells. NGS also has applications in inherited disease testing, virology and microbiology.
There is currently only one FDA-approved NGS analyzer, the MiSeqDx. Read the press release of this FDA announcement here. In October 2014, the Ion PGM Dx NGS System was CE-Marked for IVD use in European countries. The other commercially available NGS instruments are being used in the life science arena for clinical research, and in clinical diagnostic laboratories where regulations allow. The manufacturers of these instruments are engaging with the regulatory bodies to determine the best way in which these analyzers can be utilized for diagnostic use.
There are a number of challenges to implementing NGS in the clinical laboratory. These include, among others, achieving a sufficiently simple and reproducible workflow, standardizing data formats, associating mutations with clinical relevance, navigating an unclear pathway to regulatory approval and, most significantly, learning how to interpret and analyze the enormous amounts of data being collated.
Microbiome studies analyze microbial communities found in the human body and are important for elucidating the role of microbes in health and disease, as discussed in the following videos about the American Gut and Human Microbiome Project. The use of NGS provides a cost and time saving method for studying complex microbial samples, requiring just a single sample to analyze the entire microbial community and eliminating the need for microbial cultivation. The diversity (phylogeny and taxonomy) of complex microbial samples can be analyzed by sequencing of conserved genomic regions, such as the 16S rRNA gene or internal transcribed spacer (ITS) in bacterial and fungal samples respectively. The INVIEW™ Microbiome NGS service from GATC Biotech offers a sensitive method to identify complex populations by sequencing multiple hypervariable regions of the 16S rRNA as well as ITS regions. For highest taxonomic resolution, full-length 16S amplicon sequencing is also available.
Figure 13: Prof. Joseph F Petrosino, Baylor College of Medicine, discusses the Human Microbiome Project and microbiome research using Next Generation Sequencing.
NGS technologies have gained the capacity to sequence gigabases of DNA in a high-throughput and highly efficient manner that has not been possible using traditional Sanger sequencing. Compared to traditional sequencing, the read lengths of current NGS approaches are relatively short, which is due to the small sequencing colonies and rapid signal deterioration. This is compensated for by its highly-parallel fashion. Technical and chemical refinements are gradually increasing read lengths in NGS, but only novel technologies will be able to provide substantially longer reads. Since single DNA molecule sequencing technology can read through DNA templates in real time, without amplification, it provides accurate sequencing data with potentially long-reads.
Consequently, novel third generation platforms, with read-lengths as a focus, are currently under development. These new instruments are anticipated to be significantly faster than current technologies, enabling genomes to be sequenced at a lower cost. In addition, new kits and reagents will continue to emerge that will enable NGS to be used for a wider range of applications. The protocols required for library preparation are likely to become more simplified and automation will continue to facilitate more streamlined workflows.
Nanopore sequencing is an exciting new method that is likely to be incorporated into some third generation sequencers. In nanopore sequencing, a DNA strand is processed through a synthetic or protein nanopore and the subsequent changes in the electric current allow identification of the base passing the pore. This will theoretically allow sequencing of a complete chromosome in one step, without the need to generate a new DNA strand.
Oxford Nanopore Technologies is developing a range of protein nanopore-based electronic systems to analyze single molecules such as DNA, RNA and proteins. The technology, which is incorporated in the MinION™ portable device and PromethION™ desktop system (not yet available to the market), sequences DNA by measuring the characteristic disruptions in ionic current as each base on an intact single-stranded DNA polymer passes through a protein nanopore. A high-throughput array chip design enables multiple simultaneous measurements. Read lengths of many tens of kilobases can be achieved, and bioinformatic analyses can be completed in real time, for DNA sequencing applications including resequencing, de novo sequencing and epigenetics.
Learn more about Oxford Nanopore Technologies and Nanopore technology here.
Despite still being in its infancy, NGS has already tremendously changed the landscape of biological research and has begun to engage with clinical practice. In the next few decades, it is anticipated that genomic medicine, driven by NGS, will profoundly change the diagnosis, prognosis, and therapy of human diseases.
Using NGS for personalized medicine is the ultimate goal for many. There are, however, a number of challenges that must be adequately addressed before NGS can be transformed from a research tool to a routine clinical practice. Rapid interpretation of the masses of data produced currently requires highly specialized software, and represents one of the biggest obstacles in bringing whole genome sequencing routinely to the clinic.
"The QIAcube is very helpful for minipreps and gives you time to do something else. It is also nice for gel extractions..."
Marco Vilani, Novartis
"The software’s interface is extremely simple and has been designed to follow the different stages of the whole process. It is thus very easy to create one’s own protocols step by step..."
Stephane Roy, IntegraGen