The Power of Long-Reads
Next-generation sequencing (NGS), notably on the platforms commercialized by Illumina, has revolutionized the field of genomics. Not only has it brought down the cost of sequencing complex genomes such as the human genome, it is also widely used in lesser-known fields such as forensics, environmental monitoring, and paleontology.
Illumina short-read sequencing
The “sequencing-by-synthesis” strategy of NGS records each nucleotide as it is added to a new DNA strand copied from a fragment of the input library. Because every fragment is first amplified into many identical copies, and millions of fragments are read in parallel, the output data is enormous, allowing accurate reconstruction of the input library. This is the power of short-read sequencing.
From: Applications of Clinical Microbial Next-Generation Sequencing
Unfortunately, this advantage is also its weakness. Many features of the DNA pool may be lost once the DNA molecules are broken down into small pieces. Consider a region with a “CAG” tandem repeat that is longer than 200 to 300 bp: it would be extremely hard to know exactly how many repeats are present in the sample. Another example is the alternative splice forms of an mRNA. The fragmented reads need extensive bioinformatic effort to reassemble, and short reads may not be able to capture the whole picture (see example below). Epigenetic information is also lost during the rounds of PCR, because the enzymes responsible for these modifications do not exist in the ex vivo environment.
Assembly with short reads
Short-read sequencing provides a fragmented view of the isoforms, which may be hard to reconstruct into their original forms.
Image credit: Pacb.com
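The tandem-repeat problem described above can be illustrated with a small sketch. It is not a real repeat-genotyping tool: the flanking sequences, the 150 bp read length, and the `count_repeat` helper with its `min_flank` parameter are all made up for illustration. A read only gives a trustworthy copy number when it covers the whole repeat plus some unique sequence on both sides.

```python
import re

def count_repeat(read, unit="CAG", min_flank=10):
    """Return (copies, spanned) for the longest run of `unit` in `read`.

    `spanned` is True only if at least `min_flank` bases of other
    sequence sit on both sides of the run, i.e. the read covers the
    entire repeat and the count can be trusted.
    """
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    best = max(pattern.finditer(read),
               key=lambda m: m.end() - m.start(),
               default=None)
    if best is None:
        return 0, False
    copies = (best.end() - best.start()) // len(unit)
    spanned = (best.start() >= min_flank
               and len(read) - best.end() >= min_flank)
    return copies, spanned

# Hypothetical locus: 20 bp flank + 100 CAG copies (300 bp) + 20 bp flank.
left_flank = "ATGGCGACCCTGGAAAAGCT"
right_flank = "CCGCCACCGCCGCCTCCTCA"
locus = left_flank + "CAG" * 100 + right_flank

long_read = locus            # a long read covering the whole locus
short_read = locus[100:250]  # a 150 bp read that falls inside the repeat

print(count_repeat(long_read))   # (100, True)
print(count_repeat(short_read))  # (49, False)
```

The short read sees only 49 copies and has no flanking sequence, so all it can say is that the repeat has at least that many units. The long read recovers the full count of 100.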
Enter long-read sequencing. Also termed “third-generation sequencing”, long-read sequencing technologies target the limitations of massively parallel sequencing with short reads, and are paving a unique way into uncharted areas of genomic research.
Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT, or just Nanopore for short) are the two major players in the arena of long-read sequencing. Both technologies rely on signal detection at the single-molecule level, and generate consensus sequences by collecting output from millions of DNA molecules. (In the case of Nanopore, RNA can also be sequenced directly.)
PacBio Sequencing Overview
PacBio’s power comes from the continuous action of rolling circle amplification (RCA), where a special DNA polymerase glides around a circular template and keeps adding fluorescently labeled nucleotides. The fluorescence emitted by these nucleotides is recorded in a movie to determine the sequence. In a PacBio SMRT cell, there are millions of “studios”, tiny wells that each host a single DNA molecule, where the DNA polymerase plays the leading role with four flashing lights: A, G, T and C. When the polymerase meets a modified base, it slows down, producing a prolonged gap in fluorescence detection. Because the template remains the original molecule, this delay is recorded in every pass around the circle (each pass is a subread) and can be used to decipher the epigenetic map of the genome.
Source: CCS.how
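To give a feel for why repeated passes over the same circular template help, here is a minimal sketch of building a consensus from subreads by majority vote. It assumes the subreads are already aligned and of equal length, which is a strong simplification: real consensus calling has to align the subreads and handle insertions and deletions. The `majority_consensus` function and the toy sequences are hypothetical.

```python
from collections import Counter

def majority_consensus(subreads):
    """Pick the most common base at each position across aligned subreads."""
    if len({len(s) for s in subreads}) != 1:
        raise ValueError("subreads must be non-empty and of equal length")
    # zip(*subreads) yields one column (one position) at a time
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*subreads))

# Five made-up passes over the same molecule; three of them carry
# a single random error at a different position.
subreads = [
    "ACGTTGCA",
    "ACGATGCA",  # error at position 3
    "ACGTTGGA",  # error at position 6
    "TCGTTGCA",  # error at position 0
    "ACGTTGCA",
]

print(majority_consensus(subreads))  # ACGTTGCA
```

Because the errors in individual passes are random, they rarely hit the same position twice, so the vote recovers the original sequence even though most passes contain a mistake.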
In the case of Nanopore, a different power is at play: electricity. As the name suggests, nanopores are tiny pores formed by special proteins. These protein pores are unique in that they allow only one nucleic acid molecule to pass through at a time. As the molecule passes, each nucleotide generates a signature ion current. By recording the electric current, one can determine the sequence of the nucleotides that passed through the pore. Similar to the fluorescence delay in PacBio sequencing, Nanopore can detect altered patterns in the current and thereby identify epigenetic modifications.
Nanopore Overview
Every DNA or RNA molecule that passes through the pore has its ion current profile recorded. The “profile” is then converted to a sequence composed of A, G, T/U, or C.
Source: Wikipedia
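The idea of converting a current profile into bases can be sketched with a toy nearest-level decoder. Everything here is an assumption made for illustration: the current values in `LEVELS` are invented, and the sketch pretends each base has its own current level. In practice the signal depends on several neighboring nucleotides at once, and real basecallers rely on trained models rather than a lookup like this.

```python
# Invented mean current levels (in picoamps) for each base.
# These numbers are purely illustrative, not measured values.
LEVELS = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}

def call_base(current):
    """Return the base whose reference level is closest to `current`."""
    return min(LEVELS, key=lambda base: abs(LEVELS[base] - current))

def decode(events):
    """Convert a list of per-nucleotide current readings into a sequence."""
    return "".join(call_base(current) for current in events)

# A made-up, slightly noisy current profile for one molecule.
profile = [81.2, 94.1, 111.5, 124.0, 79.3]

print(decode(profile))  # ACGTA
```

A modified base would show up in such a profile as a reading that does not sit close to any of the expected levels, which is the kind of altered pattern that modification detection looks for.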
Many factors play into the selection of a sequencing method, including cost, throughput, and accuracy. Despite the powerful potential of long-read sequencing methods, they still play a complementary role to short-read methods. Yet with ever-evolving technology and analysis methods, breakthroughs are on the horizon.
To learn more about these technologies and what we can do for you, please visit our website: www.uniproteios.com