In Vivo: November 2010

Are you tired of sequencing your gene of interest 800bp at a time? Sick of straining your eyes staring at a fuzzy chromatogram? Fed up with waiting hours for your PCR to finish only to realize you forgot to add ddNTPs to your reaction mix? Well, next-gen DNA sequencing is for YOU!

Cheesy sales-schtick aside, next-generation DNA sequencing technologies are on the rise (and are poised to soon become current-gen technologies - some already are!). The days of loading your PCR'd sequencing mix onto an automated capillary sequencer may be numbered. So, to make the handover of power to our mighty new sequencing overlords a little easier, this series of posts will be dedicated to how the upcoming technologies work, their advantages over conventional sequencing technologies, and their problems.

Today's post: Pyrosequencing

Pyrosequencing is perhaps the next-gen sequencing technology most like current-generation automated sequencing (if you need a reminder on how that works, I've written about it in the past). It still requires primer design and rounds of PCR, but the method of detecting incorporated nucleotides differs.

Pyrosequencing starts by adding your PCR'd sequence to a reaction mix that contains DNA polymerase and three other important enzymes: ATP sulfurylase, apyrase and luciferase. ATP sulfurylase is an enzyme that converts a pyrophosphate molecule into ATP. Luciferase is an enzyme which uses ATP to convert luciferin into oxyluciferin, resulting in the emission of light. Apyrase's job is to degrade unincorporated nucleotides. Given that the addition of a nucleotide to a growing DNA strand results in the release of a pyrophosphate molecule, keen readers might already see how pyrosequencing works.

With conventional automated chain-termination sequencing techniques, we add all of our nucleotides to the reaction mix at once; the bases can easily be distinguished because each chain-terminating ddNTP carries a different fluorescent dye. With pyrosequencing, however, one cannot toss in all the nucleotides at the same time. Rather, we have to add them one at a time, in a fixed order. That is, we first do a reaction with A, then with T, then with C and lastly with G. We then repeat this cycle over and over until the sequencing reaction is complete (if this sounds confusing, hopefully the diagram and video below will clarify it). Why we do this will be apparent in a moment.

When a nucleotide is incorporated into the DNA strand, a pyrophosphate molecule is liberated¹. This pyrophosphate molecule can then be converted by ATP sulfurylase into a molecule of ATP. Luciferase picks up this ATP² molecule and uses it to catalyze the conversion of luciferin to oxyluciferin. This chemical conversion results in the emission of a photon. This chain of events, then, means that when a nucleotide is added by DNA polymerase to the growing DNA chain, we get the emission of light. A computer with a suitable detector can detect this light, giving us an indication of when a nucleotide was added in our sequencing reaction.
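Written out as reactions, the chain of events above looks roughly like the following. This is a summary sketch rather than part of the original walkthrough; one detail added here is APS (adenosine 5'-phosphosulfate), the co-substrate the sulfurylase (usually called ATP sulfurylase) needs in order to turn pyrophosphate into ATP.

```latex
\begin{align*}
(\mathrm{DNA})_n + \mathrm{dNTP}
  &\xrightarrow{\text{DNA polymerase}} (\mathrm{DNA})_{n+1} + \mathrm{PP_i} \\
\mathrm{PP_i} + \mathrm{APS}
  &\xrightarrow{\text{sulfurylase}} \mathrm{ATP} + \mathrm{SO_4^{2-}} \\
\mathrm{ATP} + \text{luciferin} + \mathrm{O_2}
  &\xrightarrow{\text{luciferase}} \mathrm{AMP} + \mathrm{PP_i} + \text{oxyluciferin} + \mathrm{CO_2} + h\nu
\end{align*}
```

Each incorporated nucleotide releases one pyrophosphate, which yields one ATP, which in turn drives one light-producing luciferase reaction.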

But if we add all of our dNTPs at once, then how do we know which ones are being incorporated? This is why we only add one nucleotide to the mix at a time. First the reaction is run using dATP. The apyrase then degrades any remaining nucleotides and we repeat with dTTP, and so on. If we add the nucleotides one at a time, then they will be incorporated (or not incorporated) into the sequence one at a time, and consequently we get one light signal at a time.
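To make the one-nucleotide-at-a-time cycle concrete, here is a toy simulation in Python. It is only a sketch under simplifying assumptions: the function name `simulate_pyrogram`, the default dispensation order and the idea that the signal equals the exact number of bases incorporated are all illustrative choices, and real chemistry is noisier.

```python
from itertools import cycle


def simulate_pyrogram(read, order="ATCG", max_dispensations=1000):
    """Return one (nucleotide, signal) pair per dispensation.

    `read` is the strand being synthesised (the complement of the template).
    The signal is idealised as the number of bases incorporated in that
    step: 0 means no light, 2 means twice the light of a single base.
    """
    flows = []
    pos = 0  # index of the next base the polymerase needs to add
    for nt in cycle(order):
        # Stop when the read is finished (or as a guard against bad input).
        if pos >= len(read) or len(flows) >= max_dispensations:
            break
        count = 0
        # The polymerase keeps extending while the next base matches
        # the nucleotide currently in the mix.
        while pos < len(read) and read[pos] == nt:
            count += 1
            pos += 1
        # Leftover nucleotide is degraded before the next dispensation,
        # so each step starts from a clean mix.
        flows.append((nt, count))
    return flows


if __name__ == "__main__":
    for nt, signal in simulate_pyrogram("GCAGGCCT"):
        print(nt, "#" * signal)
```

Most dispensations produce no light at all, because the nucleotide on offer is not the one the polymerase needs next; only the matching ones show up as peaks.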

The light signals are recorded on a chart called a pyrogram. This chart records not only which nucleotides resulted in the emission of light but also the intensity of that signal. If three dTTP molecules were incorporated at once, then three pyrophosphates would be released and roughly three times as much light emitted; this would show up as a peak three times as high on the pyrogram. From the pyrogram, then, one can read off the DNA sequence. The figure to the left shows one such pyrogram. The sequence in this example would be GCAGGCCT.
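Reading a pyrogram can be sketched in a few lines of Python. This is a minimal illustration, not how any particular instrument's software works: the function name `decode_pyrogram` is hypothetical, and the peak heights below are made-up values chosen to spell out the example sequence, not measurements taken from the figure.

```python
def decode_pyrogram(flows):
    """Turn (nucleotide, peak_height) pairs into a sequence.

    Peak heights are assumed to be normalised so that a single
    incorporated base gives a height of about 1.0. Each height is
    rounded to the nearest whole number of bases.
    """
    return "".join(nt * round(height) for nt, height in flows)


if __name__ == "__main__":
    # Hypothetical, slightly noisy peak heights (illustrative only).
    example = [
        ("G", 1.0), ("C", 0.9), ("A", 1.1),
        ("G", 2.1), ("C", 1.9), ("T", 1.0),
    ]
    print(decode_pyrogram(example))
```

Dispensations that gave no light can be included with a height near zero; they round to zero bases and contribute nothing to the sequence.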

The following video puts the whole process together nicely.

So why would one choose pyrosequencing over automated chain-termination methods? Well, for one, it's cheaper (though not as cheap as some of the other upcoming next-generation sequencing methods). Practically, it's easier to do, since it doesn't require running through gels or capillaries. Analyzing the resulting pyrogram is also easier than analyzing a chromatogram. Chromatograms can often be spotted with "N"s when the computer cannot tell if the wavelength of light from a dye is one color or the other; with pyrosequencing, however, only one nucleotide is in the mix at a time, so the question for each one is simply whether light was emitted or not, and the results are clearer.

Pyrosequencing does have its drawbacks, though. It requires a greater number of PCR cycles than traditional sequencing does, so it may take longer to complete, especially for longer sequences. Currently, a typical read of sequence data from pyrosequencing is about 300 to 500bp - shorter than the typical 800 to 1200bp you get from chain-termination methods. This, however, is likely to improve as the technology advances and becomes more refined. The shorter reads, though, make it tough to sequence genomic regions containing high amounts of repetitive DNA.

So that is pyrosequencing in a nutshell. Next time: Helicos sequencing!

****************************************

1. NTPs are triphosphates (that's what the TP stands for!), meaning each nucleoside (base+sugar) is attached to a chain of three phosphate groups. To add a nucleotide to a growing DNA strand, the reaction requires an input of energy. This energy comes from the breaking of the triphosphate chain in each nucleotide; two phosphates are broken off and released as a pyrophosphate (PP_i) molecule, and the remaining portion of the nucleotide is attached to the hydroxyl group on the 3' carbon of the previous nucleotide in the sequence.

2. Observant readers might be confused here. Since luciferase requires ATP to convert luciferin to oxyluciferin, won't this cause a problem when we add dATP to the sequencing reaction? Won't there be competition between DNA polymerase and luciferase? Well, if you thought that, then you'd be right! For this reason, we use dATPαS instead of dATP for pyrosequencing. This molecule has a sulfur atom in place of one of the oxygen atoms on the α phosphate of the nucleotide; DNA polymerase can still incorporate it, but luciferase cannot use it as a substrate. Problem solved!

Thursday, 25 November 2010

Next-Generation DNA sequencing: The Future is Now! Part 1: Pyrosequencing

Sunday, 14 November 2010

Dawkins' Answers Some Questions

Sunday, 7 November 2010

"Science knows it doesn't know everything. Otherwise, it'd stop."