Some lab results finally came out, so I guess it's time for write-ups.
Contrary to what a number of players may think, player-proposed labs are most of the time not simply silly experiments of the form "hey doods, can we actually do this random shape in the lab?". Amazingly, most lab admins are smart enough to propose targeted labs, which have a meaningful purpose and try to help answer a real question, or maybe determine a set of rules or properties. Ok, ok, </end_of_rant>
The Don Quixote lab was an attempt at understanding what had happened in Semicircle 2 bends. I had presented the "theoretical" background of my ideas in the A-minor blog post. Briefly, I posited that the 3' tail, with all its unpaired Adenines, could possibly wrap itself around other stable stacks, forming A-minor motifs and preventing the barcode hairpin from forming by keeping its ends separated by a distance.
Considering the positive results of the various pseudoknots labs I had previously proposed, it came to my mind that we could attempt to lock the 3' tail in one such structure, thereby reducing its suspected deleterious effect elsewhere. Well, at least I thought it was worth a shot :D
My personal views of the results are the following:
- this lab run failed to give a definitive answer as to whether sequestering the 3' tail with a pseudoknot was decisive or helpful in stabilizing barcodes
- but it made it clear that it is quite dangerous to mess (too hard) with the reverse transcriptase, and interestingly, the designs failing for that reason can be separated into essentially two different classes.
Experimental data review
The very first step of reverse transcription is the binding of a complementary DNA strand to the 3' tail of the RNA designs. From there, the reverse transcriptase enzyme (RT) synthesizes a complementary cDNA sequence by adding nucleotides (DNA ones, notice the lighter shade of blue for the Thymine in the picture above), moving along the RNA template in the 3' to 5' direction (the cDNA strand itself therefore grows 5' to 3').
3' tail duplexes
The following designs show a high level of reactivity errors (sometimes at all nucleotide positions), and the reason seems simple:
In this case, it is rather clear that one of the segments indicated by white arrows will preferentially bind to the 3' tail, thereby most likely preventing the complementary DNA strand from binding at the same location.
Same story (thanks to Brourd for pushing the limits and trying this).
More generally, the results in this lab would tend to show that:
- a 4 bp long pseudoknot doesn't form (with the specific sequence constraints in this lab)
- a 10 bp long complementary sequence is just too strong and doesn't allow for the RT primers to bind properly
- lengths between 5 and 7 bp resulted in a pseudoknot forming (those bases are protected), and still allowed for data collection (the RT primers did successfully bind, otherwise we wouldn't have any data, or only very little with very high error rates)
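For anyone wanting to check a design against these observations before submitting, here is a minimal sketch. It only counts consecutive Watson-Crick pairs between a segment and the tail (no GU pairs, no energy model), and the names `EXAMPLE_TAIL`, `longest_antiparallel_match` and `classify_duplex` are mine, not anything from EteRNA. The tail string is an assumed example, so substitute the actual tail of your lab. The verdict for 8-9 bp (and for anything shorter than 4 bp) is not covered by the results above and is labeled as such.

```python
# Assumed example tail, for illustration only; replace with the lab's real 3' tail.
EXAMPLE_TAIL = "AAAGAAACAACAACAACAAC"

WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_antiparallel_match(segment: str, tail: str) -> int:
    """Longest run of consecutive Watson-Crick pairs between `segment`
    and `tail`, with the two strands running antiparallel."""
    rev_tail = tail[::-1]  # reading the tail backwards gives the antiparallel partner
    best = 0
    # Longest-common-substring style DP, with "equal" replaced by "pairs with".
    prev = [0] * (len(rev_tail) + 1)
    for s in segment:
        cur = [0] * (len(rev_tail) + 1)
        for j, t in enumerate(rev_tail, start=1):
            if (s, t) in WC_PAIRS:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def classify_duplex(n_bp: int) -> str:
    """Map a duplex length onto what was observed in this lab."""
    if n_bp <= 4:
        return "unlikely to form"  # 4 bp observed; shorter lengths are extrapolated
    if n_bp <= 7:
        return "formed, data still collected"  # 5-7 bp observed
    if n_bp < 10:
        return "not covered by this lab"  # 8-9 bp: no data
    return "likely blocks RT primer binding"  # 10 bp observed; longer is extrapolated


if __name__ == "__main__":
    for seg in ("UUGU", "GUUGUUG", "GUUGUUGUUG"):
        n = longest_antiparallel_match(seg, EXAMPLE_TAIL)
        print(seg, n, classify_duplex(n))
```

Keep in mind these thresholds come with the specific sequence constraints of this lab; a GC-rich 4 bp duplex elsewhere might behave differently.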
Once the reverse transcription primers (that's what these pieces of DNA are called) are attached to the RNA, the reverse transcriptase enzyme starts its job, parsing the RNA from the 3' end to the 5' end. In the absence of any special event, it would create a complete cDNA copy of the RNA design. But some things may stop it before it reaches the 5' end of the sequence.
The first possible and rather expected event is when the enzyme reaches a nucleotide that has been modified by the SHAPE probe.
At this point, the enzyme stops and leaves behind a cDNA that is shorter than it should be. And the length of this cDNA is precisely what's measured at a later point in time.
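To make the link between cDNA length and position concrete, here is a rough sketch of the bookkeeping. It assumes 1-based positions counted from the 5' end, a primer whose copy starts at the very last nucleotide of the RNA, and an enzyme that halts just before the modified nucleotide; real pipelines differ on these conventions, and the function names are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List


def cdna_length_for_stop(rna_length: int, modified_pos: int) -> int:
    """Length of the cDNA when RT halts just before `modified_pos`.

    Under the stated assumptions the cDNA covers RNA positions
    modified_pos+1 .. rna_length, primer included.
    """
    if not 1 <= modified_pos <= rna_length:
        raise ValueError("modified_pos must lie within the RNA")
    return rna_length - modified_pos


def stop_position(rna_length: int, cdna_length: int) -> int:
    """Inverse mapping; 0 means a full-length copy (no stop)."""
    if not 0 <= cdna_length <= rna_length:
        raise ValueError("cdna_length must be between 0 and rna_length")
    return rna_length - cdna_length


def stop_counts(rna_length: int, cdna_lengths: Iterable[int]) -> List[int]:
    """Number of stops per RNA position (index 0 holds position 1)."""
    counts = Counter(stop_position(rna_length, n) for n in cdna_lengths)
    return [counts.get(pos, 0) for pos in range(1, rna_length + 1)]


if __name__ == "__main__":
    # Hypothetical 10-nt RNA with three observed cDNA lengths.
    print(stop_counts(10, [6, 6, 3]))
```

Positions with many stops are the ones the SHAPE probe could reach, which is why short, abundant cDNAs point at exposed bases near the 3' end.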
Another (quite logical) reason why the enzyme may stop its work prematurely (or may be severely delayed) is simply when it encounters a stack that it can hardly break open, or cannot break open at all.
The 6 GC pairs long segment in the lower-left area was probably the source of all the troubles in this design.
And Vinnie's "brilliant" mini-Xmas tree here didn't turn out catastrophic, but still seriously impeded the performance of the design.
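If you want to spot such stacks in your own designs, a quick sketch like the following can help. It takes a sequence and a pseudoknot-free dot-bracket structure and reports the longest run of directly stacked GC pairs. The function names are hypothetical, and whether a given run length is "too long" is a judgment call: the only data point here is that 6 consecutive GC pairs probably caused trouble.

```python
def pair_table(structure: str) -> dict:
    """Map each opening-bracket index to its closing-bracket index."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def longest_gc_run(sequence: str, structure: str) -> int:
    """Longest run of consecutive, directly stacked GC pairs."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    pairs = pair_table(structure)
    best = run = 0
    prev = None
    for i in sorted(pairs):
        j = pairs[i]
        is_gc = {sequence[i], sequence[j]} == {"G", "C"}
        # Stacked means the previous pair sits directly outside this one.
        stacked = prev == (i - 1, j + 1)
        if is_gc:
            run = run + 1 if (stacked and run > 0) else 1
            best = max(best, run)
        else:
            run = 0
        prev = (i, j)
    return best


if __name__ == "__main__":
    print(longest_gc_run("GGGGGGAAAACCCCCC", "((((((....))))))"))
    print(longest_gc_run("GGAGGGAAAACCCUCC", "((((((....))))))"))
```

The second example shows how a single AU pair in the middle splits the stack into shorter GC runs.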
To be honest, I never found anywhere a clear explanation for this constraint:
... but, considering the above-mentioned facts, can't you think of something now?
Globally, there were very few failing barcode hairpins in this lab, and the causes for those that did fail to form properly are all traceable back to a general weakness of the designs: low GC count, high GU count, etc.
Also, a good number of designs were submitted which didn't attempt to form a pseudoknot with the 3' tail, and their barcode hairpins had little to no problems either. This fact brings up the question of whether the pseudoknot had any effect at all. The two important differences between this lab and the Semicircle 2 bends one were the presence of an additional stem and, in many but not all cases, a pseudoknot targeting the 3' tail. Maybe the pseudoknots helped, maybe not.
So, what can be taken from the experiment? If anything, we can at least say that a "reasonable" pseudoknot targeting the 3' tail appears to be viable. The RMDB data indicates that most designs had a medium signal to noise ratio.
Don Quixote was a simple test. From the beginning (after seeing the troubles in Semicircle 2 bends), the idea was the one pictured below.
Essentially, replace the UUCG apical loop by a pseudoknot holding the 3' tail locally.
In EteRNA, it could look like this:
And I believe that this idea would be very powerful, if combined with some flexibility. In all the following cases, it would be required that:
- bases 1-84 are measured, 85-end are not (as usual)
- but only bases 1-75 participate in the scoring
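In code terms, the two requirements above amount to something like the sketch below. It assumes a deliberately simplified score (percentage of scored positions whose measured state matches the target), which is my own stand-in and not the actual EteRNA scoring formula; `windowed_score` and its parameters are hypothetical names.

```python
from typing import Sequence

MEASURED_END = 84  # bases 1-84 are measured
SCORED_END = 75    # only bases 1-75 count toward the score


def windowed_score(matches: Sequence[bool],
                   scored_end: int = SCORED_END,
                   measured_end: int = MEASURED_END) -> float:
    """Percentage of matching positions among bases 1..scored_end.

    `matches[k]` tells whether base k+1 behaved as the target structure
    demands. Bases scored_end+1..measured_end are measured but ignored.
    """
    if len(matches) < measured_end:
        raise ValueError("expected data for every measured base")
    scored = matches[:scored_end]
    return 100.0 * sum(scored) / len(scored)


if __name__ == "__main__":
    # A failing barcode region (bases 76-84) leaves the score untouched.
    print(windowed_score([True] * 75 + [False] * 9))
```

The point of the split is that the barcode region still yields data for diagnosis, without penalizing the designer for whatever barcode style the lab admin picked.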
Then, lab admins could be given the choice to work with different barcode "styles".
Elongated, the stack including the barcode would probably give more options to the designers.
Or perhaps a bulge would allow for more variety, who knows.
Or even a weak local pseudoknot, for the paranoids fearing (quite possibly imaginary) A-minor motif interactions :P