This is taken from Brourd's lab, SHAPE Profile - Most Likely Base Pair C/G. The design is the one I submitted (in natural mode), and I guess you'd agree that it looks fairly straightforward and boring.
The statistics look rather good: a few more GC pairs than AU ones, and having no GU pairs at all is maybe a little excessive, but nothing one would normally worry about. And -84.4 kcal/mol, well, that should be seriously stable, right?
The graphs would also make one confident that this design has a fair chance to work well. The dot plot is very clean, and the melt plot has this "flat start" feature (highlighted in orange), which promises that the molecule isn't going to misfold as soon as the temperature climbs a little: another promise of stability.
In EteRNA, it seems to me that the melt plot is the most mysterious of the tools and measures available to us. What's a good melt plot? What does a bad one look like? What do certain shapes in those plots mean or imply? Surprisingly to me, I found little to read about the topic...
In the melt plot above, I added a few lines and markers about things I'd like to talk about today.
I also added purple lines at 65% and 100%, because the melting point (or melting temperature) seems to be defined differently depending on the context. Here at EteRNA, the melting point seems to be the moment when no pairs can hold any longer. In the graph above, some pairs are still predicted to be present at 97°C, and the statistics mention 107°C as the melting point. Knowing that EteRNA only runs 7 sample simulations (from 37°C to 97°C in steps of 10°C), I believe it is correct to read a 107°C melting point as equivalent to the sentence: "the melting point is somewhere above 97°C".
In the scientific literature, though, the melting point is normally the point where the folded RNA has lost half of its pairs. Hence the purple lines, with one positioned at 65%, midway between the baseline (30%) and the fully unpaired state (100% unpaired = 0% paired). According to this convention, the melting temperature (usually denoted Tm) is not higher than 97°C; it's somewhere between 87°C and 97°C. This "detail" doesn't matter much for the points I want to make later, but it is worth mentioning so that you know what is being talked about when you read about an RNA's Tm value in a scientific paper, because it is quite possibly not comparable to the Tm we see in EteRNA.
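To make the "midway" convention concrete, here is a minimal Python sketch of how one could estimate that Tm from a sampled melt curve by linear interpolation. The function name `midpoint_tm` and the percentages in `unpaired` are made up for illustration (they are not read off the plot above); only the 7 sample temperatures and the 30% baseline come from the text.

```python
def midpoint_tm(temps, unpaired, baseline=None):
    """Estimate Tm as the temperature where the unpaired percentage
    crosses the midpoint between the baseline and 100% unpaired.

    temps    -- sample temperatures in degrees C, ascending
    unpaired -- predicted % of unpaired bases at each temperature
    baseline -- % unpaired in the folded state (defaults to the first sample)
    Returns None if the curve never reaches the midpoint within the samples.
    """
    if baseline is None:
        baseline = unpaired[0]
    target = (baseline + 100.0) / 2.0  # e.g. (30 + 100) / 2 = 65
    samples = list(zip(temps, unpaired))
    for (t0, u0), (t1, u1) in zip(samples, samples[1:]):
        if u0 < target <= u1:
            # linear interpolation between the two bracketing samples
            return t0 + (target - u0) * (t1 - t0) / (u1 - u0)
    return None


# The 7 temperatures EteRNA samples, per the text above.
temps = [37, 47, 57, 67, 77, 87, 97]
# HYPOTHETICAL curve with a flat start at 30% unpaired (illustration only).
unpaired = [30, 30, 30, 32, 38, 50, 80]

print(midpoint_tm(temps, unpaired))  # 92.0 for this made-up curve
```

With a curve that never reaches the midpoint by 97°C, the function returns `None`, which is the same kind of statement as "the melting point is somewhere above 97°C".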
Now, you know how the EteRNA labs work, right?
- DNA templates are ordered
- they get amplified by PCR
- RNA is then synthesized
- RNA is heated to 90°C and left to cool back down to room temperature
- RNA is then probed with a chemical reagent (SHAPE, DMS, CMCT, etc)
Which leaves me with two questions:
- what the bloody hell is the point of the heating/cooling step?
- in the case of a folded sequence with a melting point near or above 90°C, what will be the effect of the heating/cooling step?
It seems to me that the whole point of the heating/cooling step is to eliminate transcriptional effects. During synthesis, RNA may already start folding while it is still being transcribed, as you probably know. Heating the RNAs afterwards is supposed to provide experimental conditions that stay comparable. In other words, it is hoped that all sequences will start from a 100% unpaired state, with 100% of the bases already present in the sequence. Is that what actually happens for all synthesized sequences?
Probably not. Take the example above. If the sequence reached the MFE structure after transcription, which is quite possible, heating it to 90°C won't affect it much: it will temporarily lose a few pairs and rapidly get them back as it cools down. The problem is that during transcription the sequence may also end up not in the MFE structure but in another, suboptimal and quite stable fold, and the same reasoning about the heating step would then apply to that fold.
You would maybe say that the pairing probabilities dotplot shows no signs of even remotely possible misfolds...
Well, this is a suboptimal fold of that same sequence, and...
... sure, it's some 25 kcal/mol away from the MFE, but wouldn't you say that -60 kcal/mol is something quite stable?
- a suboptimal fold that is 20 kcal/mol worse than the predicted MFE is totally invisible in dot plots
- if the MFE is on the order of -25 to -30 kcal/mol, it doesn't matter, because that suboptimal fold, with its free energy of about -5 kcal/mol, is most likely not a long-lived one
- however, the lower the free energy of the MFE, the higher the risk of suboptimal folds that are both undetectable and long-lived
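A rough way to see why such a fold leaves no trace in the dot plot is to compare Boltzmann weights. The sketch below is a simplification of my own, not what EteRNA computes: it only compares one suboptimal structure against the MFE structure at equilibrium and ignores the rest of the ensemble. The function name `relative_boltzmann_weight` is hypothetical; the gas constant and the exp(-ΔG/RT) formula are standard.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def relative_boltzmann_weight(delta_g, temp_c=37.0):
    """Equilibrium weight of a structure lying delta_g kcal/mol above
    the MFE, relative to the MFE structure itself (weight 1.0).

    Assumption: two-state comparison only, no partition function.
    """
    rt = R_KCAL * (temp_c + 273.15)  # about 0.62 kcal/mol at 37 degrees C
    return math.exp(-delta_g / rt)


# Gaps discussed in the text: 20 kcal/mol, and -84.4 vs. roughly -60 kcal/mol.
for gap in (20.0, 84.4 - 60.0):
    print(f"{gap:5.1f} kcal/mol above MFE -> relative weight "
          f"{relative_boltzmann_weight(gap):.2e}")
```

For a 20 kcal/mol gap the relative weight is on the order of 1e-14, far below anything a pairing-probability plot can display. Note that this equilibrium weight depends only on the gap, not on how stable the suboptimal fold is in absolute terms, which is exactly why a long-lived -60 kcal/mol misfold can stay hidden.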
So, why did I leave such a "dangerous" misfold in my design if I'm aware of it? Well, this is actually intentional. This strategy, for which I coined the word "backburn", would be applied precisely in cases where the predicted MFE free energies are very low (as in this lab), and thus where potentially dangerous misfolds are hard to foresee. So instead of hoping for the best, the idea is to "design" a specific suboptimal fold, one with a low energy barrier between it and the target structure. So far, I haven't been very successful with this sort of scheme, and I'm still struggling to find a proper mechanism that would ensure an easy transition from a specified suboptimal structure to the target structure. Possibly, you realize how this work is connected, at least conceptually, to switches...
(work in progress)