Difference between revisions of "User talk:ElNando888/Blog/SHAPE?"

From EteRNA WiKi
(Elaboration on variations in SHAPE scores.)
(replying, and btw, thoroughly enjoying the conversation :))

Revision as of 07:27, 23 July 2013

General

Great topic! I agree with the basic premise that the SHAPE results are affected by numerous things besides the presence or absence of the three base pairings that the game's energy model acknowledges.  Working to figure out those additional factors is perhaps the coolest aspect of EteRNA.

Omei (talk) 20:34, 21 July 2013 (UTC)

Toolset

What tools did you use to create these images?  Something other than RNA Composer to predict the 3D structure?  And the renderings don't look like options I have seen in Chimera, though there could be plugins I don't know about.  I'm especially interested in what you used to predict stacking interactions.

Omei (talk) 20:34, 21 July 2013 (UTC)


The 3D renderings are not simulations; they are all segments of solved structures from the PDB. First, I used FRABASE to identify the PDB entries containing the sequences I was interested in. Then I fetched the structures in Chimera and worked on them there.

Atoms and "normal" bonds are easy to color any way you want, but you need to define your own color to get the translucent white I applied to the backbone (select/structure/backbone/full). For the hydrogen bonds, I currently like to make them look like springs, but that option is relatively hard to reach: tools/general controls/pseudobond panel/hydrogen bonds/attributes/component pseudobond attribute/bond style

And you're right, there is no built-in command to detect and represent stacking interactions. First, and as I was saying, it took me a very long while (months) before I could find the parameters that validate a stacking interaction. I found them on the website of another 3D rendering tool, PyMol, at http://www.schrodinger.com/kb/1556

Then it's arduous work in Chimera, but at least it's possible. There are tools in Structure Analysis which allow you to define centroids, planes and axes, to show or hide these objects, and also to calculate distances and plane angles. Once a pi stacking interaction is validated, I represent the interaction by defining an axis based on the atoms of both aromatic rings.

The other difficulty I had with this topic was ascertaining whether the imidazole ring (the pentagonal ring in purines) is aromatic or not. I don't recall exactly where I read that it actually is, but considering that reference and the various examples I saw in solved structures, I believe that it is.

-- ElNando888 (talk) 05:15, 22 July 2013 (UTC)
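
Editorial sketch of the geometric check described in the comment above (ring centroids, centroid distance, angle between ring planes). This is not the Chimera workflow itself, only a minimal stand-alone illustration in Python. The function names are hypothetical, and the two thresholds are placeholders chosen for illustration; they are not the parameters from the PyMol page cited above.

```python
import math

# Placeholder cut-offs, NOT taken from the cited PyMol page.
MAX_CENTROID_DISTANCE = 4.5   # angstroms (assumed)
MAX_PLANE_ANGLE = 30.0        # degrees (assumed)

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) atom coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def ring_normal(points):
    """Unit normal of a near-planar ring (Newell's method).

    Atoms must be listed in order around the ring.
    """
    nx = ny = nz = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(points, points[1:] + points[:1]):
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def stacking_geometry(ring_a, ring_b):
    """Return (centroid distance, angle between ring planes in degrees)."""
    distance = math.dist(centroid(ring_a), centroid(ring_b))
    na, nb = ring_normal(ring_a), ring_normal(ring_b)
    # Planes have no direction, so use the absolute value of the dot product.
    cos_angle = abs(sum(a * b for a, b in zip(na, nb)))
    angle = math.degrees(math.acos(min(1.0, cos_angle)))
    return distance, angle

def looks_stacked(ring_a, ring_b,
                  max_dist=MAX_CENTROID_DISTANCE,
                  max_angle=MAX_PLANE_ANGLE):
    """True if both rings pass the (assumed) distance and angle cut-offs."""
    distance, angle = stacking_geometry(ring_a, ring_b)
    return distance <= max_dist and angle <= max_angle

def hexagon(z, radius=1.4):
    """Idealised six-membered ring in the plane at height z (test data only)."""
    return [(radius * math.cos(math.radians(60 * k)),
             radius * math.sin(math.radians(60 * k)),
             z) for k in range(6)]

if __name__ == "__main__":
    print(stacking_geometry(hexagon(0.0), hexagon(3.4)))
    print(looks_stacked(hexagon(0.0), hexagon(3.4)))
```

In practice the ring atoms would come from a PDB entry (for a purine, the six-membered and five-membered rings would be passed separately), and the cut-offs would be replaced with whatever criteria one actually trusts.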


The 3D renderings are not simulations; they are all segments of solved structures from the PDB. First, I used FRABASE to identify the PDB entries containing the sequences I was interested in. Then I fetched the structures in Chimera and worked on them there.

Does that mean the association of the specific 3D structures with the 2 hairpins is simply your choice?  If so, could I equally well switch them, and attribute the lower SHAPE reactivity at 26 to the fact that the G has two hydrogen bonds with the opposing A?

I'm not claiming the latter is the right explanation; I'm just trying to better understand your chain of reasoning.

Omei (talk) 16:51, 22 July 2013 (UTC)


No, you could not switch them, because the sequences are clearly different: GGUAAC vs CGUAAG (closing pair swapped).

FRABASE returned in each case a few hits, all of them presenting a similar GNRA-like pattern (which is natural). I didn't verify if the ones I selected were more representative than the others, but I'm fairly sure that all GGUAAC looked very much alike, that all CGUAAG also looked alike, and that the main difference between the two sets was indeed to be found in the position of the first G in the loop relative to the bases forming the closing pair, in other words, their stacking relationships.

And using solved structures seems to me like the best possible option here. Would you rather trust a software simulation than whatever we can manage to measure with XRD or NMR?

-- ElNando888 (talk) 17:45, 22 July 2013 (UTC)

 


Sorry.  As phrased, that was a really dumb question. You stated very clearly that you were comparing two different sequences, but somehow I managed to forget that as I was writing my comment.

What I was actually wondering was how you had figured out that the configuration in the two 3D models you show represented the configuration the sequence took in your design.  I too have spent a lot of time looking at the 3D structure of tetraloop hairpins in PDB, and my general impression is that a 6-base sequence will take on quite different configurations (for reasons unknown), just as the SHAPE results for the same sequence in the same position in the same lab can vary widely.  So while I have hypothesized that certain 3D configurations correspond to certain SHAPE score patterns, I really have no way of confirming or denying this.  I was hoping you had come up with something to help do that. 

Have you gathered many instances of the GGUAAC and CGUAAG loops in lab to see how consistent the SHAPE patterns are?  If they are reasonably consistent, and the PDB structures are consistent for this sequence across distinct molecules (there are a lot of duplications of the same molecule in PDB), then your explanation carries a lot of weight.

 


Ah, I understand better now. In that case, I say ensemble diversity. When you say:

the SHAPE results for the same sequence in the same position in the same lab can vary widely

I would tend to think that the ensemble diversity of these designs was the cause of the discrepancies. When a shape is pretty stable, like a GNRA tetraloop for instance, specific sequences will stay stable in a very specific 3D conformation.

In the case of the design I'm presenting on the page we're discussing, I have only my experience, my intuition and a few clues (the familiar tetraloop signatures, and the multiloop) that tell me that the ensemble diversity had to be very low for this design. In which case, I can trust that there were no misfolds of any kind and that the SHAPE signatures are associated with only one possible 3D structure. (sidenote: I use Vinnie to "finish" designs, precisely for creating them with the lowest possible ensemble defect)

And for the reproducibility and verifiability, I need time (always a rare resource), and tools, so I probably need to get my head into your software, rather sooner than later ;)

-- ElNando888 (talk) 21:37, 22 July 2013 (UTC)


As an example of what I consider to be typical, here are the SHAPE results from the barcode hairpin for a lab that I just happened to have open (Triloop Buffet).

Diversity of SHAPE values.PNG

I filtered the query to only include designs with an overall score of 85 or better, to rule out any gross misfolds.  So this shows the SHAPE scores for the 14 designs that were reasonably good overall and used the same UC/GA assignments for the two pairs closing the barcode hairpin loop.  As you can see, of the 8 positions displayed, position 76 is really the only one that is solidly consistent.
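
Editorial sketch of the kind of filter-and-compare step described above: keep designs scoring 85 or better, then look at how much the SHAPE value varies at each position. The record layout (`score`, `shape`) and all numbers below are hypothetical, made up purely to show the calculation; they are not data from the lab.

```python
from statistics import mean, pstdev

def position_spread(designs, min_score=85):
    """Per-position (mean, population std dev) of SHAPE values.

    `designs` is a list of dicts with a numeric "score" and a "shape"
    dict mapping sequence position -> SHAPE value (assumed layout).
    Only positions present in every retained design are reported.
    """
    kept = [d for d in designs if d["score"] >= min_score]
    if not kept:
        return {}
    positions = set(kept[0]["shape"])
    for d in kept[1:]:
        positions &= set(d["shape"])
    return {
        pos: (mean(d["shape"][pos] for d in kept),
              pstdev([d["shape"][pos] for d in kept]))
        for pos in sorted(positions)
    }

if __name__ == "__main__":
    # Invented example records, for illustration only.
    example = [
        {"score": 90, "shape": {76: 0.1, 77: 0.5}},
        {"score": 88, "shape": {76: 0.1, 77: 0.9}},
        {"score": 60, "shape": {76: 0.9, 77: 0.1}},  # dropped by the filter
    ]
    for pos, (avg, sd) in position_spread(example).items():
        print(pos, round(avg, 3), round(sd, 3))
```

A position with a small standard deviation would correspond to one that is "solidly consistent" across designs; a large one flags the kind of variation being discussed.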

Now I have been thinking that ensemble diversity couldn't account for this variation, because (I presumed) there must be large numbers of copies (thousands? -- I really don't know) of the RNA molecules present for each design.  So while any one molecule might stay in a particular configuration for many seconds, there would be enough molecules that when averaged over all the instances, the error due to ensemble diversity would be small.

Do I need to rethink this?
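
Editorial sketch of the averaging argument above, under a deliberately simple assumption: each molecule independently sits in one of two conformations, A with probability p_a (reactivity r_a at some position) or B otherwise (reactivity r_b). The two-state model, the function name and the example numbers are all assumptions for illustration.

```python
import math

def averaged_signal(p_a, r_a, r_b, n_molecules):
    """Expected value and standard deviation of the reactivity averaged
    over n_molecules independent molecules in a two-state model."""
    expected = p_a * r_a + (1 - p_a) * r_b
    # Standard deviation of a binomial proportion, scaled by the
    # reactivity difference between the two states.
    sd = abs(r_a - r_b) * math.sqrt(p_a * (1 - p_a) / n_molecules)
    return expected, sd

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, averaged_signal(0.5, 1.0, 0.0, n))
```

Under this model the sampling noise shrinks as 1/sqrt(N), which matches the intuition that averaging over many copies washes out molecule-to-molecule fluctuation. The expected value itself still depends on p_a, though, so two designs whose ensembles populate the states differently would give different averaged values even with very many copies.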


I don't pretend to know everything there is to know about SHAPE and RNA, and for instance, I'm a lot less familiar with UUCG than I am with GNRA. But examining the example(s) you're providing, I would observe that it's probably extremely difficult to draw conclusions when you're lacking part of the data. In the case discussed on this wiki page, the whole stacks and the multiloop are clearly visible in the SHAPE data. For the hairpin loop you're presenting, you're missing a significant amount of information, namely an entire half of a (presumed) stack.

CL15 SHAPE UCUUCGGA.png

This picture includes all results with the pattern UCUUCGGA, and it also seems to me that many records show clear signs that the stack didn't form as planned, even for highly scored designs...

Restricting the view to the pattern itself may also miss important global clues. In this lab for instance, which I haven't analyzed in detail, I know that the basic 2D structure is supposed to create a specific 3D pattern, namely a double coaxial stacking. Instead of a cross, the result is typically 2 helices which are bound by their center. Often, these helices are even parallel and engage in a large number of tertiary interactions, so the formation of pseudoknots or kissing hairpins cannot be excluded. Given the proximity of the barcode to this structure, I believe we also can't exclude that the barcode may have interacted with the central shape in some cases.

For all these reasons, I would be extremely careful before drawing conclusions about the results of this lab.

-- ElNando888 (talk) 07:27, 23 July 2013 (UTC)
