User talk:Brourd

From Eterna Wiki

Revision as of 23:11, 10 March 2014 by Brourd (talk | contribs) (Answering a few questions)

Latest comment: 10 March 2014 by Brourd


 

 

== Background ==

Hi Brourd,

I think it would be good to present the motivations behind the whole project. Since I'm not entirely sure I'm getting 100% of it myself, I'm going to write down what I understood, and I'd appreciate your correcting me when I'm wrong.

 

I believe both the scientists and the players would love to try designing switches again. There are issues though, related to the new pipeline, Cloud Lab.

  • it doesn't make sense to run switch-related sequences along with mono-state ones. It would represent a costly waste of resources
  • making a round with exclusively switching sequences is basically twice the work for the lab, and probably also twice the price

 

So, what can we do? Some time ago, an idea emerged that could be used in conjunction with Cloud Lab as it is now, without disturbing the cycle too much. Essentially, the idea consists of submitting the same sequence twice: once with the FMN binding site, and once with that area mutated to something else, which we will dub the "mimic". Assuming the mutation has the same effect as a real FMN binding to the aptamer (in other words, providing a bonus of about -4.9 kcal/mol), this system would allow us to screen for good switch candidates. Once we have accumulated enough of those, a dedicated lab run could be done, this time with real FMN added.
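To get a feel for what a bonus of about -4.9 kcal/mol means, here is a minimal sketch that converts a free energy difference into a Boltzmann weight. The temperature (37 °C) and the function name `boltzmann_factor` are assumptions made for illustration; the discussion does not say under which conditions the bonus applies.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 0.0019872


def boltzmann_factor(delta_g_kcal, temp_celsius=37.0):
    """Relative statistical weight exp(-dG / RT) for a free energy change dG.

    temp_celsius=37.0 is an assumption, not a value taken from the discussion.
    """
    rt = R_KCAL_PER_MOL_K * (temp_celsius + 273.15)
    return math.exp(-delta_g_kcal / rt)


# Bonus quoted in the discussion for FMN binding (or for a perfect mimic).
fmn_bonus = -4.9
weight = boltzmann_factor(fmn_bonus)
print(f"Bound state favored by a factor of about {weight:.0f}")
```

Under these assumptions the bonus favors the bound fold by a factor of a few thousand, which is why a mimic that contributes noticeably less (or more) stabilization than FMN could mislead the screening.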

If I understand correctly our purpose in this project, our goals are:

  • first, to create good mimics
  • second, to start generating good switches

 

About the first step, I may have missed something, but I couldn't find any documents, on either EteRNA or RMDB, related to tests with mimics that the Das Lab may have already run. The only thing I heard (from you, Brourd) is that they tested this:

Mimic 1.png Mimic 2.png

This, I would argue, doesn't look very good to me, but let's leave this topic for later.

 

And once we have (or at least, we think we have) good enough mimics, the second step sounds almost "small and easy": design targets, and sequences for those. And voila :)

-- ElNando888 (talk) 15:07, 20 February 2014 (UTC)

 


 

So, the question of where the FMN mimic sequence originally started...

The FMN Mimic was actually run for 2 designs of every lab in Round 70 of the cloud lab, aka, the first and only round of FMN switches in the cloud lab so far.

As you may recall, I actually wrote something about it, as well as a question of the repeatability of the riboswitch scoring, a very, very long time ago: https://getsatisfaction.com/eternagame/topics/fmn_mimics_and_the_repeatability_of_switch_results

In addition, Dr. Rhiju Das wrote a little blog post (http://eterna.cmu.edu/web/blog/2891462/) about how his Ph.D. students participated in the first round of the EteRNA switches, and if I recall correctly, these switches were made and scored with the FMN mimic.

All the chemical mapping data for the FMN switches and mimics are available in round 70 (http://rmdb.stanford.edu/repository/detail/ETERNA_R70_0000) and round 71 (http://rmdb.stanford.edu/repository/detail/ETERNA_R71_0000) on the RMDB. I'm sure you can figure out what to do from here!

 

So, back to the motivation and goals for the project, and why they were not included in the original post.

It was 1 am :P (I'll work on including that)

However, your observation is correct. The goal of the FMN mimic is essentially to create a protocol in which a multi-state RNA system based on the binding of a ligand can be tested in the absence of said ligand (other multi-state systems may be a tad more difficult). This protocol could potentially extend to other aptamers as well, which would be a future goal. In addition, this project has two main goals, as well as a few secondary goals:

  1. The characterization and creation of successful riboswitch constructs.
  2. The characterization and implementation of successful riboswitch design rules.
    1. Potential reworking of riboswitch scoring based on the SHAPE chemical mapping protocol.
    2. A comparison of riboswitches with canonical base pairs versus those with noncanonical base pairs.
    3. Implement a pipeline for the testing of riboswitches using the Das Lab's high throughput chemical mapping protocol.
    4. Determine if current automated algorithms can design successful riboswitches (NUPACK, ViennaUCT, any other publicly available multistate design algorithms) 
      1. (A VERY minor/secondary goal) The creation of an automated algorithm to design riboswitches, coding both the rules for constructs and sequences into it.

 

-- Brourd (talk) 22:51, 20 February 2014 (UTC)

 

----

 

Thanks Brourd,

The data I can find in round 70 on RMDB indicates that all of the EteRNA players' designs, and only those, were tested 4 times: with and without FMN, with 2 different chemical probes, 1M7 (SHAPE) and DMS. I see no trace of mimics in the dataset.

Round 71 (Cloud Lab round 3) was a repeat of Cloud Lab 1, so completely different sequences and constructs. This batch does include the students' switch constructs Rhiju is talking about, but those sequences weren't tested against the real FMN...

So, I still don't see any data that could speak about the effectiveness of the method for screening good FMN switches.

-- ElNando888 (talk) 03:32, 21 February 2014 (UTC)

 

----

 

So, in round 70, look at annotation data 3713 to 3872. The MAPseq ID field holds the design ID followed by -a to indicate that it is a mimic (a for alternate, maybe?).

Example

ANNOTATION_DATA:3868   modifier:DMS MAPseq:design_name:JG #1 MAPseq:project_name:Top Notch by jmf028 MAPseq:ID:2426211-a  

sequence:GGAAAUUUAAGCACAGAGGGCCUAUCUCGAAACGAGAAGGUCCUCACCAUCAAAAGAUGGAAGUGCAAGUUUACAUUCGUGUAAACAAAAGAAACAACAACAACAAC

structure:..........((((.((((((((.(((((...))))))))))))).(((((....)))))..))))..(((((((....))))))).....................

signal_to_noise:weak:0.325 MAPseq:tag:FAM-RTB003 chemical:FMN:200uM

 

The FMN mimic is in bold in the sequence.

*minor note* the Das lab never actually published the results in EteRNA, so maybe they thought the results were a bust? lol

Which is good news, since that means that this project's work-to-chance-of-failure ratio has just increased, what fun! :)

---Brourd (talk) 04:22, 21 February 2014 (UTC)

 

----

 

Ok, I think I finally found it, thanks for the hints.

The mimicking sequences are located in http://rmdb.stanford.edu/site_media/rdat_files/ETERNA_R70_0000/ETERNA_R70_0000.rdat and the annotation data span from 3713 to 3872, just as you indicated. Those 160 data points are for 40 sequences (2 designs were selected from each of the 20 labs), and those sequences underwent the same protocol as the others: tested with and without FMN, probed with 1M7 and DMS. Here is a note related to what I was saying earlier about waste of resources: these mimicking sequences didn't need to be tested with FMN, so 80 slots were "wasted"...

Unfortunately, the data associated with the mimics is only in RMDB, not in EteRNA, and it's a little hard to compare "by hand". Working out which annotations should be compared with which others is already some work in itself. Let's take an example:

Data points 3713 to 3716 are for the same mimicking sequence, and the one we want to use is 3713 (modifier: 1M7, FMN: 0uM). The field MAPseq:ID:2426173-a gives us the base design (2426173, without the "-a"). The data for that sequence is located at data points 1-4, and there we want to use number 2 (modifier: 1M7, FMN: 200uM).
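The matching described in this example could be made systematic along the following lines. This is only a sketch: the `Record` type and the `find_pairs` function are hypothetical names, the records are assumed to have already been extracted from the RDAT annotations, and the third sample record is made up purely to show that other conditions are ignored.

```python
from typing import NamedTuple


class Record(NamedTuple):
    index: int       # ANNOTATION_DATA index in the RDAT file
    mapseq_id: str   # value of MAPseq:ID, e.g. "2426173" or "2426173-a"
    modifier: str    # chemical probe, "1M7" or "DMS"
    fmn_um: int      # FMN concentration in uM (0 or 200)


def find_pairs(records):
    """Pair each mimic (1M7, no FMN) with its base design (1M7, 200 uM FMN).

    Returns a list of (mimic_index, base_index) tuples.
    """
    base_by_id = {
        r.mapseq_id: r
        for r in records
        if not r.mapseq_id.endswith("-a") and r.modifier == "1M7" and r.fmn_um == 200
    }
    pairs = []
    for r in records:
        if r.mapseq_id.endswith("-a") and r.modifier == "1M7" and r.fmn_um == 0:
            base = base_by_id.get(r.mapseq_id[:-2])  # strip the "-a" suffix
            if base is not None:
                pairs.append((r.index, base.index))
    return pairs


records = [
    Record(2, "2426173", "1M7", 200),     # base design with FMN (from the example)
    Record(3713, "2426173-a", "1M7", 0),  # mimic without FMN (from the example)
    Record(3714, "2426173-a", "DMS", 0),  # illustrative only, condition is made up
]
pairs = find_pairs(records)
print(pairs)
```

With the two records taken from the example, the sketch pairs data point 3713 with data point 2, which is the comparison described by hand.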

I'm trying to figure out how to be systematic with this, and haven't come up with much yet. But while doing this, I already found one comparison that seems to relate to what I was saying about false positives. Consider JMF's LaJ Solve http://eterna.cmu.edu/game/solution/2426174/2426905/seeresult/ From a visual inspection, it seems clear to me that the sequence folded into the unbound shape, and that the addition of FMN changed nothing; the molecule just didn't budge. Now, comparing data set 3793 with 1606 should convince you that the mimic was strong enough to actually change things...

-- ElNando888 (talk) 08:55, 21 February 2014 (UTC)

 

----

 

Well, who ever said the job of finding a mimic would be easy? :)

The potential for false positives can exist in three different contexts.

  1. Mutations to the binding site affect the fold of the secondary structure globally, preventing or allowing for the presence of suboptimal structures that differ from the WT sequence.
  2. The free energy contribution of the mimic is not equivalent to the free energy contribution of FMN at a 200uM concentration.
  3. Tertiary structure differs significantly between the way the mimic folds and the way the FMN-bound aptamer folds, potentially altering the global structure (and the resulting SHAPE signals).

 

With these factors in mind, our goals for this project remain the same

  • Development of a sequence that can potentially mimic the binding of a molecule in a multiple state RNA system.
  • Development of structures and sequences that allow for the successful design of riboswitches.
  • Development of a protocol to implement these features.

We could also add a new goal as well, if you wish

  • Determine if the use of a mimic is possible, and if it is not, write a detailed proposal for the Das lab that explains why they need to implement a pipeline for riboswitches in their high-throughput synthesis protocol.

---Brourd (talk) 15:36, 21 February 2014 (UTC)

 

----

 

Ok, I started this manually:

https://docs.google.com/spreadsheet/ccc?key=0AsEEBMO3fRaUdDJpbFF2azdMM3g2Y3ZnY0lEY21KNEE#gid=0

Probably gonna take me a while to put them all in there, but I think it will be worth the effort.

 

And agreed on all you just said.

-- ElNando888 (talk) 17:24, 21 February 2014 (UTC)

 

----

 

I started a google doc collecting the data. It contains three rows for each design:

  • The original design without FMN
  • The original design with 200uM FMN
  • The mimic design

The first sheet "Annotated," is just all of the original data. In the second sheet, "Reactivity Difference," I calculated the differences between:

  • The original design without FMN and the original design with 200uM FMN
  • The original design without FMN and the mimic
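For readers who want to reproduce this outside a spreadsheet, the two difference tracks could be computed roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the sign convention (other condition minus the no-FMN reference) is a guess, and the reactivity values are invented for illustration rather than taken from the RMDB data.

```python
def reactivity_difference(reference, other):
    """Per-base difference (other - reference) between two reactivity profiles."""
    if len(reference) != len(other):
        raise ValueError("profiles must have the same length")
    return [o - r for r, o in zip(reference, other)]


# Made-up reactivities for three bases, for illustration only.
no_fmn = [1.0, 0.5, 0.25]    # original design without FMN
with_fmn = [0.25, 0.5, 0.75]  # original design with 200 uM FMN
mimic = [0.5, 0.5, 1.0]       # mimic design

diff_fmn = reactivity_difference(no_fmn, with_fmn)
diff_mimic = reactivity_difference(no_fmn, mimic)
print(diff_fmn)
print(diff_mimic)
```

If the mimic behaves like bound FMN, the two difference tracks should have the same sign at the bases that are expected to switch.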

I started to add some graphs of the calculated differences. After generating a few of these, I thought it might be helpful to note where the designs were supposed to switch, so I attempted to do this on the first two. I defined the FMN-bound state as the mimic state, because it was easy for me to do, and I don't know the reactivity pattern for the FMN-bound section of RNA. If you think adding this to the graph is useful (or have an idea of a better way to visualize it), let me know. 

The doc is here:

https://docs.google.com/spreadsheet/ccc?key=0AppiCUq-Rq1tdHl5MUpRSmlscVVzMFJCVGNXb3hjbFE#gid=2

Meechl (talk) 03:25, 23 February 2014 (UTC)

 

----

 

Hi Meechl, and this is great news :)

You will need to fix the sharing of the document though; as of now, it seems to be private and only you can view the content.

-- ElNando888 (talk) 13:30, 23 February 2014 (UTC)

 

----

 

Nice Work Meechl!

--Brourd (talk) 18:52, 23 February 2014 (UTC)

 

----

 

So, I was working on adding the switch trends to the graphs when I noticed the data wasn't making sense... then I noticed I had mislabeled the rows on the sheet with the graphs. D:

I fixed the problem in this NEW document. It uses the new version of spreadsheets and has been working faster than the old page. I'll delete the sheet with the graphs on the original document eventually to avoid confusion. The sharing on the new document should be set so that anyone can view and comment. I can give you the ability to edit too if you want.

https://docs.google.com/spreadsheets/d/1K2Zp-75Im-34U78f0-zv-HG5kro1RsKDEzm6SLp8S7E/edit?usp=sharing

Meechl (talk) 03:20, 24 February 2014 (UTC)

 

----

 

== Recommended readings ==

To be honest, I'm not sure what to recommend here. Though I want to mention a publication that I find quite enlightening:

 

Giulio Quarta, Ken Sin and Tamar Schlick

Dynamic Energy Landscapes of Riboswitches Help Interpret Conformational Rearrangements and Function

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3280964/

PLoS Comput Biol. 2012 February; doi: 10.1371/journal.pcbi.1002368 (http://dx.doi.org/10.1371%2Fjournal.pcbi.1002368)

 

Granted, it's... quite a lot to gobble for the occasional RNA-toying amateur...

But if you manage to get past the biotechnical mumbo-jumbo, it gives a lot of insight into what's going on in cells, and how those riboswitches appear to work... I find the classification into kinetically-driven and thermodynamically-driven switches especially interesting. If I understand our pipeline correctly, we're probably trying to create switches of the latter class. But I wonder whether applying their methods, which would in theory ensure the creation of functional switches in vivo, would actually be the smart thing for us to do, since we would acquire knowledge and expertise that would be applicable not only in vitro.

Also, the computational methods are actually accessible (not a 5-minute job, but still), which means that I could contemplate the idea of applying these methods (with some adaptations) in my bot, for instance. A long-term plan though.

-- ElNando888 (talk) 18:35, 21 February 2014 (UTC)

 

----

 

Another suggestion would be to check out this page:

http://2011.igem.org/Team:Peking_R/Project/RNAToolkit2

This is not a scientific paper, and it's not even about FMN, but it illustrates how some scientists go about using what they call "1nt slipping mechanism" with a good number of non-canonical pairs to engineer a switch. I believe something similar could very well be used with FMN as well.

-- ElNando888 (talk) 00:24, 23 February 2014 (UTC)

 

----

 

I'll be adding a few pointers here, as I find them:

 

-- ElNando888 (talk) 08:13, 24 February 2014 (UTC)

 

----

 

Brourd and I already talked a few times about this, but I think it may be important to mention it here, so that the other members of the group are informed.

There is a potential issue with FMN that we seem to ignore at EteRNA. Some 17 years ago, a German team of scientists described a phenomenon called "photocleavage", which occurs when RNAs (with GU pairs inside stacks) are in the presence of FMN and... light. And the light doesn't need to be a strong source: a 20W halogen lamp at about 30 cm is enough to cause the reaction.

If I'm not mistaken, their first paper was http://pubs.acs.org/doi/abs/10.1021/ja962918p but as you can see, it is paywalled. Still, the supplementary information located at http://cdn-pubs.acs.org/doi/suppl/10.1021/ja962918p/suppl_file/ja1137.pdf provides quite a number of very interesting results.

Later, they published http://nar.oxfordjournals.org/content/25/20/4018.long which goes into much more detail about this reaction.

 

What does that mean for EteRNA, and of course, for us? To be honest, I have no idea, but considering the potential for disturbances that this reaction could cause, I would be glad if someone would ask the scientists what is done to take this factor into account, or to prevent it. Maybe the next lab meeting would be a good opportunity for that. Because if they're not doing anything about it, we'd probably better avoid GU pairs in our designs...

-- ElNando888 (talk) 18:36, 27 February 2014 (UTC)

 

----

 

== Targets ==

I've been thinking about targets and their design rules. Regarding the evaluation of sequences in the lab, I think it would be good to have the binding site "flipping" as much as possible.

 

                NAGGAUAU   AGAAGGN
FMN bound       xxxx.xxx   xxxxxxx
FMN not bound   xx...xxx   xx.xxxx
no FMN          ?????...   ?.....?

 

The idea is to be able to get a rough idea of how much of the bound shape occurred when no FMN had been added, as well as an estimate of how well the switch occurred when FMN was present.

Thoughts?

-- ElNando888 (talk) 02:03, 22 February 2014 (UTC)

 

----

 

Indeed, observing the SHAPE signals for these specific residues will probably be one of the more specific ways to determine if the binding site formed.

I think in that same forum post I linked above about the repeatability of riboswitch scoring, it was observed that in several of the cloud labs, the binding site residues in the first state were protected from the SHAPE probe when they should have been exposed. One explanation I gave was that the FMN-ready state was forming before FMN was introduced into the solution.

 

--Brourd (talk) 16:24, 22 February 2014 (UTC)

 

----

 

This is something I don't think was ever tested in EteRNA

Inverted FMN binding site.png

An inverted binding site, with AGAAGGN residing 5' of NAGGAUAU.

So far, I see no logical reasons why it wouldn't work just as well. Or if it doesn't, I'd like to know why...  Thoughts?

 

And while we're on this topic, the scientific papers I've read do not indicate a preference for the closing base pair. This closing UA pair could apparently be mutated to any canonical or GU wobble pair. Do we want to test that as well?

 

-- ElNando888 (talk) 12:22, 22 February 2014 (UTC)

 

----

 

So, part of the reason I am currently running http://eterna.cmu.edu/web/lab/3376174/ is due to this very theory. While I am not entirely convinced that this inverted binding site works, I am certainly willing to allow it to be tested if the SHAPE signals for the inverted loop are similar to those of the normal FMN aptamer when FMN is not present.

As for testing alternative closing base pairs, that could very well be done, and if I get word back on an experimental protocol that could potentially allow us to test the riboswitch constructs in solution with FMN, it would certainly be easy to do.

--Brourd (talk) 16:24, 22 February 2014 (UTC)

 

----

 

I've been looking at the puzzles I generated in my SCLT-NG series. So far, the one I'd rather have as a first target for the upcoming lab would be one of these:

  • SCLT-NG 10: looks fairly simple, yet seems like a possibly very interesting candidate, 5 clear switching segments... could also be shortened a bit (the non-switching neck area doesn't really need to stay like that)
  • SCLT-NG 11: a few switching segments, a long sliding stack... the near lack of switching bases in the 53-73 area is a bit unsatisfactory though.
  • SCLT-NG 13: looks very good
  • SCLT-NG 15: looks acceptable to me, 4 switching segments
  • SCLT-NG 21: so far my favorite, large spread of switching bases in all areas, numerous switching groups, a multiloop in the unbound structure, looks great to me
  • SCLT-NG 35: also good IMO, but the unpaired area 68-86 in the unbound structure looks a little scary...
  • SCLT-NG 39: if we're tempted by the feeling of "Mission Impossible", I'd say, that's the one to go with :D

 

I neglected to mention this one, but maybe I shouldn't:

  • SCLT-NG 4: essentially, 2 very large switching segments

This one is a little "special", as it is a sort of personal "subproject" in the field of riboswitches. Maybe we will have the opportunity to talk about it later, and even to test the idea. At this point, it seems less important than other goals we have in mind.

Nevertheless, this particular puzzle seems to present interesting qualities. Large switching areas that would be easy to score, even just visually, great simplicity...

 

-- ElNando888 (talk) 06:32, 10 March 2014 (UTC)

 

----

 

My suggestion would be to ask the players which switch they would like to solve, since any and all of these may make excellent switch mimic pilot candidates.

In addition, I have been thinking of expanding the number of mimic sequences to test from 4 to 6 or 7.

 

--Brourd (talk) 23:11, 10 March 2014 (UTC)

 

----

 

== Riboswitch Testing via RNA Arrays ==

 

I am currently working on bringing a new method for riboswitch analysis into this project, using a method that Dr. Johan Andreasson describes in the January 9th Monthly EteRNA Meeting with the Das Lab.

If we are successful with this, we should have the ability to independently assess the effectiveness of riboswitches, ligand mimics, and potential mutations to the structure or sequence of the FMN binding site, as well as the ability of SHAPE to be used in the determination of successful riboswitches.

 

--Brourd (talk) 18:52, 23 February 2014 (UTC)

 

----

 

== Questions about the project ==

 

Switch work in the lab

Maybe a dumb question:

"making a round with exclusively switching sequences is basically twice the work for the lab..."

Why is that? Do both states have to be made and then run for SHAPE verification, or do they need a chemical trigger to switch, or why is this exactly twice the work?

Salish99 (talk) 19:59, 24 February 2014 (UTC)

 

----

 

Answer: In order for the Das lab to probe a sequence under two conditions (with FMN and without FMN, in the case of the riboswitch labs), they need two different sets of sequences. This means that they need to run the experimental protocol on both sets, meaning double the work, double the concentration, and double the tracking. Essentially, it is not quite as easy a task for their experimental team as a single set of sequences is.

--Brourd (talk) 22:05, 24 February 2014 (UTC)

 

----

 

== Switch scoring in the context of mimics ==

Mimic pilot scoring 1.png Mimic pilot scoring 2.png

The scoring model I envision for the newly proposed lab would be as follows:

Nomenclature:

  • shape_U = SHAPE value in the unbound structure
  • shape_B = same for the bound structure
  • T = threshold (standardized to 0.5 now)

There are 21 switching bases in this puzzle.

For the bases going from unpaired to paired (example base 6)

  • 2 points, if shape_U > T > shape_B
  • 1 point, if shape_U > shape_B

For the bases going from paired to unpaired (example base 24), the reverse

  • 2 points, if shape_U < T < shape_B
  • 1 point, if shape_U < shape_B

Finally, because we don't know yet how the mimic will behave SHAPE-wise, we have to exclude the binding site bases from the scoring, but I would argue that I see at least 2 bases where we could do something. So, for bases 17 & 60

  • 1 point, if shape_U > T

The rationale behind this last scoring rule is that if the molecule tends to form the bound structure even in the absence of ligand, then these bases would tend to show protection. So, rewarding the fact that they are generally reactive in the absence of FMN seems to be a good thing to do.

 

Total: 44 possible points (21 switching bases x 2 points, plus 2 binding site bases x 1 point), which would be scaled up to 100.
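As a sanity check, the rules above can be written down as a short function. This is only a sketch: the `kind` labels and function names are hypothetical, and the SHAPE values in the demo are invented, not measured.

```python
THRESHOLD = 0.5  # T in the nomenclature above


def score_base(shape_u, shape_b, kind, t=THRESHOLD):
    """Points for one base.

    kind is a hypothetical label:
      "unpaired_to_paired": unpaired when unbound, paired when bound
      "paired_to_unpaired": the reverse
      "binding_site": binding site base, scored on the unbound state only
    """
    if kind == "unpaired_to_paired":
        if shape_u > t > shape_b:
            return 2
        return 1 if shape_u > shape_b else 0
    if kind == "paired_to_unpaired":
        if shape_u < t < shape_b:
            return 2
        return 1 if shape_u < shape_b else 0
    if kind == "binding_site":
        return 1 if shape_u > t else 0
    raise ValueError(f"unknown base kind: {kind}")


def max_points(kinds):
    """Best possible total: 2 per switching base, 1 per binding site base."""
    return sum(1 if k == "binding_site" else 2 for k in kinds)


def score_design(bases):
    """bases: list of (shape_u, shape_b, kind). Returns the score scaled to 100."""
    earned = sum(score_base(u, b, k) for u, b, k in bases)
    possible = max_points(k for _, _, k in bases)
    return 100.0 * earned / possible


# Invented values for four bases, for illustration only.
demo = [
    (0.8, 0.2, "unpaired_to_paired"),  # crosses the threshold: 2 points
    (0.4, 0.3, "unpaired_to_paired"),  # right direction only: 1 point
    (0.1, 0.9, "paired_to_unpaired"),  # crosses the threshold: 2 points
    (0.7, 0.0, "binding_site"),        # reactive when unbound: 1 point
]
demo_score = score_design(demo)
print(round(demo_score, 2))
```

With 21 switching bases and 2 binding site bases, `max_points` gives the 44 points mentioned above, whatever the split between the two switching directions.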

Thoughts?

-- ElNando888 (talk) 11:57, 9 March 2014 (UTC)

 

----

 

Trying to see if I've correctly understood the things Brourd and I talked about online.

  • ShU = SHAPE value in the unbound structure
  • ShB = same for the bound structure
  • ThP = threshold under which a SHAPE value is considered protected (0.25?)
  • ThR = threshold over which a SHAPE value is regarded as reactive (0.5?)

 

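Scoring table, with the unbound state across (Unbound →) and the bound state down (Bound ↓). Each state is one of: Paired (not closing), Paired (closing), Unpaired (mismatch), Unpaired (not mismatch). The filled cells are:

  • Bound: Paired (not closing), Unbound: Paired (not closing)
    if (ShB < ThP) then MinReward
    if (ShU < ThP) then MinReward
  • Bound: Paired (not closing), Unbound: Unpaired (not mismatch)
    if (ShU > ThR) && (ShB < ThP) then MaxReward
    else if (ShU - ShB > ThR - ThP) then ReducedReward
  • Bound: Unpaired (not mismatch), Unbound: Paired (not closing)
    if (ShB > ThR) && (ShU < ThP) then MaxReward
    else if (ShB - ShU > ThR - ThP) then ReducedReward
  • Bound: Unpaired (not mismatch), Unbound: Unpaired (not mismatch)
    if (ShB > ThR) then MinReward
    if (ShU > ThR) then MinReward

All other cells, that is, every combination involving Paired (closing) or Unpaired (mismatch), are empty.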

 

I'd propose to make:

  • ReducedReward = 3 x MinReward
  • MaxReward = 5 x MinReward

 

This first table would apply to all bases, excluding the binding site. Another table would be needed for the bases in the binding site itself.
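The filled-in cells of the table can be sketched in code as follows. Several things here are assumptions: the threshold values are the tentative ones from the nomenclature (0.25 and 0.5), MinReward is set to an arbitrary unit of 1, the two independent "if" lines in the non-switching cells are read as additive, and the function name `reward` is hypothetical. Closing pairs and mismatches are deliberately not handled, since their cells are still empty.

```python
TH_P = 0.25  # ThP, tentative "protected" threshold
TH_R = 0.5   # ThR, tentative "reactive" threshold

MIN_REWARD = 1                    # arbitrary unit (assumption)
REDUCED_REWARD = 3 * MIN_REWARD
MAX_REWARD = 5 * MIN_REWARD


def reward(sh_u, sh_b, unbound, bound):
    """Reward for one base outside the binding site.

    unbound / bound: "paired" (not closing) or "unpaired" (not mismatch),
    the expected state of the base in each target structure.
    """
    if unbound == "paired" and bound == "paired":
        # Non-switching paired base: one MinReward per protected measurement.
        return MIN_REWARD * (int(sh_b < TH_P) + int(sh_u < TH_P))
    if unbound == "unpaired" and bound == "unpaired":
        # Non-switching unpaired base: one MinReward per reactive measurement.
        return MIN_REWARD * (int(sh_b > TH_R) + int(sh_u > TH_R))
    if unbound == "unpaired" and bound == "paired":
        high, low = sh_u, sh_b  # should be reactive unbound, protected bound
    elif unbound == "paired" and bound == "unpaired":
        high, low = sh_b, sh_u  # should be protected unbound, reactive bound
    else:
        raise ValueError("states must be 'paired' or 'unpaired'")
    if high > TH_R and low < TH_P:
        return MAX_REWARD
    if high - low > TH_R - TH_P:
        return REDUCED_REWARD
    return 0


print(reward(0.8, 0.1, "unpaired", "paired"))     # clean switch
print(reward(0.75, 0.375, "unpaired", "paired"))  # partial switch
```

Keeping the thresholds as module-level constants makes it easy to rerun the scoring once the values of ThP and ThR are settled.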

 

Open points/questions:

  • what would you suggest as formulas in the empty cells above?
  • how do we deal with 1 unbound + 4 bound mimics?

 

-- ElNando888 (talk) 07:04, 10 March 2014 (UTC)

 

----

So then, the threshold under which a residue is considered protected is 0.5.

 

The threshold over which a residue is considered reactive is .25

As for the empty cells of the scoring protocol...

The SHAPE signal of both mismatches and closing base pairs can vary wildly, based on the sequence. Therefore, we will need special rules that exempt them from our normal scoring terms, to prevent the potential loss of successful switch designs.

What these special rules are, I don't know (sorry!). However, the continued observation of multistate RNA sequences should allow us to expand our knowledge of where SHAPE signals are likely to vary.

 

--Brourd (talk) 23:11, 10 March 2014 (UTC)

 

----