User:Dennis9600: Difference between revisions

From Eterna Wiki

No edit summary
No edit summary
Line 1: Line 1:
<p><strong><img title="Smile" src="/wiki/extensions/TinyMCE_MW/jscripts/tiny_mce/plugins/emotions/img/smiley-smile.gif" border="0" alt="Smile" /><span style="font-size: medium;">The lab entitled "A Codon Riboswitch" has multiple sub-labs, sub-projects (or whatever the correct terminology is).&nbsp; My analysis script now navigates the hierarchy and generates outputs for all of them.&nbsp;&nbsp; I think so, anyway. <img title="Wink" src="/wiki/extensions/TinyMCE_MW/jscripts/tiny_mce/plugins/emotions/img/smiley-wink.gif" border="0" alt="Wink" /></span></strong></p>
<p>&nbsp;</p>
<p><strong><span style="font-size: medium;">I have updated the statistical forecasting tool which predicts synthesis scores of lab designs in the active labs based on a factors analysis of all past labs.&nbsp; A new factor, ensemble diversity, has been added and the weights of the other factors have changed a little.&nbsp; The predictive power of the tool as measured by correlation with past syn scores and standard deviation of the prediction error has improved.&nbsp; (There is still quite a bit of room for improvement though. Some of the worst outliers are laughably wrong.)</span></strong></p>
<p>&nbsp;</p>
<p><img title="Smile" src="/wiki/extensions/TinyMCE_MW/jscripts/tiny_mce/plugins/emotions/img/smiley-smile.gif" border="0" alt="Smile" /><span style="font-size: medium;">The lab entitled "A Codon Riboswitch" has multiple sub-labs, sub-projects (or whatever the correct terminology is).&nbsp; My analysis script now navigates the hierarchy and generates outputs for all of them.&nbsp;&nbsp; I think so, anyway. <img title="Wink" src="/wiki/extensions/TinyMCE_MW/jscripts/tiny_mce/plugins/emotions/img/smiley-wink.gif" border="0" alt="Wink" /></span></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><span style="font-size: medium;">I have&nbsp; added Vienna 2.1.1 dot plots to my publications.&nbsp; I wanted to incorporate them into the report files, but it was much more convenient to just make a separate file for each submitted design</span>.<span style="font-size: medium;">&nbsp; Giving each one a unique file name that was acceptable to Python, the Windows operating system, Ghostscript, and Google Drive required a little name mangling</span>.&nbsp; <span style="font-size: medium;">There are a huge number of files this time.&nbsp; I haven't had time to look at everything I uploaded to the ReportsActiveLabs folder.&nbsp; Please PM me if you spot any problems.</span></p>
<p><span style="font-size: medium;">I have&nbsp; added Vienna 2.1.1 dot plots to my publications.&nbsp; I wanted to incorporate them into the report files, but it was much more convenient to just make a separate file for each submitted design</span>.<span style="font-size: medium;">&nbsp; Giving each one a unique file name that was acceptable to Python, the Windows operating system, Ghostscript, and Google Drive required a little name mangling</span>.&nbsp; <span style="font-size: medium;">There are a huge number of files this time.&nbsp; I haven't had time to look at everything I uploaded to the ReportsActiveLabs folder.&nbsp; Please PM me if you spot any problems.</span></p>
Line 6: Line 9:
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><span style="font-size: small;"><strong>In this edition of my analysis reports and spreadsheets (new items are bolded):</strong></span></p>
<p><span style="font-size: small;"><strong>In this edition of my analysis reports and spreadsheets (new items are bolded):</strong></span></p>
<p><span style="font-size: small;"><strong>&nbsp; 1) Every lab has its own subfolder now</strong></span></p>
<p><span style="font-size: small;">&nbsp; 1) Every lab has its own subfolder now</span></p>
<p><span style="font-size: small;"><strong>&nbsp; 2) Vienna 2.1.1 dot plots for each submitted design!!!<br /></strong></span></p>
<p><span style="font-size: small;">&nbsp; 2) Vienna 2.1.1 dot plots for each submitted design!!!<strong><br /></strong></span></p>
<p><span style="font-size: small;">&nbsp; 3) If the target structure contains locked, non-Canonical pairs, the Vienna tools are called with a --nsp option that allows the pairing and assigns an energy of 0 to it.&nbsp; Zero may not</span></p>
<p><span style="font-size: small;">&nbsp; 3) If the target structure contains locked, non-Canonical pairs, the Vienna tools are called with a --nsp option that allows the pairing and assigns an energy of 0 to it.&nbsp; Zero may not</span></p>
<p><span style="font-size: small;">be the best value to use, but it seems to be better than not doing anything.<br /></span></p>
<p><span style="font-size: small;">be the best value to use, but it seems to be better than not doing anything.&nbsp; Please PM me if you find a better way to treat non-canonical pairs...<br /></span></p>
<p><span style="font-size: small;">&nbsp; 4) Some of the labs now have a starting sequence of 'GG' instead of 'GGAAA'.&nbsp; It looks like the devs have left things open for additional "tails" to be used in future labs.&nbsp; For now, my script</span></p>
<p><span style="font-size: small;">&nbsp; 4) Some of the labs now have a starting sequence of 'GG' instead of 'GGAAA'.&nbsp; It looks like the devs have left things open for additional "tails" to be used in future labs.&nbsp; For now, my script</span></p>
<p><span style="font-size: small;">recognizes both of the sequences that have appeared and uses the correct one for each lab.<strong><br /></strong></span></p>
<p><span style="font-size: small;">recognizes both of the sequences that have appeared and uses the correct one for each lab.<strong><br /></strong></span></p>
<p><span style="font-size: small;">&nbsp; 5)&nbsp; There is a field for the Vienna 2.1.1 melt point of each design submitted.&nbsp; This new field now appears in both the text reports and the spreadsheets.<br /></span></p>
<p><span style="font-size: small;">&nbsp; 5)&nbsp; There is a field for the Vienna 2.1.1 melt point of each design submitted.&nbsp; This new field now appears in both the text reports and the spreadsheets.<br /></span></p>
<p><span style="font-size: small;">My lab tool includes a forecasting tool that looks at factors that have correlated well with past synthesis scores.&nbsp; The current version of the forecasting tool looks at the following factors:</span></p>
<p>&nbsp;</p>
<p><span style="font-size: small;">6) My lab tool includes a forecasting tool that looks at factors that have correlated well with past synthesis scores.&nbsp; The current version of the forecasting tool looks at the following factors:</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; a) Whether or not the design folded correctly in EteRNA's energy model (Vienna 1.8.5)</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; a) Whether or not the design folded correctly in EteRNA's energy model (Vienna 1.8.5)</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; b) Whether or not the design folded correctly in the Vienna 2.1.1 Energy Model</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; b) Whether or not the design folded correctly in the Vienna 2.1.1 Energy Model</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; c) The melt point of the design (as reported by the EteRNA server)</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; c) The frequency of the design in the Vienna 2.1.1 MFE ensemble.</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; d) The percentages of C,U, and G in the design, and how far they differ from 13, 10, and 21% respectively (the Berex Strategy).</span></p>
<p><span style="font-size: small;"><strong>&nbsp;&nbsp;&nbsp; d) The diversity of the MFE ensemble. &nbsp; This factor is given the highest weight.</strong><br /></span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; e) log<sub>10</sub>(designer's EteRNA points).&nbsp; This factor is given a very low weight.<br /></span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; e) The melt point of the design (as reported by the EteRNA server)</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; f) The percentages of C,U, and G in the design, and how far they differ from 13, 10, and 21% respectively (the Berex Strategy).</span></p>
<p><span style="font-size: small;">&nbsp;&nbsp;&nbsp; g) log<sub>10</sub>(designer's EteRNA points).&nbsp; This factor is given a very low weight.<br /></span></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><span style="font-size: small;"><strong>Things I have looked at:</strong><br /></span></p>
<p><span style="font-size: small;"><strong>Things I have looked at:</strong><br /></span></p>
Line 27: Line 33:
<p><span style="font-size: large;">1. I am (still) thinking about how to add Vienna 2.1.1 melt curves&nbsp; to these reports.&nbsp;&nbsp;</span></p>
<p><span style="font-size: large;">1. I am (still) thinking about how to add Vienna 2.1.1 melt curves&nbsp; to these reports.&nbsp;&nbsp;</span></p>
<p><span style="font-size: large;">2. Investigate what changes EteRNA made to Vienna 1.8.5 dot plots.</span></p>
<p><span style="font-size: large;">2. Investigate what changes EteRNA made to Vienna 1.8.5 dot plots.</span></p>
<p><span style="font-size: large;">2. As of 11/10/2013, about three new rounds of synthesis results are available.&nbsp; I need to run the training portion of my forecasting tool on the new data.</span></p>
<p><span style="font-size: large;">3. Look at "upgrading" to Vienna 2.1.3 (from Vienna 2.1.1) in my toolset.<br /></span></p>
<p><span style="font-size: large;">3. Investigate Vienna's "ensemble diversity measure" as another possible factor for the forecasting tool to consider.&nbsp; I'm capturing the diversity measure now,</span></p>
<p><span style="font-size: large;">but not doing anything with it yet.<br /></span></p>
<p><span style="font-size: large;">4. Look for published energy parameters for non-Canonical pairs in RNA sequences.&nbsp; There's got to be something better than 0 out there.</span></p>
<p><span style="font-size: large;">5. Look at "upgrading" to Vienna 2.1.3 (from Vienna 2.1.1) in my toolset.<br /></span></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><span style="font-size: large;">Please let me know if you have any other things you would like to see in my lab reports.</span></p>
<p><span style="font-size: large;">Please let me know if you have any other things you would like to see in my lab reports.</span></p>

Revision as of 01:19, 1 December 2013

 

I have updated the statistical forecasting tool, which predicts synthesis scores of lab designs in the active labs based on an analysis of factors from all past labs.  A new factor, ensemble diversity, has been added, and the weights of the other factors have changed a little.  The predictive power of the tool, as measured by correlation with past synthesis scores and by the standard deviation of the prediction error, has improved.  (There is still quite a bit of room for improvement, though.  Some of the worst outliers are laughably wrong.)
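
The two quality measures named above can be sketched in a few lines of Python. This is only an illustration, not the code of the forecasting tool itself: the function name `forecast_quality` is made up, and it assumes the predicted and actual synthesis scores are available as two equal-length lists.

```python
import math
import statistics


def forecast_quality(predicted, actual):
    """Return (Pearson correlation, sample std dev of the prediction error)."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_p = statistics.fmean(predicted)
    mean_a = statistics.fmean(actual)
    # Unnormalised covariance and variances; the 1/(n-1) factors cancel in r.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    r = cov / math.sqrt(var_p * var_a)
    # Error is defined here as predicted minus actual (an assumption).
    errors = [p - a for p, a in zip(predicted, actual)]
    return r, statistics.stdev(errors)
```

A higher correlation together with a smaller error spread would correspond to the improvement described above. Looking at the largest entries of `errors` is one way to find the worst outliers.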

 

:) The lab entitled "A Codon Riboswitch" has multiple sub-labs, sub-projects (or whatever the correct terminology is).  My analysis script now navigates the hierarchy and generates outputs for all of them.  I think so, anyway. ;)

 

I have added Vienna 2.1.1 dot plots to my publications.  I wanted to incorporate them into the report files, but it was much more convenient to just make a separate file for each submitted design.  Giving each one a unique file name that was acceptable to Python, the Windows operating system, Ghostscript, and Google Drive required a little name mangling.  There are a huge number of files this time.  I haven't had time to look at everything I uploaded to the ReportsActiveLabs folder.  Please PM me if you spot any problems.
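
As an illustration of the kind of name mangling described above, here is a minimal sketch. It is not the scheme the script actually uses, which is not spelled out here. The function name `mangle_name`, the `design_id` parameter, the default `.pdf` extension, and the length limit are all assumptions; the idea is simply to keep a conservative set of characters that Windows and most tools accept, and to append an ID so names stay unique.

```python
import re

# Device names that Windows reserves regardless of extension.
_WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    f"{dev}{n}" for dev in ("COM", "LPT") for n in range(1, 10)
}


def mangle_name(title, design_id, ext=".pdf", max_len=100):
    """Build a conservative, unique file name from a design title."""
    # Keep only ASCII letters, digits, '-' and '_'; collapse everything else
    # (spaces, quotes, colons, slashes, ...) into a single underscore.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", title).strip("_")
    if not stem or stem.upper() in _WINDOWS_RESERVED:
        stem = "design_" + stem
    # The ID keeps names unique even when two titles mangle to the same stem.
    suffix = f"_{design_id}{ext}"
    return stem[: max_len - len(suffix)] + suffix
```

For example, a title such as `My "best" design: v2?` with the hypothetical ID 12345 would become `My_best_design_v2_12345.pdf`.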

 

Analysis Reports and Spreadsheets produced by my Python script can be found in <a href="https://drive.google.com/folderview?id=0B-rDnoMjFSH2cDFibWt6T19nRzg&usp=sharing">ReportsActiveLabs</a>.  You can bookmark this folder in your browser.  I have decided to stick with one folder rather than change with each publication cycle. 

 

In this edition of my analysis reports and spreadsheets (new items are bolded):

  1) Every lab has its own subfolder now

  2) Vienna 2.1.1 dot plots for each submitted design!!!

  3) If the target structure contains locked, non-canonical pairs, the Vienna tools are called with a --nsp option that allows the pairing and assigns an energy of 0 to it.  Zero may not be the best value to use, but it seems to be better than not doing anything.  Please PM me if you find a better way to treat non-canonical pairs.

  4) Some of the labs now have a starting sequence of 'GG' instead of 'GGAAA'.  It looks like the devs have left things open for additional "tails" to be used in future labs.  For now, my script recognizes both of the sequences that have appeared and uses the correct one for each lab.

  5)  There is a field for the Vienna 2.1.1 melt point of each design submitted.  This new field now appears in both the text reports and the spreadsheets.

 

6) My lab tool includes a forecasting tool that looks at factors that have correlated well with past synthesis scores.  The current version of the forecasting tool looks at the following factors:

    a) Whether or not the design folded correctly in EteRNA's energy model (Vienna 1.8.5)

    b) Whether or not the design folded correctly in the Vienna 2.1.1 Energy Model

    c) The frequency of the design in the Vienna 2.1.1 MFE ensemble.

    d) The diversity of the MFE ensemble.   This factor is given the highest weight.

    e) The melt point of the design (as reported by the EteRNA server)

    f) The percentages of C, U, and G in the design, and how far they differ from 13%, 10%, and 21%, respectively (the Berex Strategy).

    g) log10(designer's EteRNA points).  This factor is given a very low weight.
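
Factors f) and g) in the list above are simple enough to sketch directly. The snippet below is only an illustration under assumptions, not the forecasting tool's code: the function names are invented, the deviation is reported as a signed difference per base, and how the tool combines or weights these numbers is not shown because it is not stated here.

```python
import math

# Target percentages quoted above for the Berex Strategy.
BEREX_TARGETS = {"C": 13.0, "U": 10.0, "G": 21.0}


def berex_deviation(sequence):
    """Return per-base (actual % minus target %) for C, U and G."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {
        base: 100.0 * seq.count(base) / len(seq) - target
        for base, target in BEREX_TARGETS.items()
    }


def points_factor(points):
    """log10 of the designer's points; 0.0 if the designer has none yet."""
    # Treating zero points as 0.0 is an assumption to avoid log10(0).
    return math.log10(points) if points > 0 else 0.0
```

A design with 20% each of C, U, and G would come out at +7, +10, and -1 percentage points, respectively.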

 

Things I have looked at:

I have also looked at the temperature setting of the energy model as a possible factor for my forecasting tool.  I was surprised to find that the default setting of 37C is the best setting for my forecasting tool.
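
For anyone who wants to try other temperatures themselves, the sketch below assembles an RNAfold command line. It is a hedged illustration, not the way the script actually invokes the Vienna tools. The helper name `rnafold_command` and its parameters are made up. The options used (`--temp`, `-p`, `--nsp`) are RNAfold options as I understand them from the ViennaRNA documentation, and the exact `--nsp` argument syntax should be checked against the manual for the installed version.

```python
def rnafold_command(temperature_c=37.0, extra_pairs=(), partition_function=True):
    """Assemble an RNAfold argument list (the sequence is fed on stdin)."""
    cmd = ["RNAfold", f"--temp={temperature_c:g}"]
    if partition_function:
        # -p also computes the partition function, which is what produces the
        # dot plot, the MFE frequency, and the ensemble diversity.
        cmd.append("-p")
    if extra_pairs:
        # Assumed syntax: a leading '-' allows each listed pair in both
        # orientations, e.g. "-GA" permits GA and AG.
        cmd.append("--nsp=-" + ",".join(extra_pairs))
    return cmd
```

The resulting list could be passed to `subprocess.run(cmd, input=sequence, text=True, capture_output=True)`, looping over a range of `temperature_c` values to compare settings.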

 

 

Things to do:

1. I am (still) thinking about how to add Vienna 2.1.1 melt curves  to these reports.  

2. Investigate what changes EteRNA made to Vienna 1.8.5 dot plots.

3. Look at "upgrading" to Vienna 2.1.3 (from Vienna 2.1.1) in my toolset.

 

Please let me know if you have any other things you would like to see in my lab reports.

 

Happy folding,

---Dennis9600