User:Omei/Cloud Lab Data Mining Tool: Difference between revisions

From Eterna Wiki

Revision as of 21:15, 18 July 2013

Cloud Lab Data Mining Tool

At this point, the biggest need I feel in EteRNA is for a way to extract meaning from the results of thousands of lab designs that have been synthesized.  Ideally, something addressing this need will be built into the EteRNA GUI.  But I decided to just see what I could do to contribute ideas and a sample implementation.

 

I envision three major parts.

  1. A method for finding the labs that are relevant to a particular question.  As a starting point, this might take the form of searching by sequence or structure motif.  Existing examples are <a href="http://cossmos.slu.edu/search.php">CoSSMoS</a> and <a href="http://rmdb.stanford.edu/repository/advanced_search/">RMDB</a>.
  2. A method of interactively viewing the SHAPE results for all the synthesized designs in a single lab.  This is the part I am actively working on.
  3. A method for integrating the results of the previous two steps.  For example, after finding a relevant lab (step 1) and developing a hypothesis based on it (step 2), it would be nice to be able to gather up all "analogous" data from any other relevant labs, to see if the hypothesis is consistent with them.  This is the step I am least clear about.
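
To make the motif search in part 1 a little more concrete, here is a minimal sketch in JavaScript.  It treats a motif as a plain substring of either the sequence or the dot-bracket structure, which is a deliberate simplification.  The record fields (`name`, `sequence`, `structure`), the function names, and the two sample designs are made up for illustration; they are not taken from EteRNA, CoSSMoS, or RMDB.

```javascript
// Return every start index at which `motif` occurs in `text`
// (overlapping occurrences are included).
function motifPositions(text, motif) {
  var positions = [];
  if (motif.length === 0) {
    return positions;
  }
  var i = text.indexOf(motif);
  while (i !== -1) {
    positions.push(i);
    i = text.indexOf(motif, i + 1);
  }
  return positions;
}

// Search a list of design records for a motif in the given field
// ("sequence" or "structure"). Returns one entry per matching design.
function searchDesigns(designs, field, motif) {
  var hits = [];
  designs.forEach(function (design) {
    var positions = motifPositions(design[field], motif);
    if (positions.length > 0) {
      hits.push({ name: design.name, positions: positions });
    }
  });
  return hits;
}

// Illustrative records only; these are not real lab designs.
var sampleDesigns = [
  { name: "design A", sequence: "GGGAAACCC",    structure: "(((...)))" },
  { name: "design B", sequence: "GCGCUUCGGCGC", structure: "((((....))))" }
];

// Designs containing the sequence motif GAAA.
var sequenceHits = searchDesigns(sampleDesigns, "sequence", "GAAA");

// Designs containing a closing pair around a three-base loop.
var structureHits = searchDesigns(sampleDesigns, "structure", "(...)");
```

A real search would need to be more flexible than literal substrings (for example, wildcards in the sequence, or loops of variable length), but the shape of the result, a list of designs plus the positions where the motif occurs, is probably what step 3 would want as input.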

As for my approach to step 2, I recently wrote up a short <a href="https://docs.google.com/document/d/1rFJSsYaCn1ZP1DnZ8fDGUiUdtt2FUV9sTrSiS4qudck/edit">preview article</a>.  If you have any comments, you should be able to make them in that document.

The tool is written in JavaScript.  Because it needs to make RESTful queries to the EteRNA domain, I'm currently using Greasemonkey to get around the browser's cross-origin restrictions.  But this is becoming awkward as I make beta versions available to other players.  I'm collecting ideas on what to do about this (see User:Omei/Making Cloud Lab Data Accessible).
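
For anyone unfamiliar with the Greasemonkey approach, the following is a minimal sketch of the general pattern, not the tool's actual code.  A userscript that is granted `GM_xmlhttpRequest` can make requests that an ordinary page script would be blocked from making.  The `@include` pattern, the `fetchJson` helper, and the query URL are all placeholders I made up for illustration; in particular, the URL path is not a real EteRNA endpoint.

```javascript
// ==UserScript==
// @name     Cloud Lab query sketch
// @include  http://localhost/*
// @grant    GM_xmlhttpRequest
// ==/UserScript==

// Fetch a URL and hand the parsed JSON to onData. The request function is
// passed in, so the same code works with GM_xmlhttpRequest or a test stub.
function fetchJson(url, request, onData, onError) {
  request({
    method: "GET",
    url: url,
    onload: function (response) {
      var data;
      try {
        data = JSON.parse(response.responseText);
      } catch (err) {
        onError(err);
        return;
      }
      onData(data);
    },
    onerror: function () {
      onError(new Error("Request failed: " + url));
    }
  });
}

// This part only runs inside Greasemonkey, where GM_xmlhttpRequest exists.
if (typeof GM_xmlhttpRequest !== "undefined") {
  // Hypothetical placeholder; substitute a real query URL.
  var QUERY_URL = "http://eterna.cmu.edu/HYPOTHETICAL-QUERY-PATH";
  fetchJson(QUERY_URL, GM_xmlhttpRequest,
    function (data) { console.log("Received", data); },
    function (err) { console.log("Query failed", err); });
}
```

Passing the request function in as a parameter keeps the query logic independent of Greasemonkey, which should make it easier to swap in a different transport if a better way of reaching the data turns up.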


In the meantime, if there are any users or developers who know how to set up and use Greasemonkey without a lot of support on my part, I would be happy to share a snapshot of my current work in progress.  Just PM <a href="http://eterna.cmu.edu/web/player/57675/">me</a>, or download the most recent version of the Database Mining Tool .zip file from <a href="https://drive.google.com/folderview?id=0Bzf0qUriSfzWUmlvTU5mUTl0Rlk&usp=sharing">https://drive.google.com/folderview?id=0Bzf0qUriSfzWUmlvTU5mUTl0Rlk&usp=sharing</a>.