User talk:Omei/Cloud Lab Data Mining Tool

From Eterna Wiki


== Volunteer ==

Hi Omei,

I think this is indeed a serious difficulty in the current Cloud Lab, and I salute your effort. I'd like to offer my help. There isn't much I can do at the code level (I'm a C/C++ developer, not a web/javascript guru), but I'm known in my professional circle for being quite talented at finding bugs, even in source code I'm unfamiliar with. And if nothing else, I'd love to beta-test, since there have been a few times when I needed a question answered and had to manually browse the labs in search of relevant SHAPE data.

Just let me know how I can be of assistance :)

-- ElNando888 (talk) 06:02, 4 July 2013 (UTC)

PS: I'm going to make minor edits to the page, not changing anything, just linking things a bit. Just revert the edit if you have objections.

----

No objections at all to you enhancing the page.  Thx.  As to skill sets, I'm not a web/javascript guru either. As an independent software developer, I've worked in a lot of different environments, but embedded systems work in C/C++ represents a plurality, if not a majority, of my work. If you want to install the Greasemonkey plugin (easy, even if you haven't had experience with it), I'll send you the code any time you want.

Omei (talk) 17:53, 4 July 2013 (UTC)

----

Just send it any time you want, and don't worry about me, I'll deal with GreaseMonkey (and any other dependencies) without bothering you more than necessary :)

-- ElNando888 (talk) 09:54, 6 July 2013 (UTC)

----

Hi Omei,

I'm not sure if it would make any sense or help in any way, but have you considered the possibility of integrating part of your code in EteRNA directly? In other words, I'm asking if you are aware that the dev team have opened a door for third party code, at https://github.com/EteRNAgame/EteRNA-Script-Interface ? There are a few examples of pieces of code at https://github.com/EteRNAgame/EteRNA-Script-Interface/tree/master/Eterna/library . As you can see, a few players submitted some code. The ViennaRNA ports to javascript are from me, and Justin also created the LibNando.js with a few silly snippets that I wrote earlier for the scripting interface. What I mean to say is that you could have a LibOmei.js there too. At first, this code would only be available on the development server (eternadev.org), but if enough players "push" to have it on the production server, it could be available for all.

Well, I was just thinking that it may solve (granted, in the long run) the cross-domain problem a little more elegantly than using GreaseMonkey...

-- ElNando888 (talk) 07:30, 14 July 2013 (UTC)

----

I'm completely open to having the code integrated with Eterna.  I didn't want to put off programming until I had all the answers to procedural questions, so I just got to work in an environment I knew I could be productive in.  I figured that once I had something worthwhile to show Jee, we could discuss how it might best be made more widely available.

I'm not clear on the relationship between the git repository and the development server.  Is code deposited into git automatically accessible to scripts running on the development server?  Or does someone manually vet them and then copy them to the server?  And is there a provision for user-provided html, or only javascript libraries? 

Is there a mechanism already in place where 

  1. I could easily update .html and .js files, and
  2. Users could load the html in their own browser window?  (Even with full support for HTML, I don't think the little output frame in the current scripting interface would make a satisfactory platform for the kind of report I envision.)

If so, I would probably start doing that now.  My main concern is not getting too many users too quickly, because I don't want to end up spending all my time on documentation and support instead of development.  Because of that, having it only on the development server for now would be a plus, not a negative.

Omei (talk) 20:52, 14 July 2013 (UTC)

----

I can only answer a few of those questions. The git repository is not directly connected to the development server. If I recall correctly, Justin worked on a mechanism so that people (like me) can clone this repository and run the code locally on their machine. Once I thought I had something valuable, I would send a pull request, and Justin would validate the contribution, first for the repository (so that all devs are in sync), then he would eventually integrate it on the development server.

For .js code, the convention is to have files in that library folder. Smallish snippets grouped in your Lib<Yourname>.js file, and other things (like a folding library of the size of ViennaRNA) in its own file.

For the other questions, I'm afraid you're going to have to get in touch with Justin (kws4769) and/or Jee.

And I agree, the dev server sounds perfect for the stage your project is in. Another advantage I can see about using the repository: it possibly makes it easier for other devs to contribute. But it can only work if this repository fits whatever requirements your project has.

-- ElNando888 (talk) 06:08, 15 July 2013 (UTC)

== Suggestion(s) ==

Hi Omei,

I finally got around to taking a look at your tool. Very impressive! :)

Out of curiosity, I did a little code review. I noticed a section of the code that may cause problems, around line 883 (as of your version of 8-20-2013).

  • Since lengths of lab sequences are now dynamic, it may be better to use gLabInfo.secstruct.length-19 instead of the 63 constant.
  • If I'm not mistaken, the lab tail lengths are 5 and 20, not 6 and 20.
  • If the point is to determine whether to add tails or not, detecting the barcode in the structure may not be the wisest choice. The data pulled from the server about the lab should contain a "usetail" field, precisely for this purpose. Edit: actually, if the query is type=lab, then tails always have to be added. Maybe you should ask Rhiju or Jee.
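For illustration only, the three suggestions above could be sketched roughly as follows. This is not the tool's actual code: the helper names are hypothetical, the discussion does not say what the position 63 marks, and the exact type of the "usetail" field (number, string, or boolean) is an assumption.

```javascript
// Tail lengths as suggested in the review above (5' and 3' ends).
const HEAD_TAIL_LENGTH = 5;
const END_TAIL_LENGTH = 20;

// Hypothetical helper: derive the position from the structure length
// instead of hard-coding 63. For an 82-character structure this still
// yields 63; for other lengths it moves with the structure.
function offsetFromLength(labInfo) {
  return labInfo.secstruct.length - 19;
}

// Hypothetical helper: read the "usetail" field rather than detecting
// the barcode. Assumption: the field may arrive as true, 1, or "1".
function usesTails(labInfo) {
  return labInfo.usetail === true ||
         labInfo.usetail === 1 ||
         labInfo.usetail === "1";
}

// Length of the full sequence once tails are (or are not) added.
function fullLength(labInfo) {
  const designLength = labInfo.secstruct.length;
  return usesTails(labInfo)
    ? designLength + HEAD_TAIL_LENGTH + END_TAIL_LENGTH
    : designLength;
}
```

Keeping the tail lengths in named constants means a correction such as "5, not 6" only has to be made in one place.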

Other than that, do you always code so cleanly even for hobby projects? I mean, I love it, it's so comfortable to read and all, but I doubt I would spend much effort on code readability and maintainability for an unpaid project. I'm way too lazy for that :P

Anyway, great job, please keep it up! :)

-- ElNando888 (talk) 10:19, 27 August 2013 (UTC)

----

Thanks for the code review, Nando.  Your points are all well taken.

As for coding style, yes this is pretty typical.  If I'm writing some quick one-use code, I might well do it as one big, uncommented block of code.  But if I expect to have to live with it for more than a week, I quickly get tired of working on that kind of code, even if I wrote it myself. :-)


Now the real question is -- will you find the tool useful?

Omei (talk) 17:55, 28 August 2013 (UTC)

----

Hi Omei,

Currently, I'm missing a feature. Do you think you could expand the filtering to simple formulas? For instance, (S5+S6+S7)<0.5

-- ElNando888 (talk) 11:55, 29 August 2013 (UTC)

----

It is certainly possible.  Priority will depend on both estimated effort and estimated utility.  Can you elaborate on how it would be useful?  Would you really need general expressions, or just addition?

Also, I would expect you, of all people, to be able to do this for yourself.  I know you implied you had never used Javascript before.  But for what would be needed to add this feature, you wouldn't need to do much more than look at the reference for the Javascript string functions and then code like you were writing in C.  And there's essentially no tool chain to learn.  Either use Chrome as it comes out of the box or Firefox with the Firebug extension.
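As a rough idea of how little code the addition-only case might take, here is a minimal sketch using a regular expression and the built-in string functions. It is not part of the tool: the function names, the accepted syntax, and the shape of a data row (an object mapping column names such as "S5" to numbers) are all assumptions made for the example.

```javascript
// Hypothetical filter syntax: a sum of column names, optionally in
// parentheses, then a comparison operator and a numeric threshold,
// for example "(S5+S6+S7)<0.5".
function parseSumFilter(text) {
  const pattern = /^\s*\(?([^()<>=]+)\)?\s*(<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$/;
  const match = pattern.exec(text);
  if (match === null) {
    throw new Error("Unsupported filter: " + text);
  }
  const columns = match[1].split("+").map(function (name) {
    return name.trim();
  });
  if (columns.some(function (name) { return name === ""; })) {
    throw new Error("Empty column name in filter: " + text);
  }
  return {
    columns: columns,
    operator: match[2],
    threshold: parseFloat(match[3])
  };
}

// Assumption: a row is a plain object such as { S5: 0.1, S6: 0.2 }.
// Rows with a missing or non-numeric column never match.
function matchesFilter(row, filter) {
  let sum = 0;
  for (const name of filter.columns) {
    const value = row[name];
    if (typeof value !== "number" || Number.isNaN(value)) {
      return false;
    }
    sum += value;
  }
  switch (filter.operator) {
    case "<": return sum < filter.threshold;
    case "<=": return sum <= filter.threshold;
    case ">": return sum > filter.threshold;
    case ">=": return sum >= filter.threshold;
    default: return false;
  }
}

// Usage: keep only the rows whose summed values fall below the threshold.
const exampleFilter = parseSumFilter("(S5+S6+S7)<0.5");
const exampleRows = [
  { S5: 0.1, S6: 0.1, S7: 0.2 },
  { S5: 0.3, S6: 0.3, S7: 0.1 }
];
const keptRows = exampleRows.filter(function (row) {
  return matchesFilter(row, exampleFilter);
});
```

General expressions (multiplication, nesting, several comparisons) would need a real expression parser; restricting the syntax to sums keeps the parsing to one pattern match and one split.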

Omei (talk) 23:02, 29 August 2013 (UTC)

----