User:ElNando888/WikiGetSat/Ideas/Sequences sorting in labs: Difference between revisions
Revision as of 19:29, 12 June 2018
Bottom-level, the discussion.
Compare with https://getsatisfaction.com/eternagame/topics/sequences_sorting_in_labs
One beautiful feature would be the possibility to wikify those contents.
----
If I'm not mistaken, sequences in labs can be sorted, and the algorithm currently in use seems to be the Hamming distance (http://en.wikipedia.org/wiki/Hamming_distance).
I'd like to propose a new sorting algorithm (which I dubbed “LDq9”), based on a Lee distance (http://en.wikipedia.org/wiki/Lee_distance) metric with a pseudo-alphabet of size 9 (or more). An example mapping would be:
A.G.-.-.U.C.-.-.-
0.1.2.3.4.5.6.7.8
This would result in the following specific distances:
A:G = 1
U:C = 1
G:C = 4
G:U = 3
A:U = 4
A:C = 4
The basic idea is simply that changes within the same nucleotide class (purines or pyrimidines) represent a short distance, while a change of class represents a larger jump.
I believe that this would give a somewhat better view of the similarity of sequences, especially in the context of switches.
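As a rough sketch of how the per-nucleotide distances above fall out of the mapping, assuming the example positions A=0, G=1, U=4, C=5 on a ring of size 9 (the names `LDQ9_CODE`, `Q` and `lee_distance` are made up for illustration and are not part of any EteRNA code):

```python
# Hypothetical encoding taken from the example mapping above.
# Positions 2, 3 and 6-8 of the pseudo-alphabet are left unused.
LDQ9_CODE = {"A": 0, "G": 1, "U": 4, "C": 5}
Q = 9  # size of the pseudo-alphabet


def lee_distance(a: str, b: str, q: int = Q) -> int:
    """Lee distance between two nucleotides placed on a ring of q positions."""
    d = abs(LDQ9_CODE[a] - LDQ9_CODE[b])
    # On a ring, the shorter way around may wrap past position q - 1.
    return min(d, q - d)


if __name__ == "__main__":
    for pair in ("AG", "UC", "GC", "GU", "AU", "AC"):
        print(f"{pair[0]}:{pair[1]} = {lee_distance(pair[0], pair[1])}")
```

Note that A:C comes out as 4 rather than 5 because the Lee distance wraps around the ring (9 - 5 = 4).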
-- ElNando888
----
Nice idea.
-- jandersonlee
----
Hmm. Suppose I change a GC bond to CG. How does that get scored? And should it differ if it's a switch lab or not?
-- jandersonlee
----
GC to CG would be a +8 step.
I don't think the metric should change between static labs and switch ones, but this idea of mine may turn out to make little difference compared with a simple Hamming distance in the case of static target structures.
For switches, I'm almost convinced that this sorting would be a lot more accurate.
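To illustrate the GC to CG case, here is a minimal sketch that sums the per-position Lee distances over two equal-length sequences and sorts a list of sequences by their distance to a reference. It assumes the example mapping A=0, G=1, U=4, C=5 with q=9; the function names and the short test sequences are invented for the example and do not come from the lab interface.

```python
# Hypothetical encoding from the example mapping (A=0, G=1, U=4, C=5, q=9).
LDQ9_CODE = {"A": 0, "G": 1, "U": 4, "C": 5}


def ldq9(seq_a: str, seq_b: str, q: int = 9) -> int:
    """Sum of per-position Lee distances between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        d = abs(LDQ9_CODE[a] - LDQ9_CODE[b])
        total += min(d, q - d)
    return total


def hamming(seq_a: str, seq_b: str) -> int:
    """Number of positions at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def sort_by_ldq9(sequences, reference):
    """Return the sequences ordered by increasing LDq9 distance to reference."""
    return sorted(sequences, key=lambda s: ldq9(s, reference))


if __name__ == "__main__":
    reference = "GAAC"  # made-up sequence whose first and last bases form a GC pair
    for candidate in ("CAAG", "AAAU"):
        print(candidate, "Hamming:", hamming(candidate, reference),
              "LDq9:", ldq9(candidate, reference))
    print(sort_by_ldq9(["CAAG", "AAAU", "GAAC"], reference))
```

In this toy case Hamming scores both candidates as 2 changes, while LDq9 gives 8 for the flipped pair (G to C plus C to G, 4 each) and 2 for the two within-class changes (G to A and C to U).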
-- ElNando888
----
Worth a try if the coding isn't too much.
-- eternacac
----