User:ElNando888/WikiGetSat/Ideas/Sequences sorting in labs


Bottom-level, the discussion.

Compare with

One beautiful feature would be the possibility to wikify those contents.


If I'm not mistaken, sequences in labs can be sorted, and the metric currently in use seems to be the Hamming distance.

I'd like to propose a new sorting algorithm (which I dubbed “LDq9”), based on a Lee distance metric with a pseudo-alphabet of size 9 (or more). An example mapping would be:


This would result in the following pairwise distances:

A:G = 1
U:C = 1

G:C = 4
G:U = 3
A:U = 4
A:C = 4
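
As a rough illustration of how such a Lee distance can be computed: the sketch below places each nucleotide on a ring of 9 positions and takes the shorter way around. The position values (A=0, G=1, U=4, C=5) are an assumption, picked only because they reproduce the six distances listed above; other assignments (shifted or mirrored) give the same result. The names `POSITION` and `lee_distance` are hypothetical.

```python
# Ring size of the pseudo-alphabet ("q9" in LDq9).
Q = 9

# Hypothetical positions on the ring, chosen to match the distances listed above.
# This is an assumption, not necessarily the mapping from the original proposal.
POSITION = {"A": 0, "G": 1, "U": 4, "C": 5}


def lee_distance(x, y, q=Q):
    """Lee distance between two nucleotides: the shorter way around a ring of q positions."""
    d = abs(POSITION[x] - POSITION[y])
    return min(d, q - d)


# Print the six pairwise distances for comparison with the list above.
for a, b in [("A", "G"), ("U", "C"), ("G", "C"), ("G", "U"), ("A", "U"), ("A", "C")]:
    print(f"{a}:{b} = {lee_distance(a, b)}")
```

With these positions, purine-to-purine and pyrimidine-to-pyrimidine changes cost 1, and any change of class costs 3 or 4.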

The basic idea is simply that changes within the same nucleotide class (purines or pyrimidines) represent a short distance, while a change of class represents a larger jump.

I believe that this would give a somewhat better view of the similarity of sequences, especially in the context of switches.

-- ElNando888

Nice Idea.

-- jandersonlee

Hmm. Suppose I change a GC bond to CG. How does that get scored? And should it differ if it's a switch lab or not?

-- jandersonlee

GC to CG would be a +8 step (4 for G to C, plus 4 for C to G).
I don't think the metric should change between static labs and switch ones, but this idea of mine may turn out to make little difference compared with a simple Hamming distance in the case of static target structures.
For switches, I'm almost convinced that this sorting would be a lot more accurate.
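
To make the comparison concrete, here is a minimal sketch that sums the per-position Lee distance over two equal-length sequences and sorts a few designs by it, next to a plain Hamming sort. The ring positions (A=0, G=1, U=4, C=5) are an assumption chosen to be consistent with the distances discussed above, the sequences are made up for illustration, and `ldq9` and `hamming` are hypothetical helper names, not anything in the game's code.

```python
Q = 9  # ring size of the pseudo-alphabet

# Hypothetical ring positions (an assumption consistent with the distances above).
POSITION = {"A": 0, "G": 1, "U": 4, "C": 5}


def ldq9(seq_a, seq_b):
    """Sum of per-position Lee distances between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    total = 0
    for x, y in zip(seq_a, seq_b):
        d = abs(POSITION[x] - POSITION[y])
        total += min(d, Q - d)
    return total


def hamming(seq_a, seq_b):
    """Number of positions at which the two sequences differ."""
    return sum(x != y for x, y in zip(seq_a, seq_b))


# Made-up example data: a reference design and a few variants of it.
reference = "GGAAACC"
designs = ["CGAAACG", "AGAAACU", "GGAAACC", "GGUAACC"]

by_ldq9 = sorted(designs, key=lambda s: ldq9(reference, s))
by_hamming = sorted(designs, key=lambda s: hamming(reference, s))

print("LDq9:   ", [(s, ldq9(reference, s)) for s in by_ldq9])
print("Hamming:", [(s, hamming(reference, s)) for s in by_hamming])
```

In this toy data, flipping the outer G/C pair to C/G ("CGAAACG") and changing both ends within their class ("AGAAACU") are both 2 under Hamming, but 8 and 2 respectively under the Lee-based score, so the two sorts order them differently.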

-- ElNando888

Worth a try if the coding isn't too much.

-- eternacac
