
Posts

Showing posts from January, 2009

Spectral Full House

So, all of the isomers of C4H11N are in NMRShiftDB, and here are all of the experimental and predicted carbon spectra:
It's not obvious from this picture, but not all of the predicted spectra are unique matches for their experimental partners. In other words, for some of these molecules you could not pick out the right structure just by comparing the predicted and experimental spectra.
The situation is more difficult still for larger isomer spaces, where the predicted spectra may be exactly the same for subsets of the isomers. There are still many isomers with unique predictions, but the rest fall into spectrally equivalent sets whose sizes follow a sort of power-law distribution.
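To make the idea of spectral-equivalent sets a bit more concrete, here is a minimal Python sketch of how such sets could be collected. It is not the code used for these posts: the function names, the rounding to 0.1 ppm, and the shift values in the toy data are all made up for illustration.

```python
from collections import Counter, defaultdict


def spectral_equivalence_classes(predicted, ndigits=1):
    """Group structure names whose predicted shift lists are identical.

    `predicted` maps a structure name to a list of predicted shifts (ppm).
    Shifts are rounded and sorted first, so peak order does not matter.
    The rounding precision is an assumption, not something from the post.
    """
    groups = defaultdict(list)
    for name, shifts in predicted.items():
        key = tuple(sorted(round(s, ndigits) for s in shifts))
        groups[key].append(name)
    return sorted(sorted(group) for group in groups.values())


def class_size_histogram(classes):
    """Count how many equivalence classes there are of each size."""
    return dict(Counter(len(c) for c in classes))


# Hypothetical toy data: iso1 and iso2 share the same predicted spectrum.
toy_predicted = {
    "iso1": [14.1, 22.7, 31.9],
    "iso2": [31.9, 14.1, 22.7],
    "iso3": [11.0, 29.5],
    "iso4": [25.0],
}

classes = spectral_equivalence_classes(toy_predicted)
print(classes)
print(class_size_histogram(classes))
```

Plotting the histogram of class sizes for a larger isomer space would be one way to check the power-law impression.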
EDIT: As per a suggestion by Egon, here is a table of top hits (a yellow square indicates the top match):
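A table like this can be sketched in a few lines of Python. The distance measure below (a symmetric nearest-peak distance) is only an assumption chosen for illustration, not necessarily the comparison used for the real table, and the spectra are invented.

```python
def peak_distance(spec_a, spec_b):
    """Symmetric nearest-peak distance between two lists of shifts (ppm)."""
    def one_way(xs, ys):
        # Mean distance from each peak in xs to its closest peak in ys.
        return sum(min(abs(x - y) for y in ys) for x in xs) / len(xs)
    return one_way(spec_a, spec_b) + one_way(spec_b, spec_a)


def top_hits(experimental, predicted):
    """For each experimental spectrum, list the best-scoring predicted ones.

    More than one name in a list means the top hit is ambiguous.
    """
    hits = {}
    for name, exp in experimental.items():
        scores = {cand: peak_distance(exp, pred)
                  for cand, pred in predicted.items()}
        best = min(scores.values())
        hits[name] = sorted(c for c, s in scores.items() if s == best)
    return hits


# Hypothetical toy spectra: B and C have identical predictions.
toy_experimental = {"A": [10.0, 40.0], "B": [20.0, 60.0], "C": [22.0, 58.0]}
toy_predicted = {"A": [11.0, 41.0], "B": [21.0, 59.0], "C": [21.0, 59.0]}

for name, best in top_hits(toy_experimental, toy_predicted).items():
    print(name, "->", best)
```

In this toy case A gets a unique top hit, while B and C cannot be told apart, which is the situation described above.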

C4H11N network, labelled with pictures

A really tiny isomer set, with a particularly regular network:

Nicely enough, all 8 of these structures are in NMRShiftDB, so it will be possible to compare all against all.

One with smiles strings on

Slightly more useful/comprehensible.
Also, this is a more interesting isomer space (C3H7ON) which is more fragmented at a 50% similarity cutoff.
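How fragmented a network is at a given cutoff can be measured by counting connected components. The sketch below is plain Python with hypothetical names and made-up similarity values; the actual pictures were drawn with other tools.

```python
def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def fragments_at_cutoff(nodes, similarities, cutoff):
    """Keep only pairs with similarity above the cutoff, then find components."""
    edges = [pair for pair, sim in similarities.items() if sim > cutoff]
    return connected_components(nodes, edges)


# Hypothetical pairwise similarities between four structures.
toy_nodes = ["a", "b", "c", "d"]
toy_sims = {("a", "b"): 0.8, ("b", "c"): 0.55, ("c", "d"): 0.3}

print(fragments_at_cutoff(toy_nodes, toy_sims, 0.5))
print(fragments_at_cutoff(toy_nodes, toy_sims, 0.6))
```

Raising the cutoff can only remove edges, so the number of fragments never goes down as the cutoff increases.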

just one more

C8H16 this time. The central bridge has split into two parts, it seems. The clustering is somewhat artificial, I suppose...
An example from one of these is C=C(C(C)C)C(C)C, and from the other is C=CC(CC)CCC. Hmmm.

another network

Another one (C7H14 this time):
It's a bit cumbersome, but I managed to get SMILES strings to show as tooltip text. This tells me that the three vertices in the center of this picture (the bridging ones) are C=C(C)CCCC, C=C(CC)C(C)C, and C=C(CC)CCC. Not sure why.

Oh, and the structures in the 'lobe' on the left seem to be (all?) non-cyclic, while the ones on the right seem to be cyclic, which makes sense.
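The "(all?)" could be checked without looking at every tooltip. A rough Python sketch: in simple SMILES like the ones here, a ring shows up as ring-closure digits (or a `%nn` label), so looking for those outside of bracket atoms is enough. This is a shortcut under that assumption, not a real SMILES parser, and the function name is made up.

```python
import re


def has_ring(smiles):
    """Rough check for a ring closure in a SMILES string.

    Bracket atoms such as [13CH4] or [NH4+] are stripped first, since
    digits inside brackets are not ring closures.
    """
    without_brackets = re.sub(r"\[[^\]]*\]", "", smiles)
    return any(ch.isdigit() or ch == "%" for ch in without_brackets)


# The three bridging structures quoted above, plus cycloheptane for contrast.
for smiles in ["C=C(C)CCCC", "C=C(CC)C(C)C", "C=C(CC)CCC", "C1CCCCCC1"]:
    print(smiles, "cyclic" if has_ring(smiles) else "non-cyclic")
```

Running this over the SMILES of each lobe would confirm (or not) the cyclic/non-cyclic split.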

isomer networks

Hmmm. Been a month. Oh well. Here is a picture of an 'isomer network':
for the isomers of C6H12. I generated them using molgen, compared the structures all-v-all with the CDK, and visualized the connectivity graph with the help of JUNG. So hardly any work actually by me...

Each vertex in the graph is a structure, and each edge joins two structures whose fingerprints have a Tanimoto similarity greater than 0.5. The cutoff is fairly arbitrary, but I just wanted to see if it worked. The next step is to use predicted spectrum similarity instead of molecular-fingerprint similarity.
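For anyone who wants to try the same thing without the CDK and JUNG, the edge rule is easy to sketch in plain Python. Here a fingerprint is just a set of 'on' bit positions, and the toy fingerprints are invented; the real ones came from the CDK.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity: shared 'on' bits divided by all 'on' bits."""
    if not bits_a and not bits_b:
        return 1.0  # convention for two empty fingerprints (an assumption)
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def similarity_edges(fingerprints, cutoff=0.5):
    """All-v-all comparison: return name pairs with similarity above cutoff."""
    edges = []
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        if tanimoto(fingerprints[name_a], fingerprints[name_b]) > cutoff:
            edges.append((name_a, name_b))
    return edges


# Hypothetical fingerprints as sets of 'on' bit positions.
toy_fingerprints = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {7, 8, 9},
}

print(similarity_edges(toy_fingerprints))  # A and B share 3 of 5 bits: 0.6
```

Swapping `tanimoto` for a spectrum-similarity function would give the spectral version of the network mentioned as the next step.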