Exploring the wild beasts of the layout jungle

There was a bug submitted to the CDK SourceForge tracker (bug number 2783741) with a list of molecules that are laid out badly. I had a look at some of them with the help of Bioclipse. For example, this calix[4]arene:

or this:

which is a clearer case of something going wrong. More difficult are structures that are fully 3D, like:

Can you guess what it is? :) Try the 3D version (also made with Bioclipse, using the CDK 3D layout):

It's a paracyclophane! The phenyl rings are lost in the 2D layout because there is a bigger 'ring'. Perhaps a chemist would look at the 3D structure and think that those chains are linkers, not parts of a ring, but the algorithm doesn't know this.

I think it is difficult to have general rules for this. Of course, any fully 3D structure will be difficult to lay out in 2D (and if its graph cannot be embedded in the plane, a drawing without crossing bonds is impossible), so things like this:

are truly awful.

Comments

Egon Willighagen said…

These larger ring systems are a problem indeed. I recently added some templates to cover some of the larger ring systems, giving them the more common 120 degree angles, but more are needed...

Or, an algorithm to lay out ring systems of 10 bonds and larger into a 120 degree angle layout... sort of like a multi-phenyl system, but then without the inner atoms, and just the edge...

This is when I found that the template matcher took bond orders into account, so these templates only match the cyclo-foo-ane, and not the variants with -[di|tri|etc]-enes, -ynes, etc...

1 May 2009 at 06:23

Unknown said…

The current standard in SDG is set by the guys from CCG (http://dx.doi.org/10.1021/ci050550m). Those large rings, btw, can be easily laid out with a honeycomb-embedding algorithm and need no templates.

4 May 2009 at 12:10

Egon Willighagen said…

Yes, indeed, no honeycomb templates needed once we have an implementation of that algorithm.

4 May 2009 at 12:15

chalky

I wanted to show something that hints at the things that the new architecture can afford us: This is using a Java2D graphics Paint object to make it look like chalk... kind of. It's a very simplistic way of doing it: make a small image with a random number of white, gray, light gray, and black pixels, and use it as the paint. Edit: it doesn't look so good at small scales, so some tweaking of stroke widths and so on is essential.
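The chalk effect described above could be sketched roughly as follows. This is not the renderer's actual code, just a minimal illustration using the standard `java.awt.TexturePaint`; the class name `ChalkPaint`, the methods `makeTile` and `makeChalkPaint`, the tile size, the fixed seed, and the uniform choice between the four colours are all assumptions made for the example.

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.TexturePaint;
import java.awt.geom.Line2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.util.Random;

// Hypothetical helper class, not part of CDK or JChemPaint.
public class ChalkPaint {

    // The four pixel colours mentioned in the post.
    private static final Color[] PALETTE = {
        Color.WHITE, Color.GRAY, Color.LIGHT_GRAY, Color.BLACK
    };

    /** Builds a small square tile whose pixels are picked at random from the palette. */
    static BufferedImage makeTile(int size, long seed) {
        BufferedImage tile = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        Random random = new Random(seed);
        for (int x = 0; x < size; x++) {
            for (int y = 0; y < size; y++) {
                // Assumption: each colour is equally likely.
                Color c = PALETTE[random.nextInt(PALETTE.length)];
                tile.setRGB(x, y, c.getRGB());
            }
        }
        return tile;
    }

    /** Wraps the tile in a TexturePaint that repeats it across whatever is drawn. */
    static TexturePaint makeChalkPaint(int size, long seed) {
        BufferedImage tile = makeTile(size, seed);
        return new TexturePaint(tile, new Rectangle2D.Double(0, 0, size, size));
    }

    public static void main(String[] args) {
        // Draw one thick "bond" on a black background using the chalk paint.
        BufferedImage canvas = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = canvas.createGraphics();
        g.setColor(Color.BLACK);
        g.fillRect(0, 0, canvas.getWidth(), canvas.getHeight());
        g.setPaint(makeChalkPaint(8, 42L));
        g.setStroke(new BasicStroke(6f)); // thin strokes show little of the texture
        g.draw(new Line2D.Double(20, 50, 180, 50));
        g.dispose();
    }
}
```

Because the paint is just a repeating tile, the stroke width decides how much of the noise is visible, which is one reason the effect degrades at small scales.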

Adamantane, Diamantane, Twistane

After cubane, the thought occurred to look at other regular hydrocarbons. If only there was some sort of classification of chemicals that I could use to look up similar structures. Oh wait, there is. Anyway, adamantane is not as regular as cubane, but it is highly symmetrical, looking like three cyclohexanes fused together. The vertices fall into two different types when colored by signature: The carbons with three carbon neighbours (degree-3, in the simple graph) have signature (a) and the degree-2 carbons have signature (b). Atoms of one type are only connected to atoms of another - the graph is bipartite. Adamantane connects together to form diamondoids (or, rather, this class has adamantane as a repeating subunit). One such is diamantane, which is no longer bipartite when colored by signature: It has three classes of vertex in the simple graph (a and b), as the set with degree-3 has been split in two. The tree for signature (c) is not shown. The graph is still bipartite accordin...

1,2-dichlorocyclopropane and a spiran

As I am reading a book called "Symmetry in Chemistry" (H. H. Jaffé and M. Orchin) I thought I would try out a couple of examples that they use. One is 1,2-dichlorocyclopropane: which is, apparently, dissymmetric because it has a symmetry element (a C2 axis) but is optically active. Incidentally, wedges can look horrible in small structures - this is why: The box around the hydrogen is shaded in grey, to show the effect of overlap. A possible fix might be to shorten the wedge, but sadly this would require working out the bounds of the text when calculating the wedge, which has to be done at render time. Oh well. Another interesting example is this 'spiran', which I can't find on ChEBI or ChemSpider: Image again courtesy of JChemPaint. I guess the problem marker (the red line) on the N suggests that it is not a real compound? In any case, some simple code to determine potential chiral centres (using signatures) finds 2 in the cyclopropane structure, and 4 in the ...
