A bug was submitted to the CDK SourceForge tracker (bug number 2783741) with a list of molecules that are laid out badly. I had a look at some of them with the help of Bioclipse. For example, this calix[4]arene:
which is a clearer case of something going wrong. More difficult are structures that are fully 3D, like:
Can you guess what it is? :) Try the 3D version (also made with bioclipse, using the CDK 3D layout):
It's a paracyclophane! The phenyl rings are lost in the 2D layout because there is a bigger 'ring'. Perhaps a chemist would look at the 3D structure and think that those chains are linkers, not parts of a ring, but the algorithm doesn't know this.
I think that it is difficult to have general rules for this. Of course, any fully 3D structure will be difficult to lay out in 2D (if it is not embeddable in the plane then it is impossible) so things like this:
are truly awful.
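As a rough illustration of the embeddability point: any simple planar graph with V ≥ 3 vertices has at most 3V − 6 edges, so a quick bond count can rule some structures out before any layout is attempted. This is only a sketch. The function name `may_be_planar` is made up for this post and is not part of the CDK. The test is necessary but not sufficient, so a `True` result does not prove that a flat drawing without crossings exists.

```python
def may_be_planar(n_atoms: int, n_bonds: int) -> bool:
    """Necessary (not sufficient) planarity check based on the edge bound.

    A simple planar graph with V >= 3 vertices has at most 3V - 6 edges.
    Returning False means the graph is certainly not planar; returning
    True only means this particular bound does not rule it out.
    """
    if n_atoms < 3:
        return True
    return n_bonds <= 3 * n_atoms - 6


# Benzene skeleton: 6 atoms, 6 bonds, well within the bound.
print(may_be_planar(6, 6))
# Complete graph on 5 vertices (K5): 10 edges exceeds 3*5 - 6 = 9.
print(may_be_planar(5, 10))
```

A full planarity test needs a real algorithm; the bound just shows why some graphs cannot be drawn without crossing bonds, however clever the layout code is.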
Comments
Or, an algorithm to lay out ring systems of 10 bonds and larger in a 120° angle layout... sort of like a multi-phenyl system, but then without the inner atoms, and just the edge...
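A minimal sketch of that idea, assuming the simplest possible shape: the outline of a row of linearly fused hexagons (naphthalene-like for 10 atoms, anthracene-like for 14, and so on). The function `acene_outline` is hypothetical and not part of the CDK. It only handles ring sizes of the form 4k + 2, and other sizes would need a different fused-hexagon outline.

```python
import math


def acene_outline(n_atoms: int, bond_length: float = 1.0):
    """Place a ring of n_atoms on the perimeter of k fused hexagons.

    Only sizes n_atoms = 4k + 2 (6, 10, 14, ...) are supported. Every
    bond has length bond_length and every bond angle is 120 degrees.
    Returns a list of (x, y) tuples in ring order.
    """
    if n_atoms < 6 or (n_atoms - 2) % 4 != 0:
        raise ValueError("this sketch only handles ring sizes 4k + 2")
    k = (n_atoms - 2) // 4
    dx = bond_length * math.sqrt(3) / 2
    # Top zigzag, left to right: y alternates between half and full bond length.
    top = [(j * dx, bond_length * (1.0 if j % 2 else 0.5))
           for j in range(2 * k + 1)]
    # Bottom zigzag is the mirror image, walked right to left to close the ring.
    bottom = [(x, -y) for x, y in reversed(top)]
    return top + bottom


for x, y in acene_outline(10):
    print(f"{x:.3f} {y:.3f}")
```

The two atoms where the hexagons would be fused end up with the ring bending inwards, which is what gives the "edge only" look.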
This is when I found that the template matcher took bond orders into account, so these templates only match the cyclo-foo-ane, and not the variants with -[di|tri|etc]-enes, -ynes, etc...
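To make the effect concrete, here is a toy sketch (not the CDK template matcher) in which a ring is just the list of its bond orders in ring order. The name `ring_matches` and the `use_bond_orders` flag are invented for illustration.

```python
def ring_matches(template, ring, use_bond_orders=True):
    """Compare two simple rings given as lists of bond orders in ring order.

    With use_bond_orders=True the ring must equal the template under some
    rotation or reflection. With use_bond_orders=False only the ring size
    is compared, so unsaturated variants also match.
    """
    if len(template) != len(ring):
        return False
    if not use_bond_orders:
        return True
    n = len(ring)
    for seq in (ring, ring[::-1]):
        for shift in range(n):
            if seq[shift:] + seq[:shift] == template:
                return True
    return False


cyclodecane = [1] * 10        # all single bonds
cyclodecene = [2] + [1] * 9   # one double bond

print(ring_matches(cyclodecane, cyclodecene))
print(ring_matches(cyclodecane, cyclodecene, use_bond_orders=False))
```

With bond orders included, the cyclodecane template rejects cyclodecene; ignoring them lets one template cover the whole family of ring sizes.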