Comparing the EquivalentClassPartitioner and PartitionRefiner : Fullerenes

A comment on my previous post reminded me of the "EquivalentClassPartitioner" already in the CDK, written back in 2003 by Junfeng Hao and based on this article by Chang-Yu Hu and Lu Xu. After some testing, it seems that it and the PartitionRefiner give the same results on various molecules. However, if you only want the equivalence classes, the Hu/Xu method is much, much faster: something like 10-100 times faster.

The test molecules I used for comparing speeds are a library of fullerenes that range in size from 20 carbons up to 720. Naturally, I started with the smaller ones, but even there the difference was clear. For instance, here is a table of numbers from the C40 run:



The left-hand column is just the name of the cc1 file, the next two columns are the times for the AtomPartitionRefiner and EquivalentClassPartitioner, and the last two are the order of the automorphism group and the number of equivalence classes. Times are in milliseconds, so the Hu/Xu method is clearly far faster, at only one or two ms rather than 30-70 ms. There is presumably some VM warm-up going on that accounts for the apparent increase in speed over the first few examples.
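One way to keep warm-up effects out of numbers like these is to run each method a few times before starting the clock. The sketch below shows the pattern in Python; it is not the harness used for the table above, and `time_runs`, `warmup` and `repeats` are hypothetical names chosen for illustration.

```python
import time


def time_runs(func, arg, warmup=3, repeats=5):
    """Call func(arg) `warmup` times untimed, then return `repeats` timings in ms."""
    # Untimed calls: give the runtime a chance to warm up (JIT, caches, class loading).
    for _ in range(warmup):
        func(arg)

    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        elapsed = time.perf_counter() - start
        timings_ms.append(elapsed * 1000.0)
    return timings_ms


# Usage: time a stand-in workload (sorting) instead of a real partitioner.
if __name__ == "__main__":
    print(time_runs(sorted, list(range(1000, 0, -1))))
```

Reporting the minimum or median of the timed runs, rather than a single run, also makes the comparison less sensitive to whichever molecule happens to be processed first.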

It's a similar picture for larger fullerenes: for a typical C92, the PartitionRefiner takes 700 ms and Hu/Xu only 7 ms. There is a speedup for more symmetric molecules: the larger the value of |Aut| (the group order), the faster the PartitionRefiner is. This is to be expected, as it is using the automorphisms to prune the search.
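The two result columns are linked: the equivalence classes are the orbits of the automorphism group, so once you have generators for the group you can read the classes off directly. Here is a minimal sketch of that step using union-find. It is not the CDK code; the function name `orbits` and the representation of each generator as a list (where `perm[i]` is the image of atom `i`) are assumptions made for the example.

```python
def orbits(n, generators):
    """Return the orbits of atoms 0..n-1 under the group generated by `generators`.

    Each generator is a permutation given as a list, with perm[i] the image of i.
    """
    parent = list(range(n))

    def find(x):
        # Find the representative of x, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Atoms i and perm[i] are always in the same orbit, so merge their sets.
    for perm in generators:
        for i, j in enumerate(perm):
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    classes = {}
    for v in range(n):
        classes.setdefault(find(v), []).append(v)
    return sorted(classes.values())


# A three-atom chain 0-1-2: the only non-trivial symmetry swaps the two ends.
print(orbits(3, [[2, 1, 0]]))                   # [[0, 2], [1]]

# A four-membered ring: a rotation plus a reflection make every atom equivalent.
print(orbits(4, [[1, 2, 3, 0], [0, 3, 2, 1]]))  # [[0, 1, 2, 3]]
```

Merging over the generators alone is enough, because every group element is a product of generators. The expensive part is finding the generators in the first place, which is where the PartitionRefiner spends its time.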

The tests, and some output, are in this github repo (fullerene library not included). Finally, here is another image of a fullerene (one of the test cases from the EquivalentClassPartitioner):


because why not. Colourful, is it not?

Comments

Gilleain, do I understand the coloring correctly that this fullerene only has one symmetrical phenyl ring, the outer one?
gilleain said…
That's correct. The gray carbon (39) seems to be 'disordering' the structure from one end. Like a pebble dropped in a still pond. Sort of.

I really should try getting the Schlegel layout stuff working on fullerenes. I got close over the summer, but the optimisation (annealing) part was broken.
John Mayfield said…
In my mind the automorphism group is a lot harder to calculate than the equivalent classes, so the speed difference is expected. The code looks great, just finished reviewing and about to sign off.
gilleain said…
True - it is harder to calculate. However, I have a sneaking suspicion that you can get the group from the equivalence classes, and that this might be faster. I don't know what the algorithm is, exactly, as I've tried simple ways to do it, that didn't seem to work...
