Comparing the EquivalentClassPartitioner and PartitionRefiner : Fullerenes

A comment on my previous post reminded me of the "EquivalentClassPartitioner" already in the CDK, written back in 2003 by Junfeng Hao and based on this article by Chang-Yu Hu and Lu Xu. After some testing, it seems that it and the PartitionRefiner give the same results on various molecules, although if you only want the equivalence classes, the Hu/Xu method is much, much faster: something like 10-100 times faster.

The test molecules I used for comparing speeds are a library of fullerenes that range in size from 20 carbons up to 720. Naturally, I started with the smaller ones, but even there the difference was clear. For instance, here is a table of numbers from the C40 run:

The left-hand column is just the name of the cc1 file, the next two columns are the times for the AtomPartitionRefiner and EquivalentClassPartitioner, and the last two are the order of the automorphism group and the number of equivalence classes. Times are in milliseconds, so clearly the HuXu method is far faster, at only one or two ms rather than 30-70 ms. There is presumably some VM warm-up going on that accounts for the apparent increase in speed over the first few examples.
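To make the last two columns concrete, here is a toy Python sketch that computes both numbers by brute force for tiny graphs. It is not the CDK code and not the Hu/Xu method; the function names and the two example skeletons (a four-atom chain and a four-membered ring) are made up for illustration, and trying every permutation is only feasible for a handful of atoms.

```python
from itertools import permutations

def automorphisms(n, edges):
    """Brute force: every vertex permutation that maps each edge onto an edge."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for perm in permutations(range(n)):
        # A permutation is a bijection, so sending every edge to an edge
        # is enough to guarantee the whole edge set is preserved.
        if all(frozenset((perm[a], perm[b])) in edge_set for a, b in edges):
            found.append(perm)
    return found

def orbits(n, perms):
    """Equivalence classes: vertices that some automorphism maps onto each other."""
    classes, seen = [], set()
    for v in range(n):
        if v in seen:
            continue
        orbit = sorted({p[v] for p in perms})
        seen.update(orbit)
        classes.append(orbit)
    return classes

# Hypothetical example skeletons (heavy atoms only).
chain = [(0, 1), (1, 2), (2, 3)]          # four atoms in a row
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]   # four-membered ring

for name, edges in (("chain", chain), ("ring", ring)):
    auts = automorphisms(4, edges)
    print(name, "|Aut| =", len(auts), "classes =", orbits(4, auts))
```

The chain has only the identity and the end-to-end flip, giving two classes (ends and middles), while the ring has eight symmetries and a single class.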

It's a similar picture for larger fullerenes: for a typical C92, the PartitionRefiner takes 700 ms and HuXu only 7 ms. There is a speedup for more symmetric molecules - the larger the value of |Aut| (the group order), the faster the PartitionRefiner is. This is to be expected, as it is using the automorphisms to prune the search.
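As a rough illustration of why known automorphisms are so useful, the sketch below merges atoms into orbits from just a few permutations using union-find. This is the kind of information a refinement search can use to skip equivalent branches, but it is an assumption-laden toy, not how the CDK classes are actually implemented; `orbits_from_generators` and the six-ring example are hypothetical.

```python
def orbits_from_generators(n, generators):
    """Orbits of the group generated by the given permutations (union-find)."""
    parent = list(range(n))

    def find(x):
        # Walk up to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for g in generators:
        for v in range(n):
            a, b = find(v), find(g[v])
            if a != b:
                # v and its image g[v] are equivalent, so merge their classes.
                parent[max(a, b)] = min(a, b)

    classes = {}
    for v in range(n):
        classes.setdefault(find(v), []).append(v)
    return sorted(classes.values())

# Hypothetical six-membered ring, atoms numbered 0..5 around the ring.
rotation = (1, 2, 3, 4, 5, 0)    # shift every atom one place round the ring
reflection = (0, 5, 4, 3, 2, 1)  # mirror through atoms 0 and 3

print(orbits_from_generators(6, [reflection]))            # mirror only
print(orbits_from_generators(6, [rotation, reflection]))  # full ring symmetry
```

With the mirror alone the atoms fall into four classes; adding the rotation collapses everything into one, without ever listing all twelve symmetries of the ring.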

The tests and some output are in this github repo (fullerene library not included).

Finally, here is another image of a fullerene (one of the test cases from the EquivalentClassPartitioner):

because why not. Colourful, is it not?


Gilleain, do I understand the coloring correctly that this fullerene only has one symmetrical phenyl ring, the outer one?
gilleain said…
That's correct. The gray carbon (39) seems to be 'disordering' the structure from one end. Like a pebble dropped in a still pond. Sort of.

I really should try getting the Schlegel layout stuff working on fullerenes. I got close over the summer, but the optimisation (annealing) part was broken.
J May said…
In my mind the automorphism group is a lot harder to calculate than the equivalence classes, so the speed difference is expected. The code looks great, just finished reviewing and about to sign off.
gilleain said…
True - it is harder to calculate. However, I have a sneaking suspicion that you can get the group from the equivalence classes, and that this might be faster. I don't know what the algorithm is, exactly; I've tried simple ways to do it, but they didn't seem to work...