Posts

Showing posts from 2013

Festive Chemical Structure Generation : Necklaces and Trees!

So, a student asked me about a homework question that is a sub-problem of the structure generation problem. Basically, it was to count the number of chemical structures with exactly one cycle given the elemental formula. Of course, the best solution here is probably to use the Pólya Enumeration Theorem since all that was asked for was a count (enumeration) of the structures. Naturally, I have a different way to do this - especially since I don't really understand the mathematics of PET enough to implement it. So: The image shows a rough overview of how I might list all of the structures with a single cycle. It takes a number of necklaces (one shown), and a number of trees, and glues the one to the other in all possible ways. The word 'necklace' here is specifically the combinatorial object; so the cyclic sequence CNCNO is the same as CNOCN since you can rotate one to get the other. One tricky decision here is whether to add multiple bonds to the necklace be
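The rotation equivalence of necklaces can be sketched in a few lines of Python. This is only an illustration, not the code behind the post: the function names `canonical_necklace` and `necklaces` are made up here, and the brute-force enumeration over permutations is an assumption that only makes sense for small rings.

```python
from itertools import permutations


def canonical_necklace(seq):
    """Return the lexicographically smallest rotation of a cyclic string."""
    rotations = [seq[i:] + seq[:i] for i in range(len(seq))]
    return min(rotations)


def necklaces(atoms):
    """List the distinct necklaces (up to rotation only) on a multiset of symbols.

    Brute force: every arrangement is reduced to its canonical rotation and
    duplicates are dropped by the set. Reflections are NOT merged here.
    """
    return sorted({canonical_necklace("".join(p)) for p in permutations(atoms)})


if __name__ == "__main__":
    # The two cyclic sequences from the text reduce to the same canonical form.
    print(canonical_necklace("CNCNO"), canonical_necklace("CNOCN"))
    print(necklaces("CCNNO"))
```

Picking the smallest rotation as the representative means two sequences are the same necklace exactly when their canonical forms are equal, so a set is enough to remove duplicates.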

Comparing Kiraly (Exhaustive) Graph Generation with nAUTy Output

So recently I was asked about Király's method for generating all graphs from a degree sequence. While refactoring some of the code that I wrote to do this, I also made some tests. Specifically, coverage tests to check that the generation was actually exhaustive. I know it's redundant, but I have good tools to remove duplicate graphs - or I thought that I did… Here is a rough flowchart of the procedure, starting with a number ('n') that is passed to Dreadnaut (the interface to nAUTy) to generate graphs: These graphs are grouped by degree sequence, and these degree sequences are fed into the KirályHHGenerator to reconstruct the set of graphs. I think that compare arrow is wrong, but never mind. The point is that the sets should be the same size. They are for n=5,6,7 but not for 8. Oddly enough, however, there are more in the Király set than in the nAUTy set. The obvious conclusion would be that my duplicate detection is failing - in other words, I am failing
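The grouping step in the flowchart, plus a check that a degree sequence can be realised at all, might look like the sketch below. It is not Király's generator and not the KirályHHGenerator code: it is the plain Havel-Hakimi test, and the names `degree_sequence`, `group_by_degree_sequence` and `is_graphical` are hypothetical. Graphs are assumed to be simple and given as edge lists on vertices 0..n-1.

```python
from collections import defaultdict


def degree_sequence(n, edges):
    """Degree sequence of a simple graph on vertices 0..n-1, sorted descending."""
    degrees = [0] * n
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return tuple(sorted(degrees, reverse=True))


def group_by_degree_sequence(n, graphs):
    """Group a list of edge lists by their degree sequence."""
    groups = defaultdict(list)
    for edges in graphs:
        groups[degree_sequence(n, edges)].append(edges)
    return dict(groups)


def is_graphical(degrees):
    """Havel-Hakimi test: can this sequence be realised by a simple graph?"""
    seq = sorted(degrees, reverse=True)
    while seq and seq[0] > 0:
        d = seq.pop(0)          # take the largest remaining degree
        if d > len(seq):
            return False        # not enough other vertices to connect to
        for i in range(d):      # connect it to the d next-largest vertices
            seq[i] -= 1
            if seq[i] < 0:
                return False
        seq.sort(reverse=True)
    return True
```

Comparing the size of each group with the number of graphs regenerated from its degree sequence is then the coverage test described above.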

Centrality as a Vertex Invariant (or 'Atom Descriptor')

EDIT: After some more tests, I now realise that this is not really as great a vertex label/descriptor as I thought it was. For example, see these four graphs on 7 vertices that fail to distinguish vertices properly: The first one should have a central vertex in a different class than the other blue vertices. The green class in the second graph should be split, and the same for the third graph. And so on. So, in the last post I talked about the ideas of Randić et al. for calculating the 'centrality' of vertices in a graph. Interestingly, the numbers calculated for each vertex act as a kind of equivalence class label or vertex invariant. This is similar in many ways to Morgan numbers (sorry, Egon's post doesn't actually explain them, but they are the sum of degrees across extended neighbourhoods). For example, here is one of the examples from the previous post: With the centrality matrix in the middle, and the 'label' made by sorting the row eleme
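For comparison, Morgan numbers as described in the aside (degrees summed over ever larger neighbourhoods) can be sketched as follows. This is an illustration of the usual extended-connectivity iteration, not code from the post; `morgan_numbers` and the adjacency-dict representation are assumptions.

```python
def morgan_numbers(adj, rounds):
    """Iterated extended connectivity.

    adj maps each vertex to a list of its neighbours. Every vertex starts
    with its degree; in each round its value is replaced by the sum of its
    neighbours' current values.
    """
    values = {v: len(neighbours) for v, neighbours in adj.items()}
    for _ in range(rounds):
        values = {v: sum(values[u] for u in adj[v]) for v in adj}
    return values


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(morgan_numbers(path, 1))
```

Vertices with equal values fall in the same candidate class, which is the same way the centrality labels are being used here.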

Common Vertex Matrices of Graphs

There is an interesting set of papers out this year by Milan Randic et al. (sorry about the accents - Blogger seems to have a problem with accented 'c'...). I've looked at his work before here. [1] Common vertex matrix: A novel characterization of molecular graphs by counting [2] On the centrality of vertices of molecular graphs and one still in publication to do with fullerenes. The central idea here (ho ho) is a graph descriptor a bit like path lengths called 'centrality'. Briefly, it is the count of neighbourhood intersections between pairs of vertices. Roughly this is illustrated here: For the selected pair of vertices, the common vertices are those at the same distance from each - one at a distance of two and one at a distance of three. The matrix element for this pair will be the sum - 2 - and this is repeated for all pairs in the graph. Naturally, this is symmetric: At the right of the matrix is the row sum (∑) which can be ordered to
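Going only by the description above (count the vertices that are equally far from both members of a pair), the matrix could be computed roughly like this. It is a sketch from that one-sentence description rather than from the papers: the names are hypothetical, the graph is assumed to be connected, and setting the diagonal to zero is my assumption.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def common_vertex_matrix(adj):
    """Entry [i][j] counts the vertices equidistant from vertices i and j.

    Assumes a connected graph; the diagonal is left at zero (an assumption).
    """
    vertices = sorted(adj)
    dist = {v: bfs_distances(adj, v) for v in vertices}
    size = len(vertices)
    matrix = [[0] * size for _ in range(size)]
    for i, u in enumerate(vertices):
        for j, v in enumerate(vertices):
            if i != j:
                matrix[i][j] = sum(1 for w in vertices if dist[u][w] == dist[v][w])
    return matrix


def row_sums(matrix):
    """Row sums, which act as the per-vertex 'centrality' values."""
    return [sum(row) for row in matrix]
```

Neither member of the pair can be counted, since each is at distance zero from itself and further from the other.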

Tutte's Twist Operation on Cubic Graphs

There is an interesting book by W. T. Tutte called 'Graph Theory As I Have Known It' which is a cross between a normal mathematical text and a biography. So it's a description of the areas he was interested in, and his theorems. One thing that interested me was the use of a 'twist' operation on cubic graphs like so: Where for the edge between vertices x and y labelled 'A' we reconnect the surrounding edges to form the arrangement on the right hand side. So detach edge D from y and connect it to x, and vice versa with edge C. The lower part of the picture shows what happens for a loop-edge - it transforms to a multi-edge. This operation is used on a family of 'base' graphs looking like this: where the first in the list is a vertexless loop graph - that is, it has no vertices and a single edge. From these base graphs, the twist operation can form any cubic graph. Note that all of U_n are cubic with 2n vertices. For example, from U_3 we
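One way to read the twist as an operation on a named edge list is sketched below. The figure is not reproduced here, so the roles of the edges are an assumption taken from the sentence above: A joins x and y, C is another edge at x, D is another edge at y, and the twist moves one end of C to y and one end of D to x. The function names and the dictionary representation are made up for the illustration.

```python
from collections import Counter


def move_end(edge, old, new):
    """Re-attach one end of an edge from vertex old to vertex new."""
    u, v = edge
    if u == old:
        return (new, v)
    if v == old:
        return (u, new)
    raise ValueError("edge %r is not incident to %r" % (edge, old))


def twist(edges, a, c, d):
    """Twist about edge a = (x, y): edge c leaves x for y, edge d leaves y for x.

    edges maps an edge name to a pair of end vertices (multigraph, loops allowed).
    Returns a new dictionary; the input is not modified.
    """
    x, y = edges[a]
    twisted = dict(edges)
    twisted[c] = move_end(edges[c], x, y)
    twisted[d] = move_end(edges[d], y, x)
    return twisted


def degrees(edges):
    """Vertex degrees of a multigraph; a loop contributes two to its vertex."""
    count = Counter()
    for u, v in edges.values():
        count[u] += 1
        count[v] += 1
    return dict(count)
```

Under this reading x gives up C and gains D while y gives up D and gains C, so every degree is unchanged and a cubic graph stays cubic, although the result may contain multi-edges.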

Signatures with user-defined edge colors

A bug in the CDK implementation of my signature library turned out to be due to the fact that the bond colors were hard coded to just recognise the labels {"-", "=", "#"}. The relevant code section even had an XXX above it! Poor show, but it's finally fixed now. So that means I can handle user-defined edge colors/labels - consider the complete graph (K5) below: So the red/blue colors here are simply those of a chessboard imposed on top of the adjacency matrix - shown here on the right. You might expect there to be at least two vertex signature classes here: {0, 2, 4} and {1, 3} where the first class has vertices with two blue and two red edges, and the second has three blue and one red (each vertex of K5 has only four edges). Indeed, here's what happens for K4 to K7: Clearly even-numbered complete graphs have just one vertex class, while odd-numbered ones have two (at least?). There is a similar situation for complete bipartite graphs: Although I haven't explored
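The chessboard colouring is easy to check by hand or with a few lines of Python. The sketch below is not the signature library: it only counts the edge colours at each vertex, which is a much weaker invariant than a signature, and it assumes that the colour of edge (i, j) is the parity of i + j. The name `chessboard_classes` is hypothetical.

```python
from collections import defaultdict


def chessboard_classes(n):
    """Group the vertices of K_n by how many edges of each colour they have.

    Edge (i, j) gets one colour when i + j is even and the other when it is
    odd, as on a chessboard laid over the adjacency matrix.
    """
    classes = defaultdict(list)
    for i in range(n):
        even = sum(1 for j in range(n) if j != i and (i + j) % 2 == 0)
        odd = (n - 1) - even
        classes[(even, odd)].append(i)
    return sorted(classes.values())


if __name__ == "__main__":
    for n in range(4, 8):
        print(n, chessboard_classes(n))
```

Even this crude count separates {0, 2, 4} from {1, 3} in K5 and gives a single class for K4 and K6, in line with the pattern described above.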

Visualising Ring Equivalence Classes in Jmol

As promised (in the previous post) I've now made Jmol scripts to show the atom/ring equivalence classes. I still think that the ring ones are clearer, but I suppose it depends on what aspect of the symmetry of the structure is needed. As an example: Shown here is a C70 structure, with coloured circular plates at the centre of each face. It should be clear that there is an axis of symmetry running through the middle, from one blue plate to the other. Around the blue is a ring of green, and 5 rings in between. The slight difficulty in all this was working out the ring equivalence classes. There is an existing CDK method to do this - in the SSSR ring finder - but it seems to give too many classes. The way I did it was to first find atom equivalence classes (or 'orbits') using signatures. Then each ring is a circular list of the orbit indices: which I'm going to call a 'ring code'. See this image for illustration: These two rings (A and B) have the s
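Comparing ring codes needs some canonical form for a circular list. A minimal sketch, assuming that two ring codes should be treated as equal when one can be rotated or reversed into the other (whether reversal was allowed in the original code is not stated), and with hypothetical names throughout:

```python
from collections import defaultdict


def canonical_ring_code(code):
    """Smallest form of a circular list under rotation and reversal."""
    code = tuple(code)
    n = len(code)
    candidates = []
    for seq in (code, code[::-1]):
        for i in range(n):
            candidates.append(seq[i:] + seq[:i])
    return min(candidates)


def ring_classes(ring_codes):
    """Group ring names by canonical ring code.

    ring_codes maps a ring name to its circular list of atom orbit indices.
    """
    groups = defaultdict(list)
    for name, code in ring_codes.items():
        groups[canonical_ring_code(code)].append(name)
    return sorted(sorted(names) for names in groups.values())
```

Rings that land in the same group are candidates for the same equivalence class; equal codes on their own may not be enough to prove that two rings are equivalent.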

Blowing Carbon Bubbles : Expanding 2D Fullerene Layouts to 3D

The concentric face layout code is working well enough now to handle the larger fullerenes - such as that old favourite, C60. Since coloring the vertices by equivalence class is not always terribly informative, here is a view of the ring equivalence classes: Where C60 is on the left, and a more colourful C70-D5h is on the right. One difficulty, however, is to understand the symmetries of these structures when they are distorted like this. The further away from the center of the layout, the more stretched the rings become. So, an obvious next step was to 'blow up' these 2D layouts into 3D. It turns out that this is possible, with a combination of inverse stereographic projection and Jmol's minimize command. The first step is necessary since minimizing the 2D coordinates (with a z-coord of zero) just shrinks the diagram down in the plane. Here are before and after shots of these steps: Clearly the inverse-projection does not give very good 3D positions for the
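The inverse stereographic projection itself is a short formula. As a sketch, assuming the layout lies in the plane z = 0 and is projected onto the unit sphere centred at the origin from its north pole (0, 0, 1); the function name and the choice of sphere are assumptions, not details taken from the post:

```python
import math


def inverse_stereographic(x, y):
    """Map a point of the plane z = 0 onto the unit sphere.

    The projection is from the north pole (0, 0, 1): the origin of the plane
    goes to the south pole and far-away points approach the north pole.
    """
    s = x * x + y * y
    d = 1.0 + s
    return (2.0 * x / d, 2.0 * y / d, (s - 1.0) / d)


if __name__ == "__main__":
    px, py, pz = inverse_stereographic(3.0, 4.0)
    print(px, py, pz, math.sqrt(px * px + py * py + pz * pz))
```

With this convention the layout would need scaling first so that its middle layers sit near the unit circle; otherwise all the atoms end up crowded around one pole. The resulting coordinates are only a starting point for a minimizer.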

Fullerene Layout with Spokes and Arches

Having tried (and failed) to lay out fullerene structures using various optimisation methods, I thought I would try direct positioning of the atoms. In other words, 'logical' placement rather than 'physics' based layout. For example: These are two regular fullerenes that work very well. The algorithm is simple in principle: 1) Given a planar embedding G, calculate the inner dual id(G) and the 'face layers'. 2) The innermost layer is the 'core' which is one of: a single vertex, a connected pair, or a cycle. 3) Lay out the core, and then each layer outwards, by spoke and arch. So, to explain some of this; a 'face layer' is a set of faces all at the same distance from the outer cycle, measured by graph distance on id(G). So the faces adjacent to the outer cycle are the first layer, and the second layer is adjacent to that, and so on. This is roughly illustrated here: The concentric circles represent the layers of faces, with th
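The face layers in step 1 amount to a breadth-first search on the inner dual, started from all the faces that touch the outer cycle at once. A minimal sketch, assuming the inner dual is already available as an adjacency dictionary and that the faces next to the outer cycle are known; `face_layers`, `core_faces` and both inputs are hypothetical names, and the spoke-and-arch placement is not covered:

```python
from collections import deque


def face_layers(inner_dual, outer_faces):
    """Assign each face its layer number by multi-source BFS.

    inner_dual maps a face to the faces it shares an edge with.
    outer_faces lists the faces adjacent to the outer cycle (layer 1).
    """
    layer = {face: 1 for face in outer_faces}
    queue = deque(outer_faces)
    while queue:
        face = queue.popleft()
        for neighbour in inner_dual[face]:
            if neighbour not in layer:
                layer[neighbour] = layer[face] + 1
                queue.append(neighbour)
    return layer


def core_faces(layer):
    """Faces in the innermost (highest-numbered) layer."""
    deepest = max(layer.values())
    return sorted(face for face, depth in layer.items() if depth == deepest)
```

The core returned by `core_faces` is then what step 2 classifies as a single vertex, a connected pair, or a cycle of id(G).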