<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-123313693388384663</id><updated>2012-02-03T13:47:44.827Z</updated><category term='structure generation'/><category term='Anything'/><category term='bioclipse'/><category term='annealing'/><category term='taverna'/><category term='whiteboard'/><category term='signatures'/><category term='isomerspaces'/><category term='Seneca'/><category term='NMR spectrum comparison'/><category term='scientific publications'/><category term='pubchem'/><category term='molgen'/><category term='UML'/><category term='CML'/><category term='group theory'/><category term='JChemPaint'/><category term='ChEBI'/><category term='CASE'/><category term='JCP'/><category term='beans'/><category term='Medea'/><category term='JUNG'/><category term='cdkws2009'/><category term='cdk'/><category term='eclipse'/><category term='2D molecule layout'/><category term='dependency analysis'/><category term='classcycle'/><category term='polyhedra'/><title type='text'>Some Stuff</title><subtitle type='html'>An Online Research Notebook</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default?start-index=101&amp;max-results=100'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>129</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5021467741700127411</id><published>2011-08-31T17:38:00.005+01:00</published><updated>2011-08-31T18:26:26.336+01:00</updated><title type='text'>McKay's canonical augmentation method explained for simple graphs</title><content type='html'>&lt;p&gt;The &lt;a href="http://gilleain.blogspot.com/2011/08/generating-chessboards-with-k.html"&gt;previous post&lt;/a&gt; talked about generating one type of combinatorial object (chessboards) using a method similar to that outlined by Brendan McKay in a paper called "&lt;i&gt;&lt;a href="http://cs.anu.edu.au/~bdm/publications.html"&gt;Isomorph-free exhaustive generation&lt;/a&gt;&lt;/i&gt;" (J Algorithms, 26 (1998) 306-324.). This one will focus instead on simple graphs, which requires both parts of the method.&lt;/p&gt;&lt;p&gt;The &lt;i&gt;canonical construction&lt;/i&gt; (or &lt;i&gt;canonical augmentation&lt;/i&gt;) method has two components. Firstly, only one 'expansion' of a graph is tried at each step from the set of equivalent expansions. 
Secondly, the expansions are checked to see if they are the inverse of a 'canonical deletion' for that graph.&lt;/p&gt;&lt;p&gt;For an example of the first rule, consider this set of expansions of a 4-vertex graph on the left:&lt;/p&gt;&lt;a href="http://4.bp.blogspot.com/-n2xvoYboQQ0/Tl5nhd6iGCI/AAAAAAAAAc8/0Ehf6v9xsIU/s1600/orbit_reps_paw.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 230px;" src="http://4.bp.blogspot.com/-n2xvoYboQQ0/Tl5nhd6iGCI/AAAAAAAAAc8/0Ehf6v9xsIU/s320/orbit_reps_paw.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5647064807432656930" /&gt;&lt;/a&gt;Each of the 5-vertex graphs on the right is shown with the newly added vertex and edges in red; the arrows are labelled by the added edge set - so {1:4, 3:4} means edges added from 1 to 4 and 3 to 4. The sets of vertices to add to - {{0}, {1}, {1,3}} - are representatives of &lt;a href="http://en.wikipedia.org/wiki/Orbit_(group_theory)#Orbits_and_stabilizers"&gt;the orbit&lt;/a&gt; of these vertices. For example, the orbit of {1} in G(4) is {1, 2} as these two vertices &lt;a href="http://en.wikipedia.org/wiki/Equivalence_class"&gt;are equivalent&lt;/a&gt; in G(4) on the left.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is now quite similar to the situation &lt;a href="http://gilleain.blogspot.com/2011/08/generating-chessboards-with-k.html"&gt;with chessboards&lt;/a&gt; : trying only minimal orbit representatives for extending an object. In McKay's paper, the process of generating child objects is split into 'upper' and 'lower' objects. An upper object is a pair &amp;lt;X, W&amp;gt; where X is (say) a graph, and W is a set of vertices to connect to a new vertex. A lower object is a pair &amp;lt;X, v&amp;gt; where v is a vertex to delete. 
This is illustrated here:&lt;/div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-fviW4n5Lnuk/Tl5qPE9OWnI/AAAAAAAAAdE/xbs2Z3Io-Zo/s1600/mckay_central_objects.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 238px;" src="http://3.bp.blogspot.com/-fviW4n5Lnuk/Tl5qPE9OWnI/AAAAAAAAAdE/xbs2Z3Io-Zo/s320/mckay_central_objects.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5647067790030297714" /&gt;&lt;/a&gt;&lt;br /&gt;Click for bigger, as usual. There is a function shown between a lower object for X' and an upper object for X. This is the 'deletion' function, and its inverse is the important one : f&lt;sup&gt;-1&lt;/sup&gt;, the function that adds a new vertex by connecting it to all the vertices in W.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This process will generate isomorphic graphs, so there has to be a way to reject children that are not canonical. This is where the second part comes in ... unfortunately it is harder to describe.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Roughly, we need to check that the newly added vertex is the one that should have been added if it was canonical. To verify this, the child graph is canonically labelled (eg : &lt;a href="http://gilleain.blogspot.com/2010/05/stuck-detailed-description.html"&gt;see this post&lt;/a&gt;, or possibly &lt;a href="http://gilleain.blogspot.com/2008/10/on-canonical-numberings.html"&gt;this one&lt;/a&gt;) and then the code checks if the added vertex (under the canonical labelling) is in the same orbit as the last one. Kind of.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The upshot is that &lt;a href="https://github.com/gilleain/mathgraphs"&gt;this code&lt;/a&gt; now produces results very similar to &lt;a href="http://cs.anu.edu.au/~bdm/nauty/"&gt;nauty&lt;/a&gt; (geng) for graphs up to 8-12 vertices. 
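&lt;/div&gt;&lt;div&gt;As a rough illustration of the two checks described above, here is a minimal Python sketch that builds small simple graphs one vertex at a time. It is not the linked code and it is not nauty: all function names are invented for this example, and both the automorphism group and the canonical labelling are found by brute force over every vertex permutation, so it is only practical for about five or six vertices.&lt;/div&gt;&lt;pre&gt;
```python
from itertools import combinations, permutations

def relabel(edges, p):
    """Apply vertex permutation p to a graph stored as a sorted tuple of edges."""
    return tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))

def automorphisms(n, edges):
    """Brute force: every permutation that maps the edge set onto itself."""
    return [p for p in permutations(range(n)) if relabel(edges, p) == edges]

def accepts_new_vertex(n, edges):
    """Second check: is the newest vertex (n - 1) in the same orbit as the
    vertex that a canonical labelling sends to the last position?"""
    perms = list(permutations(range(n)))
    best = min(relabel(edges, p) for p in perms)  # brute-force canonical form
    return any(p[n - 1] == n - 1 for p in perms if relabel(edges, p) == best)

def children(n, edges):
    """Extend a graph on n vertices by one vertex joined to a set W."""
    auts = automorphisms(n, edges)
    for k in range(n + 1):
        for w in combinations(range(n), k):
            # First check: only the minimal representative of each orbit of W.
            if w != min(tuple(sorted(a[v] for v in w)) for a in auts):
                continue
            child = tuple(sorted(edges + tuple((v, n) for v in w)))
            if accepts_new_vertex(n + 1, child):
                yield child

def generate(max_n):
    """Count the graphs produced at each vertex number from 1 to max_n."""
    level = [()]  # the single graph on one vertex, with no edges
    counts = [1]
    for n in range(1, max_n):
        level = [c for g in level for c in children(n, g)]
        counts.append(len(level))
    return counts

print(generate(5))
```
&lt;/pre&gt;&lt;div&gt;The counts can be compared with the known numbers of simple graphs on 1 to 5 vertices (1, 2, 4, 11, 34). No list of previously generated graphs is kept, as each child is accepted or rejected on its own.&lt;/div&gt;&lt;div&gt;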
For the larger numbers, I started to restrict the maximum degree, to shorten the runtime. It's definitely not as fast as nauty, but not too bad. I still have the lingering suspicion that I might start missing graphs for larger spaces, but it's not bad, not bad at all...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5021467741700127411?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5021467741700127411/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5021467741700127411' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5021467741700127411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5021467741700127411'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/08/mckays-canonical-augmentation-method.html' title='McKay&apos;s canonical augmentation method explained for simple graphs'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-n2xvoYboQQ0/Tl5nhd6iGCI/AAAAAAAAAc8/0Ehf6v9xsIU/s72-c/orbit_reps_paw.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6101752394012196009</id><published>2011-08-04T11:02:00.007+01:00</published><updated>2011-08-04T13:33:33.696+01:00</updated><title type='text'>Generating Chessboards With 
K-Independant Vertex Sets</title><content type='html'>After looking at &lt;a href="http://juliopeironcely.com/archives/poster-for-the-9th-international-conference-on-chemical-structures"&gt;Julio Peironcely's poster&lt;/a&gt; on generating chemical structures, where he describes using Canonical Path Augmentation (I think due to &lt;a href="http://cs.anu.edu.au/~bdm/"&gt;Brendan McKay&lt;/a&gt;) I went looking for more about it. One thing I found was &lt;a href="http://www.liga.ens.fr/~dutour/Presentations/CombEnumerationExpand.pdf"&gt;this talk/slideshow&lt;/a&gt; by Mathieu Dutour Sikirić - incidentally coauthor of a nice &lt;a href="http://www.cambridge.org/gb/knowledge/isbn/item1174335/?site_locale=en_GB"&gt;book on Chemical Graphs&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway; that's the context. Now : about chessboards? Well, one of the examples given for augmentation (or 'orderly') schemes is that of &lt;a href="http://en.wikipedia.org/wiki/Independent_set_(graph_theory)"&gt;independent vertex sets&lt;/a&gt;. I'm not sure what made me think of chessboards for this, but I think it's a fairly standard simple toy example. Let me show an example for 3x3 boards:&lt;/div&gt;&lt;div&gt;&lt;a href="http://1.bp.blogspot.com/-ZNItQ6PvuT8/Tjpy7W1vEcI/AAAAAAAAAck/j4WM45AGkv0/s1600/three_by_three.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 259px;" src="http://1.bp.blogspot.com/-ZNItQ6PvuT8/Tjpy7W1vEcI/AAAAAAAAAck/j4WM45AGkv0/s320/three_by_three.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5636944247676408258" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, these are all the three by three boards where no two black squares share an edge. In other words, if the board is considered as a grid-shaped graph, then these are the k-independent vertex sets. 
So the question is : how to generate these?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The simple way, of course, is just to fill in every square and eliminate those boards that have pairs of black squares across an edge. This is the 'brute force' approach, and scales badly : there are 2&lt;sup&gt;9&lt;/sup&gt; boards but we only want 20 of these, or just under 4%.  For 4x4 boards, this fraction is even smaller - 131  of 2&lt;sup&gt;16&lt;/sup&gt; boards which is only 0.2%. This is 0.0026% for 5x5 boards.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, the number of boards increases rapidly as the size increases and any way of decreasing this large search space is essential. The approach outlined in  Mathieu's talk is quite simple : only try sets (boards) that are the minimal representative in their orbit. Ok, so maybe that doesn't sound so simple :) but look at this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-_G_24B2dDqY/TjqKxkTYP9I/AAAAAAAAAcs/RN4_lx7XAtg/s1600/cell_orbits.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 181px;" src="http://3.bp.blogspot.com/-_G_24B2dDqY/TjqKxkTYP9I/AAAAAAAAAcs/RN4_lx7XAtg/s320/cell_orbits.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5636970467770777554" /&gt;&lt;/a&gt;Here are 3x3 boards, with a numbering (any one will do), the equivalence classes of the cells, and a set of orbits. Let's say we've just generated {0, 4, 6} : how do we check if it is the minimal representative? Well, so long as we have the automorphism group of the board we just apply each permutation in the group to the set of numbers and check to see if any are smaller. In this case, {0, 2, 4} is smaller, so we don't consider {0, 4, 6}.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's really as simple as that. 
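&lt;/div&gt;&lt;div&gt;The minimal-representative check can be sketched in a few lines of Python. This is only an illustration, not the linked chessboards code: it assumes the cells are numbered row by row from 0 to 8, builds the eight symmetries of the square board directly rather than computing an automorphism group, and the function names are made up for the example.&lt;/div&gt;&lt;pre&gt;
```python
from itertools import product

def symmetries(n):
    """The 8 rotations and reflections of an n-by-n board, each as a tuple
    that maps a row-major cell number to its image."""
    def rot(r, c):   # quarter turn
        return c, n - 1 - r
    def flip(r, c):  # mirror left-right
        return r, n - 1 - c
    perms = []
    for flipped in (False, True):
        for turns in range(4):
            perm = []
            for r, c in product(range(n), repeat=2):
                if flipped:
                    r, c = flip(r, c)
                for _ in range(turns):
                    r, c = rot(r, c)
                perm.append(r * n + c)
            perms.append(tuple(perm))
    return perms

def minimal_image(cells, perms):
    """Smallest (in sorted-tuple order) image of a cell set under the group."""
    return min(tuple(sorted(p[i] for i in cells)) for p in perms)

def is_minimal(cells, perms):
    """True if the set is already the minimal representative of its orbit."""
    return tuple(sorted(cells)) == minimal_image(cells, perms)

group = symmetries(3)
print(minimal_image({0, 4, 6}, group))  # (0, 2, 4)
print(is_minimal({0, 4, 6}, group))     # False, so this set is skipped
print(is_minimal({0, 2, 4}, group))     # True
```
&lt;/pre&gt;&lt;div&gt;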
Of course, we have to check each newly added cell to make sure it is not adjacent, but that can be done without consideration of previously generated boards. This means we don't have to store solutions, which means it can be done in parallel without communication. See how many there are just for 5x5 boards:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-WT7lnAE0mjM/TjqPceObMOI/AAAAAAAAAc0/wKdopV14-44/s1600/five_by_five.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 256px;" src="http://1.bp.blogspot.com/-WT7lnAE0mjM/TjqPceObMOI/AAAAAAAAAc0/wKdopV14-44/s320/five_by_five.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5636975602920272098" /&gt;&lt;/a&gt;I suppose that one last question about all this is : so what? Does this have anything to do with chemistry? Well, actually, it does. Consider &lt;a href="http://gilleain.blogspot.com/2011/04/colored-tree-paths-to-represent-bond.html"&gt;the bond order assignment problem&lt;/a&gt; : the second image shows all the possible assignments, which you may notice has duplicates (3, 5, 10, and 12 for example). Also consider &lt;a href="http://gilleain.blogspot.com/2010/08/line-graphs-and-double-bonding-systems.html"&gt;the line graph approach to double bond systems&lt;/a&gt; where again the second image shows a pair of colorings of line graphs. 
In fact, these are k-independent sets...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Oh, and there &lt;a href="https://github.com/gilleain/chessboards"&gt;is code here&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6101752394012196009?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6101752394012196009/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6101752394012196009' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6101752394012196009'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6101752394012196009'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/08/generating-chessboards-with-k.html' title='Generating Chessboards With K-Independant Vertex Sets'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-ZNItQ6PvuT8/Tjpy7W1vEcI/AAAAAAAAAck/j4WM45AGkv0/s72-c/three_by_three.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4560471374827286649</id><published>2011-06-12T21:48:00.004+01:00</published><updated>2011-06-13T22:57:35.613+01:00</updated><title type='text'>Ring Plate Visualisation</title><content type='html'>One diagram I've often wanted to make was filled-in rings in molecules 
:&lt;div&gt;&lt;a href="http://1.bp.blogspot.com/-L-BKQXv84E0/TfUmTk53b3I/AAAAAAAAAb8/5QDzHu4dRdA/s1600/steran.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://1.bp.blogspot.com/-L-BKQXv84E0/TfUmTk53b3I/AAAAAAAAAb8/5QDzHu4dRdA/s320/steran.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5617438227980316530" /&gt;&lt;/a&gt;&lt;/div&gt;Mainly for the purposes of highlighting rings without highlighting the atoms involved. This image was made using the CDK renderbasic module, and a small toy AWTRenderingVisitor that fills in paths. I'm not sure if the current one does this...&lt;br /&gt;&lt;br /&gt;The gist for the main drawing method &lt;a href="https://gist.github.com/1021964"&gt;is here&lt;/a&gt; but is really just stuff seen before. The custom generator for rings is probably more interesting - if very simple - and &lt;a href="https://gist.github.com/1021965"&gt;is here&lt;/a&gt;. 
Note that it doesn't look too nice with inner-ring double bonds:&lt;div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-X-38D9l8zo0/TfUnU_GUwjI/AAAAAAAAAcE/rua8GJhGyxQ/s1600/peb.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://3.bp.blogspot.com/-X-38D9l8zo0/TfUnU_GUwjI/AAAAAAAAAcE/rua8GJhGyxQ/s320/peb.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5617439351703388722" /&gt;&lt;/a&gt;and would look nicer if the double bonds were symmetric.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;EDIT : Coloring by ring equivalence classes didn't do what I expected...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-pS5-zm2pZHo/TfaHYmyqtoI/AAAAAAAAAcM/Q5jpHt3WqvA/s1600/coronene.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://1.bp.blogspot.com/-pS5-zm2pZHo/TfaHYmyqtoI/AAAAAAAAAcM/Q5jpHt3WqvA/s320/coronene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5617826441990944386" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Shouldn't all the outer rings be in the same class? 
Steran is how I expect, though:&lt;/div&gt;&lt;a href="http://3.bp.blogspot.com/-RbHih7JMNBs/TfaHrkq0pGI/AAAAAAAAAcU/JntepHhYKWg/s1600/steran.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://3.bp.blogspot.com/-RbHih7JMNBs/TfaHrkq0pGI/AAAAAAAAAcU/JntepHhYKWg/s320/steran.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5617826767838684258" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4560471374827286649?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4560471374827286649/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4560471374827286649' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4560471374827286649'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4560471374827286649'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/06/ring-plate-visualisation.html' title='Ring Plate Visualisation'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-L-BKQXv84E0/TfUmTk53b3I/AAAAAAAAAb8/5QDzHu4dRdA/s72-c/steran.png' height='72' 
width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-257739241711544084</id><published>2011-05-29T15:01:00.005+01:00</published><updated>2011-05-29T16:24:59.603+01:00</updated><title type='text'>External Symmetry Numbers and Graph Automorphism Groups</title><content type='html'>So, there was a question on &lt;a href="http://biostar.stackexchange.com/questions/8208/open-source-molecular-symmetry-perception-tools/"&gt;BioStar&lt;/a&gt; about calculating the 'external symmetry number' of a molecule - something I hadn't heard of, but turns out to be something like the subgroup of rotations and reflections of the automorphism group of a graph. Since I have some code to calculate the automorphism group, I naïvely thought it would be simple...&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The questioner - &lt;a href="http://biostar.stackexchange.com/users/1870/nick-vandewiele"&gt;Nick Vandewiele&lt;/a&gt; - kindly provided some test cases, which ended up as &lt;a href="https://github.com/gilleain/cdk_signature/blob/9c19bca7c52b11f140abdaa09af1cdb469d4571e/src/test_group/AutomorphismTest.java"&gt;this code&lt;/a&gt;. Although many of these tests now pass, they only do so because I commented out the hydrogen adding! :)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On the one hand, there are some recent improvements that try to handle vertex and edge 'colors' - in other words, element symbols and bond orders. 
For example, consider the improbable molecule C1OCO1 :&lt;/div&gt;&lt;div&gt;&lt;a href="http://3.bp.blogspot.com/-hRPSnfLrl3E/TeJWG6jXWoI/AAAAAAAAAbg/ZPeOOS2rPeM/s1600/coco_symmetries.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 214px;" src="http://3.bp.blogspot.com/-hRPSnfLrl3E/TeJWG6jXWoI/AAAAAAAAAbg/ZPeOOS2rPeM/s320/coco_symmetries.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5612142762453850754" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;These are the three permutations that leave the carbons and the oxygens in the same positions; when you include the identity, that makes 4. Cyclobutane (without hydrogens!) has a symmetry group of order 8. Similarly, cyclobutadiene now gives 4 instead of 8.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So what goes wrong when there are hydrogens? Well, it's a deeper problem than just hydrogens, but it starts there. Consider methane : it has an external symmetry number of 12, but my code gives 24 - why? Well the main answer is 'inversion', look:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-dovpHej_TAc/TeJdTd7Kn7I/AAAAAAAAAbo/faPQgr7c-9s/s1600/methane_inversion.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 119px;" src="http://4.bp.blogspot.com/-dovpHej_TAc/TeJdTd7Kn7I/AAAAAAAAAbo/faPQgr7c-9s/s320/methane_inversion.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5612150674688745394" /&gt;&lt;/a&gt;The permutation (0)(1)(2, 3)(4) just swaps hydrogens 2 and 3. This effectively changes the chirality of the molecule ... sort of. It's not actually chiral, but it's a reasonable description of the transformation. 
Apparently, this &lt;i&gt;does&lt;/i&gt; happen (another thing I didn't know; there are lots more :) according to &lt;a href="http://physics.nist.gov/Pubs/Methane/chap03.html"&gt;this document&lt;/a&gt;, but quite slowly compared to rotations - "slower than 1 cycle s&lt;sup&gt;-1&lt;/sup&gt;".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This kind of &lt;i&gt;pseudo-&lt;/i&gt;chirality will happen at any tetrahedral center. Or at any atom with 4 neighbours, I think - like XeF4, which is square planar. As an example, take this &lt;i&gt;spira-&lt;/i&gt;fused ring system:&lt;/div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-v4ELk7JJDVI/TeJiZ5571OI/AAAAAAAAAbw/sPHCixQMjLc/s1600/spira_chiral_transform.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 201px;" src="http://3.bp.blogspot.com/-v4ELk7JJDVI/TeJiZ5571OI/AAAAAAAAAbw/sPHCixQMjLc/s320/spira_chiral_transform.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5612156282837128418" /&gt;&lt;/a&gt;with a transform that swaps 7 and 9 but not the pairs (0, 5)(1, 4)(2, 3). Effectively this changes the parity at carbon 6. Somehow I doubt that this kind of 'movement' actually occurs in solution, but I could well be wrong. In any case, it seems likely that the external symmetry number is 2, and not 4.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In summary, it is probably not possible to calculate the external symmetry number correctly without 3D coordinates, or symmetry axes, or point groups. 
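&lt;/div&gt;&lt;div&gt;The group orders quoted above can be reproduced with a small brute-force Python sketch. It is not the cdk_signature code: the atom numbering, the bond dictionaries and the function names are assumptions made for this example. The parity filter at the end only stands in for 'proper rotations' in the special case of methane, where the rotations of the tetrahedron are exactly the even permutations of the four hydrogens; it is not a general way to get an external symmetry number.&lt;/div&gt;&lt;pre&gt;
```python
from itertools import combinations, permutations

def automorphisms(colors, bonds):
    """All vertex permutations that preserve atom colors and bond orders.
    colors is a string of element symbols, bonds maps (a, b) to an order."""
    n = len(colors)
    target = {frozenset(pair): order for pair, order in bonds.items()}
    found = []
    for p in permutations(range(n)):
        if any(colors[p[i]] != colors[i] for i in range(n)):
            continue
        image = {frozenset((p[a], p[b])): order
                 for (a, b), order in bonds.items()}
        if image == target:
            found.append(p)
    return found

def is_even(p):
    """True if the permutation has an even number of inversions."""
    inversions = sum(1 for i, j in combinations(range(len(p)), 2)
                     if p[i] > p[j])
    return inversions % 2 == 0

ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
coco = automorphisms("COCO", {b: 1 for b in ring})
cyclobutane = automorphisms("CCCC", {b: 1 for b in ring})
cyclobutadiene = automorphisms("CCCC", dict(zip(ring, [2, 1, 2, 1])))
methane = automorphisms("CHHHH", {(0, h): 1 for h in range(1, 5)})

print(len(coco), len(cyclobutane), len(cyclobutadiene))  # 4 8 4
print(len(methane))                                      # 24
print(sum(is_even(p) for p in methane))                  # 12
```
&lt;/pre&gt;&lt;div&gt;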
I have a feeling that the positional info could be recorded as a 3D &lt;a href="http://gilleain.blogspot.com/2011/03/final-post-on-combinatorial-maps-of.html"&gt;combinatorial map&lt;/a&gt; which would give explicit orientations for atoms with four neighbours.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-257739241711544084?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/257739241711544084/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=257739241711544084' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/257739241711544084'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/257739241711544084'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/05/external-symmetry-numbers-and-graph.html' title='External Symmetry Numbers and Graph Automorphism Groups'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-hRPSnfLrl3E/TeJWG6jXWoI/AAAAAAAAAbg/ZPeOOS2rPeM/s72-c/coco_symmetries.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3539563855621726297</id><published>2011-04-10T16:45:00.004+01:00</published><updated>2011-04-10T17:29:04.795+01:00</updated><title type='text'>Colored Tree Paths to Represent Bond Order 
Assignements</title><content type='html'>A couple of different sources pointed me to &lt;a href="http://bioinformatics.oxfordjournals.org/content/27/5/619.short"&gt;a paper&lt;/a&gt; by Dehof et al (Bioinformatics, 2011; doi: 10.1093/bioinformatics/btq718) about bond order assignment by various methods. One of them is the &lt;a href="http://en.wikipedia.org/wiki/A*_search_algorithm"&gt;A*-algorithm&lt;/a&gt;, which finds paths. Not in the molecule graph, but in a kind of tree of possible bond combinations. Precisely, the tree has a layer for each decision, and a leaf for each combination of decisions.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, here is the usual example of a square four-cycle (G):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-VgzmbEinnLQ/TaHY4DL66uI/AAAAAAAAAbQ/ZyeaSJ3oPTk/s1600/bond_tree_cyclobutane.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 238px;" src="http://2.bp.blogspot.com/-VgzmbEinnLQ/TaHY4DL66uI/AAAAAAAAAbQ/ZyeaSJ3oPTk/s320/bond_tree_cyclobutane.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5593990669610445538" /&gt;&lt;/a&gt;with the tree shown below. If we pick a particular path through the tree - say to leaf 5 - we get the sequence of colors in the box in the upper center of the image. This corresponds to the assignment on the upper right. Clearly, it has too many double bonds to be a carbon skeleton, but still.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, it's a nice, simple way to represent the set of solutions. Well, it could be even simpler but representing it as a tree lets the paper authors assign weights to the edges based on the chemical rules outlined in work they reference (Wang et al, 2006). What happens if you don't have any weights? 
Here are all the assignments from the tree:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://3.bp.blogspot.com/-CdCbWTCXNm4/TaHaHQcKb2I/AAAAAAAAAbY/52E3JJ53y_I/s1600/bond_tree_orbits.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 293px;" src="http://3.bp.blogspot.com/-CdCbWTCXNm4/TaHaHQcKb2I/AAAAAAAAAbY/52E3JJ53y_I/s320/bond_tree_orbits.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5593992030377897826" /&gt;&lt;/a&gt;where the assignments are labelled by the leaf number (0-15) and have a letter label (A-F). These letters stand for 'equivalence classes' - really just isomorphism classes. Oh, and the grey vertices have unlikely atom types for carbons. &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3539563855621726297?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3539563855621726297/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3539563855621726297' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3539563855621726297'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3539563855621726297'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/04/colored-tree-paths-to-represent-bond.html' title='Colored Tree Paths to Represent Bond Order Assignements'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-VgzmbEinnLQ/TaHY4DL66uI/AAAAAAAAAbQ/ZyeaSJ3oPTk/s72-c/bond_tree_cyclobutane.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4720170182643089029</id><published>2011-03-24T18:39:00.005Z</published><updated>2011-03-25T19:18:09.203Z</updated><title type='text'>Final post on Combinatorial Maps of Cuneane</title><content type='html'>EDIT : Updated with new image.&lt;br /&gt;Well, unless I can make a more horrifying diagram than this:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-a25Gg3T1pyQ/TYy24yYEXLI/AAAAAAAAAaw/09wNujiqk-o/s1600/cuneane_darts345.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 265px;" src="http://2.bp.blogspot.com/-a25Gg3T1pyQ/TYy24yYEXLI/AAAAAAAAAaw/09wNujiqk-o/s320/cuneane_darts345.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5588042324371594418" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;..but I think that's unlikely. An explanation - if such a thing is possible - is that the red/purple arrows correspond to a kind of inside-out operation. So the triangular map (M3) at the bottom can be converted to the square (M4) by choosing the purple darts (2, 14, 16, 5) as the boundary, and putting everything else on the inside. The cycle (1, 4, 9, 20, 22) is red because it is the boundary of the pentagonal map (M5).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;s&gt;So, it seems like ϕ(M4) is α(ϕ(M5)) - so the cycle (0,8,23,21,5) in ϕ(M4) is (1,9,22,20,4) in ϕ(M5). That is, because (α[0] = 1, α[8] = 9, α[23] = 22 ...). However ϕ(M3) = ϕ(M5), which is confusing. 
Oh well&lt;/s&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So I have updated the image so that all three 'maps' have the same cycles. Which means they have the same ϕ and therefore the same σ. I guess this means I misunderstood what a CM actually is : you can get 'different' embeddings with the same σ. Here is a summary diagram:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/-aaSowV7vk5s/TYy4bqqzf-I/AAAAAAAAAa4/npmT7Utl4Ng/s1600/cuneane_darts345_summary.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 262px;" src="http://4.bp.blogspot.com/-aaSowV7vk5s/TYy4bqqzf-I/AAAAAAAAAa4/npmT7Utl4Ng/s320/cuneane_darts345_summary.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5588044023109746658" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt; Which is just the same diagram, without all the numbers. Colored arrows in the faces show cycles of darts chosen as the bounding faces according to the same colored arrows joining maps. Note that the 4-cycle in CM3 chosen as the bounds for CM4 has to be flipped. So does the 5-cycle in CM3 chosen as the bounds of CM5. 
Perhaps there is still a transformation wrong here somewhere.&lt;div&gt;&lt;br /&gt;ANOTHER EDIT : After flipping CM5, and labelling the cycles (A-F/a-f : clockwise is uppercase, anticlockwise is lowercase) it is better.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/-qD0kxZUw004/TYzplEDZVgI/AAAAAAAAAbI/HGX0l38LS4E/s1600/cuneane_darts345_summary.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 263px;" src="http://4.bp.blogspot.com/-qD0kxZUw004/TYzplEDZVgI/AAAAAAAAAbI/HGX0l38LS4E/s320/cuneane_darts345_summary.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5588098060612359682" /&gt;&lt;/a&gt;The transitions are quite simple, really. Take a clockwise face, and make it anticlockwise (eg:  B-&gt;b, red arrow from CM3 to CM5) and make the anticlockwise outer face clockwise (d -&gt; D). These changes are on the red/purple arrows between maps. 
Under each map is a summary of the cycles, which really just shows that the outer face is anticlockwise and the others are clockwise.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4720170182643089029?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4720170182643089029/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4720170182643089029' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4720170182643089029'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4720170182643089029'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/03/final-post-on-combinatorial-maps-of.html' title='Final post on Combinatorial Maps of Cuneane'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-a25Gg3T1pyQ/TYy24yYEXLI/AAAAAAAAAaw/09wNujiqk-o/s72-c/cuneane_darts345.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2482373731096574683</id><published>2011-03-23T14:36:00.007Z</published><updated>2011-03-23T16:08:50.599Z</updated><title type='text'>Cuneane Maps</title><content type='html'>So, to &lt;a href="http://gilleain.blogspot.com/2011/03/squashing-molecules-flat-and.html"&gt;continue&lt;/a&gt; about combinatorial maps, here are 
some more intricate diagrams. Firstly, an embedding of &lt;a href="http://gilleain.blogspot.com/2010/12/many-faces-of-fused-cycles.html"&gt;cuneane&lt;/a&gt; with a 4-cycle as the outer face:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/-7LALUVgBcUw/TYoF3-M1U8I/AAAAAAAAAaI/IcVx0LzMdV0/s1600/cuneane_outer4_map.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 222px;" src="http://1.bp.blogspot.com/-7LALUVgBcUw/TYoF3-M1U8I/AAAAAAAAAaI/IcVx0LzMdV0/s320/cuneane_outer4_map.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5587284746854290370" /&gt;&lt;/a&gt;&lt;br /&gt;The permutations below are just for reference. Anyway, with this embedding, the cycles of ϕ are (0,8,23,21,5)(1,2,11,7)(3,4,17,15)(6,12,9)(10,14,18,22,13)(16,20,19) which does indeed have one for each face, including the boundary. If you use a different embedding, naturally you get a different map:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-ngL3R1DfFVc/TYoLg4haAoI/AAAAAAAAAaQ/5nqfJUiiUns/s1600/cuneane_outer5_map.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 226px;" src="http://2.bp.blogspot.com/-ngL3R1DfFVc/TYoLg4haAoI/AAAAAAAAAaQ/5nqfJUiiUns/s320/cuneane_outer5_map.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5587290947262743170" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;Which is quite different, and has ϕ of (0,6,10,3)(1,4,20,22,9)(2,14,16,5)(7,8,13)(11,12,23,19,15)(17,18,21) - again, 6 cycles for the 6 rings. And one to rule them all, and in the darkness bind them, of course. 
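&lt;br /&gt;&lt;br /&gt;As a sanity check on the two ϕ listings above, here is a minimal Python sketch (not part of the linked code). It parses the cycle notation, checks that each of the darts 0-23 appears exactly once, and evaluates V - E + F, which should be 2 for a planar embedding. The function names are made up for illustration, and the vertex count of 8 is an assumption based on cuneane having eight carbons.&lt;pre&gt;
```python
import re
from collections import Counter


def parse_cycles(text):
    """Parse cycle notation such as '(0,1,2)(3,4)' into a list of tuples."""
    return [tuple(int(d) for d in body.split(","))
            for body in re.findall(r"\(([^)]*)\)", text)]


# Face permutations quoted in the post for the two embeddings of cuneane.
PHI_OUTER4 = "(0,8,23,21,5)(1,2,11,7)(3,4,17,15)(6,12,9)(10,14,18,22,13)(16,20,19)"
PHI_OUTER5 = "(0,6,10,3)(1,4,20,22,9)(2,14,16,5)(7,8,13)(11,12,23,19,15)(17,18,21)"


def summarize(text, vertex_count=8):
    """Return (is_permutation, euler_characteristic, face_size_histogram).

    vertex_count=8 is an assumption: the eight carbons of cuneane.
    """
    cycles = parse_cycles(text)
    darts = sorted(d for cycle in cycles for d in cycle)
    # Every dart must appear exactly once for the cycles to form a permutation.
    is_permutation = darts == list(range(len(darts)))
    edge_count = len(darts) // 2   # two darts per edge
    face_count = len(cycles)       # one cycle of phi per face
    euler = vertex_count - edge_count + face_count
    face_sizes = Counter(len(cycle) for cycle in cycles)
    return is_permutation, euler, dict(sorted(face_sizes.items()))


for name, phi in (("outer 4-cycle", PHI_OUTER4), ("outer 5-cycle", PHI_OUTER5)):
    print(name, summarize(phi))
```
&lt;/pre&gt;Both listings come out as valid permutations of the 24 darts with V - E + F = 2, and both have two 3-faces, two 4-faces and two 5-faces, so the two maps have the same face sizes and differ in which darts bound each face.&lt;br /&gt;&lt;br /&gt;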
The cycles of darts are shown in this composite image:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/-KsXEgZ5IM_k/TYoawuHGMwI/AAAAAAAAAag/gl-ki1fZld4/s1600/cuneane_darts45.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 138px;" src="http://1.bp.blogspot.com/-KsXEgZ5IM_k/TYoawuHGMwI/AAAAAAAAAag/gl-ki1fZld4/s320/cuneane_darts45.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5587307712020361986" /&gt;&lt;/a&gt;&lt;br /&gt;Some of the triangular faces are missing the dart labels, as they were getting too crowded. Well, the whole thing is too crowded, but still. Highlighted in red in each are the darts corresponding to the outer face of the other embedding. Not sure what it means, though.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Finally, some references:&lt;div&gt;1) doi:10.1.1.149.3832 &lt;a href="http://en.scientificcommons.org/53590813"&gt;citeseer link&lt;/a&gt; - it's a dissertation, not a paper, but interesting.&lt;/div&gt;&lt;div&gt;2) &lt;span class="Apple-style-span"  style=" ;font-family:arial, sans-serif;"&gt;&lt;a href="http://www.springerlink.com/index/D8H12510202L0527.pdf" style="font-family: arial, sans-serif; color: rgb(0, 0, 204); "&gt;Signatures &lt;/a&gt;&lt;a href="http://www.springerlink.com/index/D8H12510202L0527.pdf" style="font-family: arial, sans-serif; color: rgb(0, 0, 204); "&gt;of &lt;/a&gt;&lt;a href="http://www.springerlink.com/index/D8H12510202L0527.pdf" style="font-family: arial, sans-serif; color: rgb(0, 0, 204); "&gt;combinatorial maps&lt;/a&gt; (direct link to pdf) &lt;/span&gt;S Gosselin, G Damiand, and Christine Solnon. 
Damiand is the author of some of the images on wikipedia about combinatorial maps...&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2482373731096574683?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2482373731096574683/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2482373731096574683' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2482373731096574683'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2482373731096574683'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/03/cuneane-maps.html' title='Cuneane Maps'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-7LALUVgBcUw/TYoF3-M1U8I/AAAAAAAAAaI/IcVx0LzMdV0/s72-c/cuneane_outer4_map.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2583432373521706732</id><published>2011-03-20T19:02:00.009Z</published><updated>2011-03-20T21:48:56.113Z</updated><title type='text'>Squashing Molecules Flat and Combinatorial Maps</title><content type='html'>So the traditional way to say it is probably 'planar embedding', but I've gone for the alternate terminology of 'squashing flat'. 
For many molecules, this is a fairly easy process : some decisions have to be made about rotatable bonds, but quite a few drugs are just things sticking off a benzene ring or two.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For fully 3-dimensional molecular graphs (&lt;a href="http://gilleain.blogspot.com/2010/12/many-faces-of-fused-cycles.html"&gt;for example&lt;/a&gt;) it is trickier. I don't yet know a simple algorithm for choosing different possible embeddings ... er ... squashings. Perhaps they are all complicated; they seem to involve tree data structures with names like "SPQR Tree" (see "Optimizing over all combinatorial embeddings of a planar graph").&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Far easier, though, is the intermediate data structure between a graph and some 2D coordinates - known as a &lt;a href="http://en.wikipedia.org/wiki/Combinatorial_map"&gt;combinatorial map&lt;/a&gt;. These mathematical objects store the detail of the embedding by recording the order of 'half bonds' called flags (or darts?) around a vertex. Maybe a flag is the vertex and the half-bond. Anyway, here is a picture of K4 (or a squashed tetrahedrane skeleton):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/-JlfQaO7ow0g/TYZqEzngybI/AAAAAAAAAZo/Z7Dh92nYXjg/s1600/k4_map.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 209px;" src="http://4.bp.blogspot.com/-JlfQaO7ow0g/TYZqEzngybI/AAAAAAAAAZo/Z7Dh92nYXjg/s320/k4_map.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5586269018607634866" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;Now the left hand side just shows the graph with labelled vertices, while the one on the right shows the flags (I'm going to call them that now, sorry). At the bottom is the permutation σ applied to the set of flags (f). 
To understand σ consider just the central vertex (1) : if we travel clockwise round from flag 1, we get the flags (1, 8, 6). In fact, this is how I constructed the permutation : clockwise round all the vertices. It can be helpful to consider σ in &lt;a href="http://en.wikipedia.org/wiki/Cycle_notation"&gt;cycle notation&lt;/a&gt; as (0, 2, 4)(1, 6, 8)(5, 11, 9)(7, 10, 3) which has a cycle for each vertex.&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-96Ixbb44oq4/TYZuDFBddiI/AAAAAAAAAZw/m272xBol4R4/s1600/K4_LR_embedding.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 217px;" src="http://2.bp.blogspot.com/-96Ixbb44oq4/TYZuDFBddiI/AAAAAAAAAZw/m272xBol4R4/s320/K4_LR_embedding.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5586273386966644258" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;This image shows two different embeddings of the same graph. Well, I suppose &lt;i&gt;technically &lt;/i&gt;it is different embeddings of the same labelling. In any case, L and R are flipped : if there wasn't so much symmetry it might be better. The same procedure as before  is used to get σL and σR (clockwise both times). Now check this out:&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/-FvrAdi2G60I/TYZz7LIkc6I/AAAAAAAAAaA/fsbwAoa3NxY/s1600/K4_phi_embedding.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 294px;" src="http://1.bp.blogspot.com/-FvrAdi2G60I/TYZz7LIkc6I/AAAAAAAAAaA/fsbwAoa3NxY/s320/K4_phi_embedding.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5586279848237888418" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;What is this crazy thing? 
Well, the other part of the combinatorial map is an 'involution' called α, which is really just another permutation - one that is its own inverse - that swaps the two flags of each edge. In the images above, I have just ordered the edges using the vertices (as 0:1,0:2,0:3,1:2,1:3,2:3) then ordered the flags by vertex index : that is, α = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10].&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, using the formula φ = σ ⋅ α we get a permutation of what I am actually going to call 'darts'. This is all on the Wikipedia page, but it should be clear that the labelled arrows on the image go in cycles. So there is (0, 6, 3) for example. Indeed φ = (0, 6, 3)(1, 4, 9)(2, 10, 5)(7, 8, 11) which gives - of course - the four faces of the embedding, including the boundary face.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Phew! Fun stuff, and there is &lt;a href="https://github.com/gilleain/stereo/blob/master/src/stereo/CombinatorialMap.java"&gt;code here&lt;/a&gt; as well as some &lt;a href="https://github.com/gilleain/stereo/blob/master/src/test/CombinatorialMapTest.java"&gt;tests here&lt;/a&gt;. 
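&lt;br /&gt;&lt;br /&gt;To make the composition concrete, here is a minimal Python sketch of φ = σ ⋅ α for the K4 map, read as "apply α first, then σ". The array form of α is the one given above. The orientation of the σ cycles is an assumption: it is the one for which the composition reproduces the four face cycles quoted above, and the figure may follow a different convention. The helper names are hypothetical.&lt;pre&gt;
```python
def cycles_to_array(cycles, size):
    """Convert cycle notation into array form, where perm[i] is the image of i."""
    perm = list(range(size))
    for cycle in cycles:
        for position, dart in enumerate(cycle):
            perm[dart] = cycle[(position + 1) % len(cycle)]
    return perm


def array_to_cycles(perm):
    """Convert array form back to cycles, each starting at its smallest dart."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, dart = [], start
        while dart not in seen:
            seen.add(dart)
            cycle.append(dart)
            dart = perm[dart]
        cycles.append(tuple(cycle))
    return cycles


# alpha swaps the two darts of each edge (array form as given in the post).
ALPHA = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10]

# ASSUMPTION: vertex cycles oriented so that the composition below matches
# the face cycles quoted in the post.
SIGMA = cycles_to_array([(0, 4, 2), (1, 6, 8), (5, 9, 11), (7, 3, 10)], 12)

# phi = sigma . alpha : apply alpha first, then sigma.
PHI = [SIGMA[ALPHA[dart]] for dart in range(12)]

print(array_to_cycles(PHI))
```
&lt;/pre&gt;This prints the cycles (0, 6, 3), (1, 4, 9), (2, 10, 5) and (7, 8, 11), one per face. None of it is taken from the linked Java class or its tests. 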
Well more like just System.out statements, but the beginning of tests...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2583432373521706732?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2583432373521706732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2583432373521706732' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2583432373521706732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2583432373521706732'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/03/squashing-molecules-flat-and.html' title='Squashing Molecules Flat and Combinatorial Maps'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-JlfQaO7ow0g/TYZqEzngybI/AAAAAAAAAZo/Z7Dh92nYXjg/s72-c/k4_map.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8423935434634819835</id><published>2011-03-06T21:05:00.002Z</published><updated>2011-03-06T21:42:07.000Z</updated><title type='text'>Further work on PDB hetdict</title><content type='html'>&lt;a href="http://gilleain.blogspot.com/2011/02/atom-typing-hetgroup-dictionary.html"&gt;Previously...&lt;/a&gt; After using setFormalCharge instead of setCharge, some of the &lt;a 
href="http://gilleain.blogspot.com/2011/02/null-atomtypes-in-pdb-het-dictionary.html"&gt;null&lt;/a&gt; atom types disappeared. Specifically, the quaternary (SP3?) nitrogens, that have to have a +1 formal charge.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What is left? Mostly &lt;a href="https://github.com/gilleain/hetdict/blob/master/sorted_nulls_height_1.txt"&gt;FeS clusters, and other metals&lt;/a&gt;. Turns out that the height-1 signature is clearer for 'clustering' the atoms into types. This is because a signature of this height records only the immediate neighbours of the atom.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One fairly frequent example is "[Co]([N][N][N][N])", but fortunately there is &lt;a href="http://sourceforge.net/tracker/?func=detail&amp;amp;aid=3201359&amp;amp;group_id=20024&amp;amp;atid=320024"&gt;a patch for this&lt;/a&gt; based on a bug report. The atom is found in &lt;a href="http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=n/a&amp;amp;template=het2pdb.html&amp;amp;param1=B12&amp;amp;s=2780712&amp;amp;o=HET"&gt;cobalamine&lt;/a&gt; and other cobalt-haemes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Still, there are quite a few 'odd' ligands - either due to unusual elements, &lt;a href="http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=n/a&amp;amp;template=het2pdb.html&amp;amp;param1=IRI&amp;amp;s=12679814&amp;amp;o=OFFSET"&gt;like Iridium&lt;/a&gt;; or unusual coordinations, like &lt;a href="http://www.ebi.ac.uk/msd-srv/msdchem/cgi-bin/cgi.pl?FUNCTION=record&amp;amp;ENTITY=CHEM_COMP&amp;amp;PRIMARYKEY=ALB&amp;amp;PARENTINDEX=-1&amp;amp;APPLICATION=1"&gt;this iron with 6 oxygen neighbours&lt;/a&gt;. Hmmm, PDBeChem's image (at the bottom of the page) doesn't show all six bonds. 
It's clearer in &lt;a href="http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=n/a&amp;amp;template=het2pdb.html&amp;amp;param1=ALB&amp;amp;s=1667610&amp;amp;o=HET"&gt;Jmol from PDBSum&lt;/a&gt;, where the C3 symmetry of the iron ligands makes it seem (to me) that it is 6-coordinate.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Wow, that's a lot of links.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8423935434634819835?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8423935434634819835/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8423935434634819835' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8423935434634819835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8423935434634819835'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/03/further-work-on-pdb-hetdict.html' title='Further work on PDB hetdict'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5750880961391914963</id><published>2011-02-08T20:12:00.003Z</published><updated>2011-02-08T20:20:42.478Z</updated><title type='text'>Null Atomtypes in PDB Het Dictionary now in some sort of RDF thing</title><content type='html'>This blog post won "Best Title" in the award for "Random 
Combinations of Words" category.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, the atomtypes that were not found by the CDKAtomMatcher are now down to just 1,400. There were various errors on my part (such as using element symbols like "CL" rather than "Cl") and CIF files are trickier to parse than I thought (an atom name can be "HOH'" or 'P"' - look closely!).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Just to make life even harder, and because Egon suggested it, I tried to get the output in some kind of &lt;a href="http://en.wikipedia.org/wiki/Resource_Description_Framework"&gt;RDF&lt;/a&gt; using &lt;a href="http://jena.sourceforge.net/"&gt;Jena&lt;/a&gt;. In practice, this seems to mean a sort of vague tree of nodes, some of which are data and some of which are types.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Well, it's in the same repository as before, with source code. Although looking at the &lt;a href="https://github.com/gilleain/hetdict/blob/master/nulls.n3"&gt;N3 file&lt;/a&gt; it still looks like it would be impossible to work out which atom belongs to which hetgroup...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5750880961391914963?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5750880961391914963/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5750880961391914963' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5750880961391914963'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5750880961391914963'/><link rel='alternate' type='text/html' 
href='http://gilleain.blogspot.com/2011/02/null-atomtypes-in-pdb-het-dictionary.html' title='Null Atomtypes in PDB Het Dictionary now in some sort of RDF thing'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2621044668196680539</id><published>2011-02-07T18:36:00.003Z</published><updated>2011-02-07T19:46:09.522Z</updated><title type='text'>Atom-typing the Hetgroup Dictionary</title><content type='html'>So I posted this to CDK-devel, but probably this is the better place...&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I've been trying to make a map between the atom IDs used &lt;a href="http://deposit.pdb.org/cc_dict_tut.html"&gt;in the HET dictionary&lt;/a&gt; (which is in CIF format) and atom types of some sort. 
To see what this looks like, here is a tail of the file:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;div style="text-align: left;"&gt;ZZZ.O6A:O.sp2&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H7C1:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H7C2:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H8:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H2N1:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H2N2:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H3:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H5:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H6:H&lt;/div&gt;&lt;div style="text-align: left;"&gt;ZZZ.H6A:H&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div style="text-align: left;"&gt;'Zzzz', you may be thinking, but although many atom ids are quite obvious (H8 is a hydrogen, for example), some are probably not. One annoying aspect of this process was that the CIF file format is not especially friendly, and in particular, the file has 'loop_'s that don't terminate in octothorpes ('#'), as I thought they would.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Probably the parser (an IteratingCIFReader) could be much better written - in fact, it will probably only parse this one CIF! So my initial estimate of 13,000 typing failures is now down to only 3,508. What are the atoms that fail? 
There are some that are bound to fail, &lt;a href="http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=n/a&amp;amp;template=het2pdb.html&amp;amp;param1=TBR&amp;amp;s=24820553&amp;amp;o=HET"&gt;like TBR&lt;/a&gt;, which is decidedly a non-organic molecule.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Quite a few, however, are aromatic chlorines:&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div style="text-align: left;"&gt;Null type for 00A CL4A [CL]([C]([C]=[C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 014 CL [CL]([C]([C]=[C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 01A CL4A [CL]([C]([C]=[C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 01W N [N]([C]([C][C][H])[H][H][H])&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 024 BR19 [BR]([C]([C]=[C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 032 CL13 [CL]([C](=[C][C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 039 CL [CL]([C](=[C][C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 055 CL1 [CL]([C](=[C][C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 062 CL1 [CL]([C]([C]=[C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 064 CL13 [CL]([C]([C]=[C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 064 CL15 [CL]([C](=[C][C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 064 CL25 [CL]([C](=[C][C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 088 CL32 [CL]([C](=[C][C]))&lt;/div&gt;&lt;div style="text-align: left;"&gt;Null type for 088 CL37 [CL]([C]([C]=[C]))&lt;/div&gt;&lt;/blockquote&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;div&gt;The last part is the height-2 signature, just to 
give a quick idea of the environment of the atom. Aha! Some quick "cut | sort | uniq -c | sort -n":&lt;/div&gt;&lt;div&gt;&lt;div&gt; &lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;24 [BR]([C]([C][H][H]))&lt;/div&gt;&lt;div&gt;  29 [CL]([C](=[C][N]))&lt;/div&gt;&lt;div&gt;  29 [CL]([C](=[C][S]))&lt;/div&gt;&lt;div&gt;  38 [N]([C]([C][C][H])[H][H][H])&lt;/div&gt;&lt;div&gt;  42 [N]([C]([C][H][H])[H][H][H])&lt;/div&gt;&lt;div&gt;  48 [CL]([C]([C][H][H]))&lt;/div&gt;&lt;div&gt;  75 [N]([C]([C][H][H])[C]([H][H][H])[C]([H][H][H])[C]([H][H][H]))&lt;/div&gt;&lt;div&gt; 337 [BR]([C]([C]=[C]))&lt;/div&gt;&lt;div&gt;1176 [CL]([C]([C]=[C]))&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;gives the 'top-10' worst offenders. That nitrogen one is N(CH3)4 - perhaps charge is a problem here? &lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2621044668196680539?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2621044668196680539/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2621044668196680539' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2621044668196680539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2621044668196680539'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2011/02/atom-typing-hetgroup-dictionary.html' title='Atom-typing the Hetgroup Dictionary'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-9061073037503436324</id><published>2010-12-20T20:52:00.004Z</published><updated>2010-12-20T23:06:25.814Z</updated><title type='text'>Final example for today : bowtieane</title><content type='html'>The simplest-yet-most-complex example I could find is this one &lt;a href="http://gilleain.blogspot.com/2010/05/chemicals-as-colored-graphs.html"&gt;from another previous post&lt;/a&gt; (I like to recycle :)&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/TQ_hL4GtIEI/AAAAAAAAAZY/_n3bquCgO-s/s1600/bowtieane_faces.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 216px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/TQ_hL4GtIEI/AAAAAAAAAZY/_n3bquCgO-s/s320/bowtieane_faces.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5552904459726430274" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I should mention again that I don't know if this is an actual chemical. I call it 'bowtieane', but maybe it has a proper name. In any case, it makes a nice small test case. The ring equivalence classes shown (A, B, C) are defined by the signatures as usual.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's really strange though is SSSRFinder's partition of the rings. It divides the two A-rings into separate classes. This seems like the opposite behaviour to the &lt;a href="http://gilleain.blogspot.com/2010/12/fullerene-26-revisited-more-vertices.html"&gt;fullerene-26 case&lt;/a&gt; where there were fewer equivalence classes than for the signature method. 
I really, really should read the papers referenced in the code.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-9061073037503436324?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/9061073037503436324/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=9061073037503436324' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/9061073037503436324'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/9061073037503436324'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/final-example-for-today-bowtieane.html' title='Final example for today : bowtieane'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/TQ_hL4GtIEI/AAAAAAAAAZY/_n3bquCgO-s/s72-c/bowtieane_faces.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2958081342072709709</id><published>2010-12-20T18:48:00.003Z</published><updated>2010-12-20T19:01:08.596Z</updated><title type='text'>Fullerene-26 revisited : more vertices, less complexity</title><content type='html'>There is an example graph that makes for a better comparison between &lt;a href="https://github.com/gilleain/layout"&gt;my code&lt;/a&gt; and the SSSRFinder. 
It is the 'fullerene-26' molecule that I used &lt;a href="http://gilleain.blogspot.com/2010/05/fullerene-symmetries.html"&gt;in a previous post&lt;/a&gt;. Unfortunately the diagram is wrong in that post; the one below has the correct vertex equivalence classes (colors):&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/TQ-mD7aemSI/AAAAAAAAAZI/ZFXK7UOEAbE/s1600/fullerene_26_faces.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 220px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/TQ-mD7aemSI/AAAAAAAAAZI/ZFXK7UOEAbE/s320/fullerene_26_faces.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5552839451989678370" /&gt;&lt;/a&gt;&lt;br /&gt;The difference is just a pair of yellows that should have been pink (and vice versa) in the topmost C ring. Anyway, the graph is complex enough that there are different vertex symmetry classes without multiple bonds. Dodecahedrane, on the other hand, is so symmetric that without double bonds all vertices are the same.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So how does the SSSRFinder do? Well, it considers all the rings to be different apart from two of the B rings. It is not reporting the final B ring, which could be considered as the bounding face of the map. 
In short, there are two equivalence classes (5-ring and 6-ring) which makes some kind of sense but isn't terribly informative.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2958081342072709709?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2958081342072709709/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2958081342072709709' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2958081342072709709'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2958081342072709709'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/fullerene-26-revisited-more-vertices.html' title='Fullerene-26 revisited : more vertices, less complexity'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/TQ-mD7aemSI/AAAAAAAAAZI/ZFXK7UOEAbE/s72-c/fullerene_26_faces.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-601167023832822492</id><published>2010-12-20T16:43:00.004Z</published><updated>2010-12-20T17:23:14.165Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='signatures'/><category scheme='http://www.blogger.com/atom/ns#' term='polyhedra'/><title type='text'>Dodecahedrane has 12 faces, right?</title><content 
type='html'>So there was this guy called Euler, and he had a formula that goes something like F = E - V + 2. Well, actually it is χ = V - E + F, where χ is the &lt;a href="http://en.wikipedia.org/wiki/Euler_characteristic"&gt;Euler characteristic&lt;/a&gt;, and this is equal to 2 for polyhedra. Anyway, the point is that dodecahedrane has 12 faces (cycles).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For the SSSRFinder, however, it has only 11; which is annoying. Moreover the ring equivalence class method only distinguishes based on the underlying simple graph - in other words it ignores bond order. In some applications this might be exactly what is needed, but I'm glad that my method gives a more detailed result:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/TQ-QxY-3m_I/AAAAAAAAAZA/8IDLzj-YLcU/s1600/dodecahedrane_faces.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 224px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/TQ-QxY-3m_I/AAAAAAAAAZA/8IDLzj-YLcU/s320/dodecahedrane_faces.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5552816043765242866" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;So, apart from being a ridiculously detailed image, the above shows the face (ring, cycle) equivalence classes for &lt;a href="http://gilleain.blogspot.com/2010/05/fullerene-symmetries.html"&gt;dodecahedrane with a particular double bond network&lt;/a&gt;. Clearly any face could be 'glued' to another along one of the edges, following the vertex classes. 
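&lt;br /&gt;&lt;br /&gt;Going back to the 12-versus-11 count at the top of this post: both numbers can be worked out from V and E alone. What follows is only a small sketch, assuming the carbon skeleton of dodecahedrane (20 vertices, 30 edges); the class and method names are made up for illustration and are not from the CDK. Any cycle basis of a connected graph, an SSSR included, has E - V + 1 members, which is one fewer than the number of faces, and that is presumably why the SSSRFinder stops at 11:
&lt;pre class="brush: java"&gt;
```java
// Hypothetical helper for illustration only; not part of the CDK or of SSSRFinder.
class EulerCount {

    /** Faces of a connected planar graph, outer face included: F = E - V + 2. */
    static int faceCount(int vertices, int edges) {
        return edges - vertices + 2;
    }

    /** Size of any cycle basis of a connected graph: E - V + 1. */
    static int cycleRank(int vertices, int edges) {
        return edges - vertices + 1;
    }

    public static void main(String[] args) {
        // Assumed input: the carbon skeleton of dodecahedrane.
        int vertices = 20;
        int edges = 30;
        System.out.println("faces      = " + faceCount(vertices, edges));
        System.out.println("cycle rank = " + cycleRank(vertices, edges));
    }
}
```
&lt;/pre&gt;
If the basis is made of eleven of the faces, the twelfth face is just the sum of those eleven, so a basis never needs it.&lt;br /&gt;&lt;br /&gt;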
All possible combinations of faces are shown in the 'face quotient graph' at the bottom right.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-601167023832822492?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/601167023832822492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=601167023832822492' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/601167023832822492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/601167023832822492'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/dodecahedrane-has-12-faces-right.html' title='Dodecahedrane has 12 faces, right?'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/TQ-QxY-3m_I/AAAAAAAAAZA/8IDLzj-YLcU/s72-c/dodecahedrane_faces.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6583155950171133360</id><published>2010-12-19T22:21:00.005Z</published><updated>2010-12-19T23:06:35.972Z</updated><title type='text'>SSSRFinder and Ring Equivalence Classes</title><content type='html'>With hubris and arrogance, I implemented my own ring equivalence class finder. 
With humility and grace, the SSSRFinder method getRingEquivalenceClasses does better for at least one case:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TQ6PIIi7ecI/AAAAAAAAAYw/S_N8ZQvi3Ak/s1600/sssrfinder_vs_signatures.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 197px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TQ6PIIi7ecI/AAAAAAAAAYw/S_N8ZQvi3Ak/s320/sssrfinder_vs_signatures.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5552532760490113474" /&gt;&lt;/a&gt;&lt;br /&gt;This is the test case used in SimpleCycleBasisTest, and probably in the paper on the subject. I guess I should read that instead of other papers on cycle bases. I do get the same answer for cuneane as when I do it &lt;a href="http://gilleain.blogspot.com/2010/12/many-faces-of-fused-cycles.html"&gt;by hand&lt;/a&gt;. That is to say, 3 equivalence classes with [1, 2, 2] elements.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For the 'Bauer' graph above (as I am currently calling it - Ulrich Bauer is the author of the SSSRFinder) the signature method puts almost every vertex in its own equivalence class. Now my method makes equivalence classes from cycles with the vertices in the same vertex equivalence class. Naturally enough, this makes a lot of ring equivalence classes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The graph on the left has vertices colored by degree, to show how dissimilar the rings are. 
Is the SSSRFinder really getting the right answer?&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6583155950171133360?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6583155950171133360/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6583155950171133360' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6583155950171133360'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6583155950171133360'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/sssrfinder-and-ring-equivalence-classes.html' title='SSSRFinder and Ring Equivalence Classes'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/TQ6PIIi7ecI/AAAAAAAAAYw/S_N8ZQvi3Ak/s72-c/sssrfinder_vs_signatures.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3825419790275287811</id><published>2010-12-18T21:30:00.004Z</published><updated>2010-12-18T22:01:04.946Z</updated><title type='text'>Herschel graph with different planar embeddings</title><content type='html'>Another example from a paper that &lt;a href="http://gilleain.blogspot.com/2010/05/non-chemical-example-grinbergs-graph.html"&gt;I have mentioned before&lt;/a&gt;. 
This time the &lt;a href="http://en.wikipedia.org/wiki/Herschel_graph"&gt;Herschel graph&lt;/a&gt; which is another of these crazy graphs thought up to prove or disprove some conjecture. The spring-layout paper gives two different layouts:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TQ0pDj7v5FI/AAAAAAAAAYQ/OccJD3Ggxok/s1600/herschel.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 206px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TQ0pDj7v5FI/AAAAAAAAAYQ/OccJD3Ggxok/s320/herschel.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5552139056779879506" /&gt;&lt;/a&gt;&lt;p&gt;These really are the same graph :) One way to see this is to look at the vertex colors that I have applied. Each face is given a label (A or B) based on the key shown below the two graphs. The 'A' face is {Black, White, Grey, White} for example. The left-hand embedding (or layout) has such an A-face for its border, while the right-hand one has a B-face for a border.&lt;/p&gt;&lt;p&gt;Presumably, it is possible to convert a vertex partition (into equivalence classes) into a partition of faces. It seems easy for examples - like this - that have a planar embedding. More difficult for graphs that don't have one. On the other hand, not all planar layouts look very informative:&lt;/p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TQ0uU62av1I/AAAAAAAAAYY/R_CZEEPUbFo/s1600/twistane_planar.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 199px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TQ0uU62av1I/AAAAAAAAAYY/R_CZEEPUbFo/s320/twistane_planar.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5552144852547452754" /&gt;&lt;/a&gt;&lt;p&gt;This is twistane (again) but not looking as symmetric as it can. 
However, the faces show the regularity - they are all the same, even the boundary. The colors used are the same as in the &lt;a href="http://gilleain.blogspot.com/2010/12/many-faces-of-fused-cycles.html"&gt;previous post&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3825419790275287811?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3825419790275287811/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3825419790275287811' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3825419790275287811'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3825419790275287811'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/herschel-graph-with-different-planar.html' title='Herschel graph with different planar embeddings'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/TQ0pDj7v5FI/AAAAAAAAAYQ/OccJD3Ggxok/s72-c/herschel.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6566964695805946856</id><published>2010-12-16T19:29:00.004Z</published><updated>2010-12-16T20:11:28.865Z</updated><title type='text'>The many faces of fused cycles</title><content type='html'>Although many ring systems in molecules are 
quite easy to lay out as 2D diagrams, there are some that are inherently 3D. Bridged rings are usually in this class; consider my favourite example molecule, cuneane:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/TQprMrlMvfI/AAAAAAAAAYA/bnopvsZYTsY/s1600/faces_of_cuneane.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 143px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/TQprMrlMvfI/AAAAAAAAAYA/bnopvsZYTsY/s320/faces_of_cuneane.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5551367356288908786" /&gt;&lt;/a&gt;Each of (A, B, C) is a particular layout of the same molecule, but with a different boundary (hexagon, pentagon, er...kind of fused squares). It would be nice to have a layout method that made the same choice each time - regardless of the permutation of atoms and bonds. Even better if it could allow enumeration of the alternative possibilities.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As another example, consider a series that runs from twistane (which is a molecule) to two other graphs that may well not be actual molecules:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TQpxO1-8GpI/AAAAAAAAAYI/mQvsbxyZm_M/s1600/twistane_series.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 177px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TQpxO1-8GpI/AAAAAAAAAYI/mQvsbxyZm_M/s320/twistane_series.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5551373990510729874" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;Twistane itself is in the middle, surrounded by five- and seven-ring equivalents. 
The upper layouts emphasise one ring in the graph while the lower ones emphasise the dual rings in each.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6566964695805946856?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6566964695805946856/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6566964695805946856' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6566964695805946856'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6566964695805946856'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/many-faces-of-fused-cycles.html' title='The many faces of fused cycles'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/TQprMrlMvfI/AAAAAAAAAYA/bnopvsZYTsY/s72-c/faces_of_cuneane.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8084545742028604592</id><published>2010-12-15T14:30:00.004Z</published><updated>2010-12-15T15:27:31.647Z</updated><title type='text'>Why modular decomposition is not very useful for chemical graphs</title><content type='html'>It is difficult to publish negative results in a journal, but a blog post seems like a good place to record the experience. 
Especially situations like this, where it probably should have been obvious not to try in the first place...&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, what is &lt;a href="http://en.wikipedia.org/wiki/Modular_decomposition"&gt;modular decomposition&lt;/a&gt;? Briefly, a module is a little like a connected component in a graph - indeed, a connected component is made up from one or more modules, but modules can overlap. Decomposition of a graph into its modules is, therefore, like finding the connected components of the graph. An example is shown here:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TQjcBUQI2WI/AAAAAAAAAXw/JTdckcwslys/s1600/modular_graph.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 285px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TQjcBUQI2WI/AAAAAAAAAXw/JTdckcwslys/s320/modular_graph.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5550928455908514146" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Two modules in the graph are circled; there may be others. A module is defined as a set of vertices that all have the same neighbours outside the set. So, there was no need for me to make them complete graphs, but it looked nicer. Anyway, already looking at this example it is clear that these are not very 'chemical' graphs. They look more like networks (for example, see [1]).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Indeed, I tried out some code made by the authors of a paper on a linear-time algorithm for modular decomposition [2] that was in Java (makes a change from C). 
It was also nicely commented, unlike the code of one mathematician - whose name I won't mention - who said he doesn't comment or document his code because "no program has ever improved through comments", which is just a lazy excuse, frankly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The results for the molecules in the CDK MoleculeFactory were that almost all of them are prime modules; which means that they are elementary, or unbreakdownable. Notable exceptions are cyclobutane and a propellane-like graph (see image, modules are circled).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TQjcQCCmUpI/AAAAAAAAAX4/4pzJ7Eb2XdI/s1600/cyclobut_prop_modules.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 146px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TQjcQCCmUpI/AAAAAAAAAX4/4pzJ7Eb2XdI/s320/cyclobut_prop_modules.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5550928708717925010" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In fact, I suspect that chemical graphs with non-prime modular decomposition trees are rare. This is partly because most graphs are irregular, but mainly because of the low degree (valence) of atomic vertices. Anyway, modules are not a solution to structure diagram layout [3].&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[1] : J. Gagneur, R. Krause, T. Bouwmeester, and G. Casari. &lt;b&gt;&lt;span class="Apple-style-span"  style=" ;font-family:arial, sans-serif;"&gt;&lt;a href="http://www.biomedcentral.com/1465-6906/5/R57/abstract" style="font-family: arial, sans-serif; color: rgb(0, 0, 204); "&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;Modular decomposition of protein-protein interaction networks&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;. &lt;/b&gt;Genome Biology 5:R57 (2004). 
doi:10.1186/gb-2004-5-8-r57&lt;/div&gt;&lt;div&gt;[2] : Marc Tedder, Derek Corneil, Michel Habib, Christophe Paul. &lt;span class="Apple-style-span"  style=" ;font-family:arial, sans-serif;"&gt;&lt;a href="http://www.springerlink.com/index/J64J478577633842.pdf" style="font-family: arial, sans-serif; color: rgb(0, 0, 204); "&gt;Simpler linear-time modular decomposition via recursive factorizing permutations&lt;/a&gt;. &lt;/span&gt;doi:10.1007/978-3-540-70575-8_52&lt;/div&gt;&lt;div&gt;[3] : Drawing Graphs Using Modular Decomposition. doi:10.1007/11618058_31&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8084545742028604592?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8084545742028604592/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8084545742028604592' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8084545742028604592'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8084545742028604592'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/12/why-modular-decomposition-is-not-very.html' title='Why modular decomposition is not very useful for chemical 
graphs'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/TQjcBUQI2WI/AAAAAAAAAXw/JTdckcwslys/s72-c/modular_graph.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2443686360492735808</id><published>2010-09-29T18:43:00.003+01:00</published><updated>2010-09-29T18:47:13.586+01:00</updated><title type='text'>Muhahahaha! Things can always be more complex...</title><content type='html'>A good measure of how right a model or an implementation is can be how quickly it extends to more complex situations:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/TKN7KG-c42I/AAAAAAAAAXo/_5uNqfWKFQQ/s1600/rose_forest.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 219px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/TKN7KG-c42I/AAAAAAAAAXo/_5uNqfWKFQQ/s320/rose_forest.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5522392981687821154" /&gt;&lt;/a&gt;This is essentially the same except that 'SSE' has been added (no big deal) but also the leaf list has been generalised to AbstractLeafCollection. 
:)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm sure there are better ways to do this, but it fits neatly with some existing ideas I had on searching through lists vs searching through sets.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2443686360492735808?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2443686360492735808/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2443686360492735808' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2443686360492735808'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2443686360492735808'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/muhahahaha-things-can-always-be-more.html' title='Muhahahaha! 
Things can always be more complex...'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/TKN7KG-c42I/AAAAAAAAAXo/_5uNqfWKFQQ/s72-c/rose_forest.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7125891596540046422</id><published>2010-09-29T17:42:00.004+01:00</published><updated>2010-09-29T18:01:00.818+01:00</updated><title type='text'>Rose Forests</title><content type='html'>Carl Masak &lt;a href="http://strangelyconsistent.org/blog/its-just-a-tree-silly"&gt;blogged&lt;/a&gt; about tree data structures, which caught my interest because of a pet project of mine (&lt;a href="http://github.com/gilleain/tailor"&gt;tailor&lt;/a&gt;; a structure description and measurement tool) where I found myself using trees a lot. An awful lot. Perhaps ... &lt;i&gt;too much.&lt;/i&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, a related tweet by AudreyT mentioned an article called "Origami Programming" by Jeremy Gibbons. It is in Haskell (perhaps not surprisingly), a language I don't speak very well. However, while reading it - and not understanding it - I did get one thing, which was the idea of having a tree datatype where the node (called a 'rose') references a forest (a list of roses). I think that's right.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In any case, it solves an object-modelling problem that I had. The difficulty was that protein structures are hierarchical, yes, but have a strange mixed hierarchy of types. 
Perhaps this is obvious to Haskell programmers and compiler-code writers, but this makes it very difficult to use the 'simple' tree datatype, where a Node class has a List of Node children.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Specifically, I mean situations like: a Chain composed of Atoms or a Chain of SSEs of Residues of Atoms. The 'rose tree' way of doing things makes this possible, at the price of a more complex model. Now here is a picture of a sketch of it:&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TKNvJUO1f4I/AAAAAAAAAXg/O9FfM47MX4c/s1600/rose_forest.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 191px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TKNvJUO1f4I/AAAAAAAAAXg/O9FfM47MX4c/s320/rose_forest.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5522379773926801282" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, for example, you can have a Protein:ChainList(Chain:AtomList(Atom),Chain:ResidueList(Residue...)) or several other possibilities. Also, a visitor to the hierarchy can do different things to a Tree than to a LeafList. Neat! 
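&lt;br /&gt;&lt;br /&gt;For anyone who would rather read code than a whiteboard sketch, here is a minimal version of the idea as I read it from the description above. It is only a sketch under assumptions: the names (Forest, LeafList, TreeList, Tree, exampleProtein) are made up for illustration and are not taken from the roseforest repository, and plain arrays stand in for typed lists such as ChainList or AtomList. A tree node refers to a forest, and a forest is either more trees or a list of leaves, so a chain of atoms and a chain of residues can sit side by side:
&lt;pre class="brush: java"&gt;
```java
// Hypothetical sketch: none of these names are taken from the roseforest repository.
class RoseForestSketch {

    /** A forest is whatever a tree node refers to: more trees, or a list of leaves. */
    interface Forest {
        int leafCount();
    }

    /** A forest made of leaves only, such as the atoms of a chain or a residue. */
    static final class LeafList implements Forest {
        final String[] leaves;

        LeafList(String... leaves) {
            this.leaves = leaves;
        }

        public int leafCount() {
            return leaves.length;
        }
    }

    /** A forest made of trees, such as the chains of a protein. */
    static final class TreeList implements Forest {
        final Tree[] trees;

        TreeList(Tree... trees) {
            this.trees = trees;
        }

        public int leafCount() {
            int total = 0;
            for (Tree tree : trees) {
                total += tree.forest.leafCount();
            }
            return total;
        }
    }

    /** A tree node (the 'rose'): a label plus a reference to a forest. */
    static final class Tree {
        final String label;
        final Forest forest;

        Tree(String label, Forest forest) {
            this.label = label;
            this.forest = forest;
        }
    }

    /** One chain holds atoms directly; the other holds a residue that holds atoms. */
    static Tree exampleProtein() {
        Tree chainA = new Tree("Chain A", new LeafList("N", "CA"));
        Tree residue = new Tree("GLY 1", new LeafList("N", "CA", "C"));
        Tree chainB = new Tree("Chain B", new TreeList(residue));
        return new Tree("Protein", new TreeList(chainA, chainB));
    }

    public static void main(String[] args) {
        // Counts atoms across both chains, whatever depth they sit at.
        System.out.println(exampleProtein().forest.leafCount());
    }
}
```
&lt;/pre&gt;
The point of keeping LeafList and TreeList as separate types is that code walking the hierarchy can treat them differently, which is the visitor idea mentioned above.&lt;br /&gt;&lt;br /&gt;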
Oh, and the code for the implementation (just the bare bones, not usable) &lt;a href="http://github.com/gilleain/roseforest"&gt;is here.&lt;/a&gt; &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7125891596540046422?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7125891596540046422/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7125891596540046422' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7125891596540046422'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7125891596540046422'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/rose-forests.html' title='Rose Forests'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/TKNvJUO1f4I/AAAAAAAAAXg/O9FfM47MX4c/s72-c/rose_forest.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1867178785572543332</id><published>2010-09-20T20:15:00.005+01:00</published><updated>2010-09-20T20:49:06.055+01:00</updated><title type='text'>Molecule Layouts</title><content type='html'>I've been doing experimental work on layouts for the CDK. 
Not for atoms, exactly, for which the StructureDiagramGenerator is doing a pretty good job - could be better, of course, but what couldn't?&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;No, layout of MoleculeSets, and Reactions. Well, actually IMoleculeSets and IReactions. With an ILayout&amp;lt;T&amp;gt; class - my apologies to anyone who doesn't like generics, but they can be quite useful. Anyway, here is an example of what it is looking like at the moment:&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TJe0T_dg4fI/AAAAAAAAAXQ/w4t8QJcd03I/s1600/testimage_three_by_three.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TJe0T_dg4fI/AAAAAAAAAXQ/w4t8QJcd03I/s320/testimage_three_by_three.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5519078123911569906" /&gt;&lt;/a&gt;&lt;div&gt;Hmmm. Well, it is a grid I suppose. The problems with the ring bonds are known to me, please do not mention them &gt;:|&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The code for this is quite short:&lt;/div&gt;&lt;pre class="brush: java"&gt;IMoleculeSet moleculeSet = makeMolSet();&lt;br /&gt;ILayout&amp;lt;IMoleculeSet&amp;gt; gridLayout = new GridMoleculeSetLayout(3, 3);&lt;br /&gt;makeImage(moleculeSet, gridLayout, "three_by_three", 500, 500);&lt;br /&gt;&lt;/pre&gt;where the methods 'makeMolSet' and 'makeImage' do what you might expect (I hope :). 
Similarly:&lt;pre class="brush: java"&gt;IMoleculeSet molSet = makeMolSet();&lt;br /&gt;AxisOrientation o = AxisOrientation.PLUS_X_PLUS_Y;&lt;br /&gt;ILayout&amp;lt;IMoleculeSet&amp;gt; layout = new LinearMoleculeSetLayout(new StandardMoleculeLayout(), 3, o);&lt;br /&gt;makeImage(molSet, layout, "xypos", 500, 500);&lt;/pre&gt;and this produces an image like this:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/TJe4HDfcj0I/AAAAAAAAAXY/zed1ggWUHhs/s1600/testimage_xypos.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/TJe4HDfcj0I/AAAAAAAAAXY/zed1ggWUHhs/s320/testimage_xypos.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5519082299701628738" /&gt;&lt;/a&gt;which is also ... alright. Getting there.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1867178785572543332?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1867178785572543332/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1867178785572543332' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1867178785572543332'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1867178785572543332'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/molecule-layouts.html' title='Molecule Layouts'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/TJe0T_dg4fI/AAAAAAAAAXQ/w4t8QJcd03I/s72-c/testimage_three_by_three.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2695862289536922981</id><published>2010-09-12T16:36:00.004+01:00</published><updated>2010-09-12T16:46:15.710+01:00</updated><title type='text'>Consistent Zoom with Models of Different Scales</title><content type='html'>So there is a way to get the zoom to work:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/TIzz1_xINMI/AAAAAAAAAXI/VTPsfsMHkrw/s1600/double_zoom.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 103px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/TIzz1_xINMI/AAAAAAAAAXI/VTPsfsMHkrw/s320/double_zoom.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5516051752598123714" /&gt;&lt;/a&gt;(to zoom on the picture, click for bigger :)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The approach taken here is to create graphical objects (LineElement, RectangleElement, etc) that are scaled at the origin, but not zoomed or translated to the center of the draw area. 
These last two parts of the transform are then added to the graphics transform.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One downside of altering the transform in the graphics is that if we want to draw extra stuff on the panel (like the detail string "Zoom = x, Scale = y" in the picture above) the original transform has to be captured before drawing, then restored after.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example,  see this commit:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;http://github.com/gilleain/toyrenderer/commit/c3bad966cd37c604f2ab4eb0e177603e88bee2f8&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2695862289536922981?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2695862289536922981/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2695862289536922981' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2695862289536922981'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2695862289536922981'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/consistent-zoom-with-models-of.html' title='Consistent Zoom with Models of Different Scales'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/_eL4nhrOF5R4/TIzz1_xINMI/AAAAAAAAAXI/VTPsfsMHkrw/s72-c/double_zoom.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6653702463236766766</id><published>2010-09-11T15:21:00.005+01:00</published><updated>2010-09-11T15:52:20.408+01:00</updated><title type='text'>Scaling and Text</title><content type='html'>An obvious question about the CDK rendering code is : "Why not scale text with AffineTransform?" So, of course this is possible, and works quite nicely - but there is a cost.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One of the goals of the rendering code is to start from models of any scale, and render them as  consistently sized diagrams on screen. By 'scale' here I simply mean the average distance between points. So the CDK layout code might use a distance of (say) 1 between two carbon atoms, but a file with a structure made in some other chemical editor might have an average atom-atom distance of 100. These are unitless values, by the way.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, what we &lt;i&gt;could&lt;/i&gt; have done was transform the coordinates in the model to a consistent scale, then rendered these transformed coordinates. What we chose to do, however, was to calculate a single transform for the model and draw with this. 
If you use this transform to scale the graphics object before drawing you get this for a model scale of 10:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/TIuWF3VK1ZI/AAAAAAAAAW4/CiXhSobESpY/s1600/linear_scale10.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 206px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/TIuWF3VK1ZI/AAAAAAAAAW4/CiXhSobESpY/s320/linear_scale10.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5515667196141557138" /&gt;&lt;/a&gt;and this for a model scale of 100 (in other words a 'bond length' of 100):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TIuWmPWHcZI/AAAAAAAAAXA/0lgGBzDBgec/s1600/linear_scale100.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 206px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TIuWmPWHcZI/AAAAAAAAAXA/0lgGBzDBgec/s320/linear_scale100.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5515667752343794066" /&gt;&lt;/a&gt;the images are at slightly different zooms, but the point is clear I think. 
The choice is between scaling fonts independently, and altering the model directly.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6653702463236766766?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6653702463236766766/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6653702463236766766' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6653702463236766766'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6653702463236766766'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/scaling-and-text.html' title='Scaling and Text'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/TIuWF3VK1ZI/AAAAAAAAAW4/CiXhSobESpY/s72-c/linear_scale10.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5671501021154911294</id><published>2010-09-05T19:31:00.004+01:00</published><updated>2010-09-05T19:38:51.328+01:00</updated><title type='text'>Generic Rendering</title><content type='html'>Egon++ is continuing the process of merging the CDK-JCP rendering core into CDK master. 
A generification of the classes was proposed on the mailing list, and here is a sketch of some of the classes and interfaces:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TIPiiUaS-OI/AAAAAAAAAWw/s4v447xIjq0/s1600/generic_rendering.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 170px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TIPiiUaS-OI/AAAAAAAAAWw/s4v447xIjq0/s320/generic_rendering.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5513499448054053090" /&gt;&lt;/a&gt;I realise that this looks horribly complex, but the question is: "Is it just complex enough, or too complex?". One of the things missing from the diagram is layout - there may be a need for classes like LinearMoleculeSetLayout or GridMoleculeSetLayout. Oh, and yes (you guessed it!) IMoleculeSetLayout and ChemObjectLayout classes :)&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The goal here is not to make convoluted code, but to avoid repeating stuff. A reaction renderer should know how to lay out and paint molecule sets, and then pass on the task to the molecule set renderer, and so on. Two key requirements will be a) to avoid a re-layout on each paint and b) to generate the diagrams in the right places, at the correct scales. 
I think that this will be possible.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5671501021154911294?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5671501021154911294/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5671501021154911294' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5671501021154911294'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5671501021154911294'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/generic-rendering.html' title='Generic Rendering'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/TIPiiUaS-OI/AAAAAAAAAWw/s4v447xIjq0/s72-c/generic_rendering.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-726754023281805253</id><published>2010-09-03T11:45:00.005+01:00</published><updated>2010-09-03T12:34:14.732+01:00</updated><title type='text'>How (not) to remove items from a (CDK) list</title><content type='html'>So there I was, trying to remove all mappings from a Reaction like this:&lt;br /&gt;&lt;pre class="brush: java"&gt;for (int i = 0; i &amp;lt; reaction.getMappingCount(); i++) { reaction.removeMapping(i); }&lt;/pre&gt;and found that only half the mappings were 
being removed ... can you see why? :)&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In fact, this is not some obscure CDK bug, but a logic error on my part. Equivalent code is this:&lt;/div&gt;&lt;div&gt;&lt;pre class="brush: java"&gt;List list = getListSomehow();&lt;br /&gt;for (int i = 0; i &amp;lt; list.size(); i++) { list.remove(i); }&lt;/pre&gt;using for example a java.util.ArrayList. The problem is that the index (&lt;b&gt;i&lt;/b&gt;) is being tested against a changing number (the size), and each removal also shifts the remaining items down by one place, so every other item is skipped. Once half the items have been removed, &lt;b&gt;i&lt;/b&gt; is at the half-way point, so on the next pass it stops.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One way to 'solve' this is to go backwards, starting from the last valid index:&lt;/div&gt;&lt;pre class="brush: java"&gt;for (int i = reaction.getMappingCount() - 1; i &amp;gt;= 0; i--) { reaction.removeMapping(i); }&lt;/pre&gt;but this is slightly less clear than fixing the count up front and always removing the first item:&lt;br /&gt;&lt;pre class="brush: java"&gt;List list = getListSomehow();&lt;br /&gt;int size = list.size();&lt;br /&gt;for (int i = 0; i &amp;lt; size; i++) { list.remove(0); }&lt;/pre&gt;Of course, even better is the List method clear(). 
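As a rough sketch of the loops that do empty a list (plain java.util code, not CDK; the getListSomehow method here is a made-up stand-in that returns six strings in a raw ArrayList, and the class name is invented for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveAllItems {

    // Hypothetical stand-in for getListSomehow(): six placeholder items.
    static List getListSomehow() {
        return new ArrayList(Arrays.asList("a", "b", "c", "d", "e", "f"));
    }

    // Walk from the last valid index down to 0, so a removal never
    // shifts an item that has not been visited yet.
    static void removeBackwards(List list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            list.remove(i);
        }
    }

    // Keep removing the head of the list until nothing is left.
    static void removeFromFront(List list) {
        while (!list.isEmpty()) {
            list.remove(0);
        }
    }

    public static void main(String[] args) {
        List first = getListSomehow();
        removeBackwards(first);

        List second = getListSomehow();
        removeFromFront(second);

        // The built-in way to empty a java.util.List in one call.
        List third = getListSomehow();
        third.clear();

        System.out.println(first.size() + " " + second.size() + " " + third.size());
    }
}
```

All three lists end up empty, so this prints "0 0 0". Removing from the end is also the cheaper choice for an ArrayList, since nothing has to be shifted after each removal.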
It would be nice if Reaction had a similar method...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-726754023281805253?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/726754023281805253/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=726754023281805253' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/726754023281805253'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/726754023281805253'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/09/how-not-to-remove-items-from-cdk-list.html' title='How (not) to remove items from a (CDK) list'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8646072032027285013</id><published>2010-08-15T20:15:00.003+01:00</published><updated>2010-08-15T20:49:09.333+01:00</updated><title type='text'>Combinations and Filters</title><content type='html'>So there is now the beginning of a possible re-write of the &lt;a href="http://gilleain.blogspot.com/2010/08/14-benzoquinone-and-deducebondsystemsto.html"&gt;DBST&lt;/a&gt; that uses basically the same approach, but is a bit more flexible. 
The code is &lt;a href="http://github.com/gilleain/doublebonds"&gt;here&lt;/a&gt;, but it's still a bit rough.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The original idea seems to have been to encode arrangements of double bonds for different ring sizes as a kind of 'library'. For each ring, each arrangement is picked in turn until all possible combinations are generated. As a concrete example, here is a naphthalene skeleton:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TGhBwJ6KqmI/AAAAAAAAAWg/aaOmIDXYINU/s1600/double_bond_systems.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 273px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TGhBwJ6KqmI/AAAAAAAAAWg/aaOmIDXYINU/s320/double_bond_systems.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5505722840009845346" /&gt;&lt;/a&gt;Here, the arrangements (1, 2) are applied to each ring (A, B) and then these are combined. Of the four combinations (A1B1, A2B1, A1B2, A2B2) only three are valid. The A1B2 combination has two atoms highlighted in red that each have two double bonds and one single bond, which exceeds carbon's valence.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So one way to filter the combinations is to try to type the atoms, and reject any structure that has untypeable atoms. Another possible filter rejects structures that don't have atoms that are SP2 hybridized. Both of these are from the original code, but implemented as instances of a ChemicalFilter interface.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is quite similar - not coincidentally - to the approach in &lt;a href="http://github.com/asad/cdk-smsd/tree/master"&gt;SMSD&lt;/a&gt; where graph-theoretical tools are used to generate possible subgraph matches, and then a chemical filter is used to rank the results. 
Ranking and filtering are not quite the same, so perhaps there should be a ChemicalRanker interface? It would be a little like an Enumeration, except that it might not be a total ordering, but a partial order.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8646072032027285013?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8646072032027285013/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8646072032027285013' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8646072032027285013'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8646072032027285013'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/08/combinations-and-filters.html' title='Combinations and Filters'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/TGhBwJ6KqmI/AAAAAAAAAWg/aaOmIDXYINU/s72-c/double_bond_systems.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8669452758473052881</id><published>2010-08-14T16:23:00.003+01:00</published><updated>2010-08-14T16:57:01.631+01:00</updated><title type='text'>1,4-Benzoquinone and the DeduceBondSystemsTool</title><content type='html'>Once upon a time, there was a DeduceBondSystemsTool, 
and...&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Er, anyway. Further to a patch made on the tool (patch ID : 3040138), there is a failing test for 1,4-benzoquinone:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TGa2Y1yMJoI/AAAAAAAAAWY/mtXh5U63xBE/s1600/bzq_alternatives.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 305px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TGa2Y1yMJoI/AAAAAAAAAWY/mtXh5U63xBE/s320/bzq_alternatives.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5505288132377716354" /&gt;&lt;/a&gt;The tool generates A, and the test wants B. Now, the problem is not that the tool is not trying B as a possibility, but that it generates A &lt;i&gt;first&lt;/i&gt; and the final step neither removes A nor ranks B above it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Understanding this requires an understanding of the algorithm. This is (roughly):&lt;ol&gt;&lt;li&gt;For each ring, generate a list of possible positions for all numbers of double bonds.&lt;/li&gt;&lt;li&gt;Generate a set of molecules by combining these positions together.&lt;/li&gt;&lt;li&gt;Remove 'bad' solutions and pick a solution with the fewest 'bad' N/S atoms.&lt;/li&gt;&lt;/ol&gt;where the definition of 'bad' is based on chemical rules like atom types.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, neither A nor B is a bad solution, and they don't contain N or S atoms, so they both have a rank of zero, and the first one generated will be returned. So, there is really no particular reason that the test should pass.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In general, it might be good to separate the generation of possible solutions from the ranking/filtering process. 
So that the computational or mathematical problem of generation is done by one class, while other classes determine which is the optimal solution (or set of solutions).&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8669452758473052881?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8669452758473052881/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8669452758473052881' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8669452758473052881'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8669452758473052881'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/08/14-benzoquinone-and-deducebondsystemsto.html' title='1,4-Benzoquinone and the DeduceBondSystemsTool'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/TGa2Y1yMJoI/AAAAAAAAAWY/mtXh5U63xBE/s72-c/bzq_alternatives.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5809820123897784967</id><published>2010-08-09T22:31:00.006+01:00</published><updated>2010-08-09T23:11:29.109+01:00</updated><title type='text'>Line Graphs and Double Bonding Systems</title><content type='html'>After looking at a CDK tool for fixing bond orders for aromatic systems 
(DeduceBondSystemTool in the smiles package) I wondered if there was a more general approach. That is, the problem is to take a molecular graph with no double bonds and generate all possible double bonded systems. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One possibility might be to first convert the graph (G) into a form known as a &lt;a href="http://en.wikipedia.org/wiki/Line_graph"&gt;line graph&lt;/a&gt; (lg(G)) where every vertex in lg(G) is an edge in G. If these vertices are labelled to represent the bond order, then an aromatic system has a particular line graph. For example, here is benzene:&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TGB63JMtkBI/AAAAAAAAAWI/nI86Q-exdZ0/s1600/benzene.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 232px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TGB63JMtkBI/AAAAAAAAAWI/nI86Q-exdZ0/s320/benzene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5503533832427376658" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The dashed lines show the construction of the line graph, and the labels '-' and '=' mean single and double. Now obviously, the two resulting graphs are essentially the same, so it would be nice to remove this redundancy. 
An example of two different bonding systems comes from phenanthrene:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/TGB6spzTreI/AAAAAAAAAWA/S2QSixAyJJM/s1600/phenanthrene.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 210px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/TGB6spzTreI/AAAAAAAAAWA/S2QSixAyJJM/s320/phenanthrene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5503533652200631778" /&gt;&lt;/a&gt;Which is great, but how to generate all non-redundant colorings of the line graphs? Since a line graph is just a graph, it can have a signature, and a signature quotient graph. This can then be colored:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TGB7prYg0mI/AAAAAAAAAWQ/QLZACHpiJlE/s1600/quotient_line_graphs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 218px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TGB7prYg0mI/AAAAAAAAAWQ/QLZACHpiJlE/s320/quotient_line_graphs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5503534700597138018" /&gt;&lt;/a&gt;However, in this example it is necessary to 'half-color' some of the vertices of the quotient graph ... which doesn't quite seem to work. The numbers in between the colored quotient graphs show how many line graph vertices are in each of the symmetry classes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In any case, this is a bit of a toy problem, with only a partial solution, but &lt;a href="http://github.com/gilleain/doublebonds"&gt;here is a code repository&lt;/a&gt; for a sketch of the code. Note that the algorithm is missing! 
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5809820123897784967?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5809820123897784967/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5809820123897784967' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5809820123897784967'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5809820123897784967'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/08/line-graphs-and-double-bonding-systems.html' title='Line Graphs and Double Bonding Systems'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/TGB63JMtkBI/AAAAAAAAAWI/nI86Q-exdZ0/s72-c/benzene.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5333600088822590907</id><published>2010-07-28T15:35:00.002+01:00</published><updated>2010-07-28T15:37:20.093+01:00</updated><title type='text'>CDK Export</title><content type='html'>A short post just to get a picture in place. 
Someone else's code, my diagram..&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/TFBAf5HvgZI/AAAAAAAAAVo/CF2acPMwLs8/s1600/cdk_export.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 199px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/TFBAf5HvgZI/AAAAAAAAAVo/CF2acPMwLs8/s320/cdk_export.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5498966061672792466" /&gt;&lt;/a&gt;Click for bigger, as usual.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5333600088822590907?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5333600088822590907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5333600088822590907' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5333600088822590907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5333600088822590907'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/07/cdk-export.html' title='CDK Export'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/TFBAf5HvgZI/AAAAAAAAAVo/CF2acPMwLs8/s72-c/cdk_export.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-897527504352784858</id><published>2010-06-29T13:22:00.003+01:00</published><updated>2010-06-29T13:30:42.319+01:00</updated><title type='text'>Formula Debugger</title><content type='html'>&lt;div&gt;(Please click for bigger - there is quite a bit of detail :)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/TCnl0KEfdeI/AAAAAAAAAVg/12jMDkLY6mM/s1600/formula_debugger.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 211px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/TCnl0KEfdeI/AAAAAAAAAVg/12jMDkLY6mM/s320/formula_debugger.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5488170305146811874" /&gt;&lt;/a&gt;&lt;br /&gt;The left panel shows the search tree of structures; each circle is a structure, the red ones at the tips of some branches are fully connected and saturated. The blue branch has been selected, and is shown in the middle, with the structure at the tip at the top. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This solution is also highlighted (in red - colors are not very consistent) and shown in detail on the right. The right/middle panel is a traditional molecule layout, but with numbers in place of atom symbols. The lower right panel is the spanning tree of the atom highlighted in red in the upper right panel.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Clearly, there are still a lot of unnecessary structures generated since most branches are dead ends. 
However, this version does at least avoid duplicates - sadly it also misses a few structures :( Clearly refining partitions based on element symbols doesn't totally work...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-897527504352784858?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/897527504352784858/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=897527504352784858' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/897527504352784858'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/897527504352784858'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/06/formula-debugger.html' title='Formula Debugger'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/TCnl0KEfdeI/AAAAAAAAAVg/12jMDkLY6mM/s72-c/formula_debugger.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2028274590095792957</id><published>2010-06-20T18:28:00.002+01:00</published><updated>2010-06-20T21:57:57.212+01:00</updated><title type='text'>CDK Signature implementation now in review</title><content type='html'>&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;&lt;b&gt;What is 
it?&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-size:large;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;I have written about signatures quite a bit, but here is a quick summary: signatures are a little like &lt;a href="http://en.wikipedia.org/wiki/SMILES"&gt;SMILES&lt;/a&gt;, but also somewhat like &lt;a href="http://dx.doi.org/10.1016/S0003-2670(01)83100-7"&gt;HOSE codes&lt;/a&gt;. They are a description of the connectivity of a molecule, or of an atom in the molecule. A more detailed description can be found in these papers by Faulon &lt;i&gt;et al&lt;/i&gt;: &lt;a href="http://pubs.acs.org/doi/abs/10.1021/ci020345w"&gt;[1]&lt;/a&gt;, &lt;a href="http://pubs.acs.org/doi/abs/10.1021/ci0341823"&gt;[3]&lt;/a&gt; or in &lt;a href="http://gilleain.blogspot.com/2009/06/faulons-signatures-possible.html"&gt;this blog post&lt;/a&gt; by me.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Java implementation of this algorithm is a collaboration between &lt;a href="http://se.linkedin.com/pub/lars-carlsson/9/641/203"&gt;Lars Carlsson&lt;/a&gt; (who wrote a C++ version) and me (who ported this version to Java). However, I was also influenced by my previous attempt at a port from the C implementation by &lt;a href="http://www.epigenomique.genopole.fr/~faulon/"&gt;Faulon's group&lt;/a&gt;. There is an online service for using their program, called "sscan", &lt;a href="http://www.epigenomique.genopole.fr/~faulon/sscan.php"&gt;here&lt;/a&gt;. Their program also deals with stereochemistry.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style=" ;font-size:large;"&gt;&lt;b&gt;What is it used for?&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style=" ;font-size:large;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;So, what can be done with all this new code? 
Here are some possibilities:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;SMILES-like canonical strings that represent molecules. Note that signatures are considerably longer than SMILES, but are guaranteed to work for &lt;a href="http://gilleain.blogspot.com/2009/06/cuneane.html"&gt;cuneane&lt;/a&gt;, and indeed a broad &lt;a href="http://github.com/gilleain/signatures/blob/master/src/test/java/signature/simple/SimpleQuotientGraphTest.java"&gt;range of graphs&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;As with HOSE codes (which can describe molecule connectivity up to different 'spheres'), signatures can vary in height. Practically, this means an atom's environment can be described with different levels of detail.&lt;/li&gt;&lt;li&gt;Because the structure is canonised along the way, the core algorithm can be used to give a canonical labelling of the structure, which can be useful for atom-atom mapping of isomorphic structures.&lt;/li&gt;&lt;li&gt;Calculating signatures for all atoms of a molecule produces a &lt;a href="http://en.wikipedia.org/wiki/Partition_of_a_set"&gt;partition&lt;/a&gt; of the atoms into sets of &lt;a href="http://en.wikipedia.org/wiki/Equivalence_class"&gt;equivalent positions&lt;/a&gt;. This is useful for a variety of analyses of a molecule's graph structure.&lt;/li&gt;&lt;/ul&gt;Obviously, there are advantages and disadvantages to using signatures for these applications, compared to existing techniques. 
I am not sure, for example, of speed differences in using signatures to get a canonical representation.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style=" ;font-size:large;"&gt;&lt;b&gt;How do you use it?&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style=" ;font-size:large;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span&gt;The &lt;tt&gt;MoleculeSignature&lt;/tt&gt; class is a wrapper around an instance of an &lt;tt&gt;IMolecule&lt;/tt&gt; and provides several useful methods, many of them from the base class &lt;tt&gt;AbstractGraphSignature&lt;/tt&gt;. For example:&lt;/span&gt;&lt;/div&gt;&lt;pre&gt;IMolecule thiazole = MoleculeFactory.makeThiazole();&lt;br /&gt;MoleculeSignature moleculeSignature = new MoleculeSignature(thiazole);&lt;br /&gt;System.out.println(moleculeSignature.toCanonicalString());&lt;br /&gt;// Result = "[C](=[C]([N](=[C,0]))[S]([C,0]))"&lt;/pre&gt;&lt;br /&gt;This is the canonical signature for the whole molecule. To get this, canonical signatures are made for each atom, and the canonical one from the list is returned. 
To get all the signatures - or rather, the equivalence classes (or 'orbits') - use the &lt;tt&gt;calculateOrbits&lt;/tt&gt; method like this:&lt;pre&gt;&lt;br /&gt;MoleculeSignature moleculeSignature = new MoleculeSignature(MoleculeFactory.makeQuinone());&lt;br /&gt;for (Orbit orbit : moleculeSignature.calculateOrbits()) {&lt;br /&gt;    System.out.println(orbit);&lt;br /&gt;}&lt;/pre&gt; which gives this output (the 'makeQuinone' method makes &lt;a href="http://en.wikipedia.org/wiki/1,4-Benzoquinone"&gt;1,4-benzoquinone&lt;/a&gt;):&lt;pre&gt;&lt;br /&gt;[O](=[C]([C](=[C]([C,0](=[O])))[C](=[C]([C,0])))) [0, 7]&lt;br /&gt;[C]([C](=[C]([C,0](=[O])))[C](=[C]([C,0]))=[O]) [1, 4]&lt;br /&gt;[C](=[C]([C]([C,0]=[O]))[C]([C](=[C,0])=[O])) [2, 3, 5, 6]&lt;/pre&gt;which tells us that the two oxygen atoms ([0, 7]) are in the same orbit, as are the carbons attached to them ([1, 4]), and that the other four carbons ([2, 3, 5, 6]) are in a third orbit. I have written about more complex examples of orbits: in &lt;a href="http://gilleain.blogspot.com/2010/05/c60-double-bonding-networks.html"&gt;C60&lt;/a&gt;, in &lt;a href="http://gilleain.blogspot.com/2010/05/fullerene-symmetries.html"&gt;other fullerenes&lt;/a&gt;, and in some &lt;a href="http://gilleain.blogspot.com/2009/07/adamantane-diamantane-twistane.html"&gt;other regular graphs&lt;/a&gt;. 
In practice, most chemicals will have automorphism partitions that are (nearly) discrete.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, finally, an example of how to get the canonical labelling of a graph:&lt;/div&gt;&lt;pre&gt;MoleculeSignature moleculeSignature =&lt;br /&gt;        new MoleculeSignature(MoleculeFactory.makeCyclobutadiene());&lt;br /&gt;System.out.println(Arrays.toString(moleculeSignature.getCanonicalLabels()));&lt;/pre&gt;which gives "[0, 3, 2, 1]" - essentially this is the permutation which gives a canonical arrangement of atoms.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Non-CDK implementations?&lt;/b&gt;&lt;/div&gt;&lt;br /&gt;There are chemistry projects other than the CDK, and it should be fairly easy to make a mychemlib.MoleculeSignature by subclassing signature.AbstractGraphSignature (and similarly for AtomSignature/AbstractVertexSignature). All the concrete classes need to do is tell their superclass about the underlying molecule graph - getVertexCount, getConnected - and the MoleculeSignature has to act as a factory for the concrete AtomSignature instances via getSignatureForVertex.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The signature project &lt;a href="http://github.com/gilleain/signatures"&gt;is on github&lt;/a&gt; and has some of the maven machinery for building/testing/packaging. There are a couple of 'toy' implementations for chemicals and simple (mathematical) graphs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Any feedback, suggestions, and so on are welcome. I am also happy to help with other people's implementations in the form of code or just hints. 
Enjoy!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2028274590095792957?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2028274590095792957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2028274590095792957' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2028274590095792957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2028274590095792957'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/06/cdk-signature-implementation-now-in.html' title='CDK Signature implementation now in review'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5999473241322590206</id><published>2010-05-16T13:43:00.003+01:00</published><updated>2010-05-16T13:55:26.524+01:00</updated><title type='text'>Non-chemical example : Grinberg's graph</title><content type='html'>While looking for an algorithm to lay out fullerenes as 2D graphs, I came across &lt;a href="http://www-lp.fmf.uni-lj.si/plestenjak/Papers/schlegel.pdf"&gt;this paper&lt;/a&gt; (PDF). It describes an annealing/spring layout method for &lt;a href="http://en.wikipedia.org/wiki/Schlegel_diagram"&gt;Schlegel diagram&lt;/a&gt;s. 
I don't think I can spare the time to implement it at the moment, but one of the graphs in the paper is this:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-_q3Uon1zI/AAAAAAAAAVY/rI5sI9C8adg/s1600/grinberg_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 306px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-_q3Uon1zI/AAAAAAAAAVY/rI5sI9C8adg/s320/grinberg_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5471850308431238962" /&gt;&lt;/a&gt;known as &lt;a href="http://mathworld.wolfram.com/GrinbergGraphs.html"&gt;the Grinberg graph&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5999473241322590206?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5999473241322590206/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5999473241322590206' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5999473241322590206'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5999473241322590206'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/non-chemical-example-grinbergs-graph.html' title='Non-chemical example : Grinberg&apos;s graph'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/S-_q3Uon1zI/AAAAAAAAAVY/rI5sI9C8adg/s72-c/grinberg_sigs.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7675199938603205531</id><published>2010-05-14T16:56:00.005+01:00</published><updated>2010-05-14T17:09:36.570+01:00</updated><title type='text'>C60 double bonding networks</title><content type='html'>C60 (or buckminsterfullerene) has  no hydrogens, so it must have quite a few double bonds. I am beginning to understand that bond orders are in some sense a simplification, and I suppose that the bonding is in some way delocalised across the sphere. However, here is a picture of two different bonding patterns:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/S-1zngLLpJI/AAAAAAAAAVA/u75kjHJ62GQ/s1600/c60_double_bonds.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 176px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/S-1zngLLpJI/AAAAAAAAAVA/u75kjHJ62GQ/s320/c60_double_bonds.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5471156244813948050" /&gt;&lt;/a&gt;&lt;br /&gt;(click for bigger, as usual). 
The 'ChEBI' bonding pattern on the left is from the molfile in &lt;a href="http://www.ebi.ac.uk/chebi/searchId.do?chebiId=33128"&gt;a ChEBI entry&lt;/a&gt; while the 'radial' bonding one on the right is bonded according to schemes &lt;a href="http://modularity.tripod.com/ful.htm"&gt;from this site&lt;/a&gt; which has an interesting graph-theory perspective on fullerenes.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The radial bonding version has a simpler, layered structure like this:&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-11FTXKPoI/AAAAAAAAAVI/lCXxq3dCksQ/s1600/radial_bonded_c60_slices.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 275px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-11FTXKPoI/AAAAAAAAAVI/lCXxq3dCksQ/s320/radial_bonded_c60_slices.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5471157856282230402" /&gt;&lt;/a&gt;&lt;br /&gt;Ok, so that's a slightly comical picture of the slices. What is also nice is the quotient graph for the ChEBI structure:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-11YKuD3KI/AAAAAAAAAVQ/XtwesgL1mTY/s1600/bucky_quotient.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 273px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-11YKuD3KI/AAAAAAAAAVQ/XtwesgL1mTY/s320/bucky_quotient.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5471158180379876514" /&gt;&lt;/a&gt;&lt;br /&gt;I didn't color this, but what is great is that it looks like a subgraph of a fullerene! 
Apart from the loops along the top and bottom, of course.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7675199938603205531?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7675199938603205531/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7675199938603205531' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7675199938603205531'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7675199938603205531'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/c60-double-bonding-networks.html' title='C60 double bonding networks'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/S-1zngLLpJI/AAAAAAAAAVA/u75kjHJ62GQ/s72-c/c60_double_bonds.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-9035995523160417902</id><published>2010-05-13T22:08:00.005+01:00</published><updated>2010-05-13T22:29:50.920+01:00</updated><title type='text'>Fullerene symmetries</title><content type='html'>Continuing the theme of colored graphs, some of the more interesting examples are fused ring structures, especially those with some symmetries, but not completely symmetrical. 
Fullerenes fit this description, such as this 26-atom example:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-xrHkl-LCI/AAAAAAAAAUo/DW-WWV7apEM/s1600/fullerene26.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 231px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-xrHkl-LCI/AAAAAAAAAUo/DW-WWV7apEM/s320/fullerene26.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470865425174506530" /&gt;&lt;/a&gt;&lt;br /&gt;The distribution of colors might look a little odd, but the dark blue atom surrounded by three cyan atoms is actually repeated at the top - which is really the other side of the sphere. These kinds of 3D molecules don't lay out very well with the CDK's layout code, so I used JChemPaint instead to make a more symmetric 28-fullerene:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/S-xsVnrTwgI/AAAAAAAAAUw/wr1WQyF6ipM/s1600/signature_viewer_grab.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 165px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/S-xsVnrTwgI/AAAAAAAAAUw/wr1WQyF6ipM/s320/signature_viewer_grab.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470866766031995394" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;This is actually a screengrab of a crude viewer I put together (&lt;a href="http://github.com/gilleain/cdk_signature/commit/e09ff2ec4a9470ad9f58259f269e5c79f07f7dbe"&gt;commit&lt;/a&gt;) that takes a molfile and calculates the signatures. Selecting a signature from the right-hand list highlights it on the graph. 
Anyway, it's easier than making images like this by hand:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-xt8L7LvNI/AAAAAAAAAU4/THUbJWXF8dM/s1600/dodecahedrane_alternate_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 234px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-xt8L7LvNI/AAAAAAAAAU4/THUbJWXF8dM/s320/dodecahedrane_alternate_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470868528108911826" /&gt;&lt;/a&gt;Except that the little graph at the bottom (which is a &lt;a href="http://github.com/gilleain/cdk/blob/fe8cd6cb5b760f1478736147df0c9252141ae0b0/src/main/org/openscience/cdk/signature/SignatureQuotientGraph.java"&gt;quotient graph&lt;/a&gt;) can't be drawn by the renderer as it has loop-edges.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-9035995523160417902?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/9035995523160417902/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=9035995523160417902' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/9035995523160417902'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/9035995523160417902'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/fullerene-symmetries.html' title='Fullerene 
symmetries'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/S-xrHkl-LCI/AAAAAAAAAUo/DW-WWV7apEM/s72-c/fullerene26.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2332170033653364565</id><published>2010-05-13T17:04:00.004+01:00</published><updated>2010-05-13T17:19:04.537+01:00</updated><title type='text'>Chemicals as colored graphs</title><content type='html'>The interface between maths and chemistry can be tricky when it comes to terminology - sets (maths) have elements, chemistry has a different kind of element; graphs have colors which are usually just numbers, diagrams of chemicals have colors which usually relate to the element type of the atom, and so on.&lt;br /&gt;&lt;br /&gt;So, for maximum confusion, here are two pictures of graphs (that could represent chemical connectivity) colored by equivalence class (determined by signature). The signature trees are also drawn with graphical colors, but these represent the integer colors in the signature, which are not the same as the colors used to indicate equivalence class. 
Firstly, a structure that the SMILES algorithm is meant to have trouble with (although the molecule itself may not exist):&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-wlj88gP3I/AAAAAAAAAUY/0Vwc-W6p-Ec/s1600/dispirocyclooctane.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 282px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-wlj88gP3I/AAAAAAAAAUY/0Vwc-W6p-Ec/s320/dispirocyclooctane.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470788946933858162" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It looks quite strained, so I expect that it may not be possible to synthesise. Another multi-ring system is this one:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/S-wl-Uos_iI/AAAAAAAAAUg/XMO8pMLdzOM/s1600/bowtieane_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 215px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/S-wl-Uos_iI/AAAAAAAAAUg/XMO8pMLdzOM/s320/bowtieane_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470789399969857058" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I don't even know what this one would be called, even if it did exist. Annoyingly, this structure triggers a bug if the two dark blue atoms are connected. 
This makes the graph 3-regular, but the yellow equivalence class is split, which shouldn't happen.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2332170033653364565?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2332170033653364565/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2332170033653364565' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2332170033653364565'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2332170033653364565'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/chemicals-as-colored-graphs.html' title='Chemicals as colored graphs'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/S-wlj88gP3I/AAAAAAAAAUY/0Vwc-W6p-Ec/s72-c/dispirocyclooctane.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-599162294873101102</id><published>2010-05-12T14:45:00.004+01:00</published><updated>2010-05-12T14:53:23.319+01:00</updated><title type='text'>Orienting a Pyrene diagram</title><content type='html'>Another example from Symmetry in Chemistry, of drawing pyrene:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://1.bp.blogspot.com/_eL4nhrOF5R4/S-qxVdVzf0I/AAAAAAAAAUI/TLWvKDdIP2E/s1600/pyrene.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 201px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/S-qxVdVzf0I/AAAAAAAAAUI/TLWvKDdIP2E/s320/pyrene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470379679606341442" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The standard (IUPAC?) orientation on the left doesn't show the symmetries of the molecule as well as the rotated version on the right. The letters indicate equivalent atom positions - oh, and aromatic indicators are missing :)&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It seems like there should be a way to discover the symmetry axes from the graph alone, without coordinates, in a similar way to how some ring perception algorithms work by reducing the graph:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-qyJfot0iI/AAAAAAAAAUQ/BqWi_YUzE4k/s1600/pyrene_fragments.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 202px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-qyJfot0iI/AAAAAAAAAUQ/BqWi_YUzE4k/s320/pyrene_fragments.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470380573575729698" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;However, it is not obvious that this would be better than laying out the structure, then using the 2D coordinates to determine symmetries. Also not clear to me is how to choose which vertices to merge, and which to duplicate. 
On the left hand side of the image above, the fragments have duplicate (a), (d), and (e) vertices, but (e) has also been merged before being duplicated.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-599162294873101102?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/599162294873101102/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=599162294873101102' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/599162294873101102'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/599162294873101102'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/orienting-pyrene-diagram.html' title='Orienting a Pyrene diagram'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/S-qxVdVzf0I/AAAAAAAAAUI/TLWvKDdIP2E/s72-c/pyrene.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-649611819393661221</id><published>2010-05-12T11:25:00.005+01:00</published><updated>2010-05-12T12:01:23.486+01:00</updated><title type='text'>1,2-dichlorocyclopropane and a spiran</title><content type='html'>As I am reading a book called "Symmetry in Chemistry" (H. H. Jaffé and M. Orchin) I thought I would try out a couple of examples that they use. 
One is &lt;a href="http://www.chemspider.com/Chemical-Structure.10644328.html"&gt;1,2-dichlorocylopropane&lt;/a&gt; :&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-qG6c4QvtI/AAAAAAAAATw/szqrOhBixgo/s1600/dichlorocyclopropane.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 188px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-qG6c4QvtI/AAAAAAAAATw/szqrOhBixgo/s320/dichlorocyclopropane.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470333036137594578" /&gt;&lt;/a&gt;which is, apparently, &lt;i&gt;dissymmetric &lt;/i&gt;because it has a symmetry element (a C2 axis) but is optically active. Incidentally, wedges can look horrible in small structures - this is why:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-qHqz53iNI/AAAAAAAAAT4/8jYfE3AacFA/s1600/wedge_text_overlap.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 217px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/S-qHqz53iNI/AAAAAAAAAT4/8jYfE3AacFA/s320/wedge_text_overlap.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470333866952067282" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The box around the hydrogen is shaded in grey, to show the effect of overlap. A possible fix might be to shorten the wedge, but sadly this would require working out the bounds of the text when calculating the wedge, which has to be done at render time. 
Oh well.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another interesting example is this 'spiran', which I can't find on ChEBI or ChemSpider:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-qJcqEn27I/AAAAAAAAAUA/U74jgFVXs-k/s1600/spiran.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 196px; height: 280px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-qJcqEn27I/AAAAAAAAAUA/U74jgFVXs-k/s320/spiran.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470335822817909682" /&gt;&lt;/a&gt;Image again courtesy of &lt;a href="http://sourceforge.net/apps/mediawiki/cdk/index.php?title=JChemPaint"&gt;JChempaint&lt;/a&gt;. I guess the problem marker (the red line) on the N suggests that it is not a real compound? In any case, &lt;a href="http://github.com/gilleain/signatures/commit/14196cb2c2315c425c8acd2dabe8997cc82aa03c"&gt;some simple code&lt;/a&gt; to determine potential chiral centres (using signatures) finds 2 in the cyclopropane structure, and 4 in the spiran. 
Since the code is not using a 3D structure, only a connected graph, it can't work out the spiran's S4 axis.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-649611819393661221?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/649611819393661221/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=649611819393661221' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/649611819393661221'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/649611819393661221'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/12-dichlorocyclopropane-and-spiran.html' title='1,2-dichlorocyclopropane and a spiran'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/S-qG6c4QvtI/AAAAAAAAATw/szqrOhBixgo/s72-c/dichlorocyclopropane.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2562243276116026621</id><published>2010-05-11T21:29:00.004+01:00</published><updated>2010-05-11T23:47:23.758+01:00</updated><title type='text'>Stuck : Detailed Description</title><content type='html'>&lt;span class="Apple-style-span" style="  white-space: pre-wrap; "&gt;&lt;span 
class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;Ok, so this is the detailed version of the previous post.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style=" white-space: pre-wrap;font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="  white-space: pre-wrap; "&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;To recap: the structure generation code is still missing a vital piece - the canonical checking. I have been implementing Jean-Loup Faulon's algorithm for generation, but there is no precise algorithm given for canonical checking. Here is the relevant paragraph from the enumeration paper:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style=" white-space: pre-wrap;font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="  white-space: pre-wrap; "&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;blockquote style="text-align: justify;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;"Checking for canonicity is a common procedure of orderly enumeration algorithms, the procedure guarantees that the graphs generated are nonisomorphic. ...To verify that a graph is canonical, one labels the vertices of the graph in all possible ways. The graph is canonical if the initial labeling leads to a list of edges that is lexicographically smaller than the lists obtained with all other labelings. 
In the present paper, we have implemented two algorithms to verify canonicity, Tarjan tree canonization algorithm if the tested graph is acyclic and McKay’s Nauty technique otherwise"&lt;/span&gt;&lt;/blockquote&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="  white-space: pre-wrap; "&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;Okay, so I don't understand this for a couple of reasons: firstly, 'labelling all possible ways' sounds like an n-factorial (n!) operation; secondly, &lt;a href="http://cs.anu.edu.au/~bdm/nauty/"&gt;nauty&lt;/a&gt; does not lexicographically compare edge strings (as far as I know). Sadly, I know that I have misunderstood something, since my code doesn't work, and theirs does. :(&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;The technique used by nauty is also used by &lt;a href="http://www.tcs.hut.fi/Software/bliss/index.html"&gt;bliss&lt;/a&gt; and &lt;a href="http://www.tcs.hut.fi/Software/bliss/index.html"&gt;saucy&lt;/a&gt; and is known as iterative refinement of partitions. A 'partition' is a division of a set into subsets called 'cells', and refinement of a partition roughly means making a finer partition that has cells at least as small, if not smaller than the original. The end points of the refinement process are 'discrete' partitions, which are the same as permutations since each cell has only one member. 
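As a rough illustration of a single refinement step (a sketch only, not the code in the linked repository and not what nauty actually does): the hypothetical function refine below takes a partition as a list of cells and an adjacency map from each vertex to its set of neighbours, and keeps splitting cells until every vertex in a cell has the same number of neighbours in each cell. The ordering of newly split cells (fewest neighbours first) is an arbitrary choice made for this example.

```python
def refine(partition, adjacency):
    """Split cells until all vertices in a cell have equal neighbour counts in every cell."""
    changed = True
    while changed:
        changed = False
        for splitter in list(partition):
            new_partition = []
            for cell in partition:
                # Group the vertices of this cell by their number of
                # neighbours inside the splitter cell.
                groups = {}
                for v in cell:
                    count = len(adjacency[v].intersection(splitter))
                    groups.setdefault(count, []).append(v)
                pieces = [groups[c] for c in sorted(groups)]
                if len(pieces) != 1:
                    changed = True
                new_partition.extend(pieces)
            partition = new_partition
            if changed:
                break  # start again with the finer partition
    return partition


# A 5-cycle: vertex i is joined to i - 1 and i + 1 (mod 5).
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(refine([[0, 1, 2, 3, 4]], cycle))    # [[0, 1, 2, 3, 4]] - nothing splits
print(refine([[0], [1, 2, 3, 4]], cycle))  # [[0], [2, 3], [1, 4]]
```

On its own, refinement often stops short of a discrete partition, as with the 5-cycle here; the search then picks a vertex from a cell with more than one member, puts it in a cell by itself, and refines again, branching over each choice of vertex. 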
This is sketched in the image below:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-nLPnKMsPI/AAAAAAAAATg/wokmtjm18BI/s1600/iterative_refinement_tree.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 223px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-nLPnKMsPI/AAAAAAAAATg/wokmtjm18BI/s320/iterative_refinement_tree.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470126691488411890" /&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;A simple implementation of a partition refiner &lt;a href="http://github.com/gilleain/cdk_signature/blob/master/src/org/openscience/cdk/group/AbstractDiscretePartitionRefiner.java"&gt;is in my repository&lt;/a&gt;. It is a slightly modified version of the algorithm from a book I've mentioned before (CAGES). It tries to deal with vertex and edge colours, although I don't think that it does so all that well. 
However, for simple graphs, it does indeed produce the automorphism group, as well as check for a canonical graph.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;For example, the simple 5-cycle graph in the image might be canonical (or 'canonically labelled') if the refinement process produces the identity permutation as the first discrete partition. This will depend on the particular cell selection algorithm, and the choice made of how to arrange newly split cells. &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;The modifications I made to the CAGES refiner were done to align the canonical checking with the graph enumeration process. This is important, as edges are added to a graph in a characteristic order, and this has to produce graphs that will be accepted as canonical. 
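For very small graphs, the brute-force check in the quoted paragraph can be written down directly. This is only a sketch under one reading of that paragraph, with hypothetical names (edge_list, is_canonical), and it is the n! version, so it is useless beyond a handful of vertices:

```python
from itertools import permutations


def edge_list(edges, labelling):
    """Relabel the edges and return them as a sorted list of sorted pairs."""
    return sorted(tuple(sorted((labelling[a], labelling[b]))) for a, b in edges)


def is_canonical(n, edges):
    """True if the given labelling has the smallest edge list of all n! labellings."""
    given = edge_list(edges, list(range(n)))
    smallest = min(edge_list(edges, p) for p in permutations(range(n)))
    return given == smallest


# A path on three vertices, labelled two different ways.
print(is_canonical(3, [(0, 1), (0, 2)]))  # True
print(is_canonical(3, [(0, 1), (1, 2)]))  # False
```

Whichever definition of 'canonical' is used, the enumeration has to add edges in an order that produces the labelling the check accepts. 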
This is illustrated in the following image:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-nQfZLJerI/AAAAAAAAATo/z350Ql_RVkk/s1600/canonical_labelling_schemes.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 220px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/S-nQfZLJerI/AAAAAAAAATo/z350Ql_RVkk/s320/canonical_labelling_schemes.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5470132460170345138" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;The cycle graphs are laid out in a standard way on the left, and in a linear view on the right. The linear view emphasises the order in which the edges might be added. In the canonical scheme I have chosen, vertices are connected to the next possible partner. 
To put it another way, the resulting graphs will have a minimal edge 'length' when laid out as a linear graph.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;What the partition refinement process is doing is the same as labelling all possible ways. The leaves of the refinement tree are permutations. Assuming the refinement process gives automorphic permutations, this is almost the same as labelling within the orbits of the atoms. It seems like partitioning the atoms by signature, then refining this partition should work, but it doesn't.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial, serif;"&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap;"&gt;Any better suggestions or clarifications are very welcome :) &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2562243276116026621?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2562243276116026621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2562243276116026621' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default/2562243276116026621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2562243276116026621'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/stuck-detailed-description.html' title='Stuck : Detailed Description'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/S-nLPnKMsPI/AAAAAAAAATg/wokmtjm18BI/s72-c/iterative_refinement_tree.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7351795330362991646</id><published>2010-05-11T19:03:00.002+01:00</published><updated>2010-05-11T19:49:14.912+01:00</updated><title type='text'>Stuck : The Summary</title><content type='html'>A couple of people have asked how the structure generation stuff is going, and the short answer is that I am stuck. This post will give a short summary of the problem, and the next will give a much more detailed description.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the problem is this : given an elemental formula (like C6H12) or a list of fragments plus a formula (like {2 * CH, 2 * CH2, 2 * CH3 : C6H12}) return all possible connected structures.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are simple ways to do this, such as connecting every atom to every other atom, and removing duplicates. The downside is that this takes forever, because this procedure will make many, many isomorphic copies of each solution. 
At the final filtering step, an all-v-all comparison would have to be done on these many copies.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A better solution is to check each structure each time a bond is made, to see if it is canonical. Although I know how to do this in theory, it turns out to be more difficult in practice. For simple graphs, I have a solution that seems to work. Chemicals are not simple graphs, however, as they have elements and bond orders.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is not code optimisation - without checking for canonical graphs, the running times for even quite simple problems are far too long. For more realistic, reasonable problems the code would be too slow to be useful.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7351795330362991646?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7351795330362991646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7351795330362991646' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7351795330362991646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7351795330362991646'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/stuck-summary.html' title='Stuck : The Summary'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7623259000695099777</id><published>2010-05-09T15:21:00.003+01:00</published><updated>2010-05-09T15:22:48.488+01:00</updated><title type='text'>Debugger</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/S-bFLnedX8I/AAAAAAAAATY/EB_Oxv6RYvY/s1600/debugger.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 162px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/S-bFLnedX8I/AAAAAAAAATY/EB_Oxv6RYvY/s320/debugger.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5469275600854015938" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7623259000695099777?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7623259000695099777/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7623259000695099777' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7623259000695099777'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7623259000695099777'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2010/05/debugger.html' title='Debugger'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/S-bFLnedX8I/AAAAAAAAATY/EB_Oxv6RYvY/s72-c/debugger.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3034860287328319478</id><published>2009-07-30T10:25:00.004+01:00</published><updated>2009-07-30T10:40:03.417+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><category scheme='http://www.blogger.com/atom/ns#' term='signatures'/><title type='text'>Compatibility Table</title><content type='html'>With the canonical code in place (thanks to open source :) the structure generation goal is much nearer. The first thing to improve is the &lt;a href="http://gilleain.blogspot.com/2009/06/signature-bond-compatibility.html"&gt;bond compatibility&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SnFouNxTidI/AAAAAAAAATE/yFZn49WLRTk/s1600-h/cuneane_bond_compatibility.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 168px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SnFouNxTidI/AAAAAAAAATE/yFZn49WLRTk/s320/cuneane_bond_compatibility.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5364183774354704850" /&gt;&lt;/a&gt;&lt;br /&gt;This image shows cuneane (&lt;a href="http://gilleain.blogspot.com/2009/06/cuneane.html"&gt;again&lt;/a&gt;!) and the bond compatibility table. 
The table is tricky to calculate, but relatively easy to understand; there is only one bond between atoms of type A - so there is a one in the cell (A, A).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Conversely, no atom of type C is connected to an atom of type A (&lt;a href="http://gilleain.blogspot.com/2009/07/warning-abstraction.html"&gt;see this&lt;/a&gt; for more detail), so there is a 0 in both (C, A) and (A, C). Note that the table is not symmetric, as can be seen with (A, B) = 1 and (B, A) = 2. This makes sense, in that an atom of type A is connected to two of type B, and yet an atom of type B is only connected to one of type A.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3034860287328319478?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3034860287328319478/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3034860287328319478' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3034860287328319478'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3034860287328319478'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/compatibility-table.html' title='Compatibility Table'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SnFouNxTidI/AAAAAAAAATE/yFZn49WLRTk/s72-c/cuneane_bond_compatibility.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2204810335889113765</id><published>2009-07-22T13:55:00.002+01:00</published><updated>2009-07-22T13:59:51.680+01:00</updated><title type='text'>Crude port of Faulon's C-implementation</title><content type='html'>I finally did the sensible thing and just ported over the C code. I don't particularly like doing this, as the resulting code is probably quite fragile, and unreadable - Java spoken with a C 'accent' usually is.&lt;br /&gt;&lt;br /&gt;However, it does reproduce the same signatures as the C code, so that's good. It only took 1 day to port, but 2 days to debug...&lt;br /&gt;&lt;br /&gt;Check it out here : &lt;a href="http://github.com/gilleain/generation/tree/master"&gt;http://github.com/gilleain/generation/tree/master &lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2204810335889113765?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2204810335889113765/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2204810335889113765' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2204810335889113765'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2204810335889113765'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/crude-port-of-faulons-c-implementation.html' title='Crude port of Faulon&apos;s 
C-implementation'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4967865842171331593</id><published>2009-07-09T22:21:00.004+01:00</published><updated>2009-07-09T23:04:59.468+01:00</updated><title type='text'>Nicer tree pictures</title><content type='html'>Here are some pictures of proper trees, or rather proper signatures, to celebrate getting the translator program to compile (it involved replacing some 'local' keywords with 'define's - who knew?).&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SlZgqGhGjBI/AAAAAAAAASs/qYZf2H295JI/s1600-h/tree_grab_2.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 306px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SlZgqGhGjBI/AAAAAAAAASs/qYZf2H295JI/s320/tree_grab_2.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5356575083224009746" /&gt;&lt;/a&gt; &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;Also, thought I would post the code for tree layout I'm using:&lt;pre class="prettyprint"&gt;&lt;br /&gt;    public int layout(Node node) {&lt;br /&gt;        node.y  = node.depth * ySep;&lt;br /&gt;        if (node.isLeaf()) {&lt;br /&gt;            leafCount += 1;&lt;br /&gt;            node.x = leafCount * xSep;&lt;br /&gt;            return node.x;&lt;br /&gt;        } else {&lt;br /&gt;            int min = 0;&lt;br /&gt;            int max = 0;&lt;br /&gt;            for (Node child : node.children) {&lt;br /&gt;                int childCenter = layout(child);&lt;br /&gt;                if (min == 0) {&lt;br /&gt;                    min = childCenter;&lt;br 
/&gt;                }&lt;br /&gt;                max = childCenter;&lt;br /&gt;            }&lt;br /&gt;            if (min == max) {&lt;br /&gt;                node.x = min;&lt;br /&gt;            } else {&lt;br /&gt;                node.x = min + (max - min) / 2;&lt;br /&gt;            }&lt;br /&gt;            return node.x;&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;Basically, it lays out the leaves first, then returns their centers. Each non-leaf node uses the min/max values of its children to position itself. If the min and max are the same, it only has one child, so it is laid out directly above that child. Otherwise, it is put at the center of the range.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Simple, probably nothing new, but does the job.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;edit: Heh. Although, with large trees...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SlZpccQAreI/AAAAAAAAAS0/SouT2D6jQ-M/s1600-h/large_tree.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 177px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SlZpccQAreI/AAAAAAAAAS0/SouT2D6jQ-M/s320/large_tree.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5356584744144383458" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4967865842171331593?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4967865842171331593/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4967865842171331593' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default/4967865842171331593'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4967865842171331593'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/nicer-tree-pictures.html' title='Nicer tree pictures'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SlZgqGhGjBI/AAAAAAAAASs/qYZf2H295JI/s72-c/tree_grab_2.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6110535736267835488</id><published>2009-07-09T12:27:00.009+01:00</published><updated>2009-07-09T18:52:58.973+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='signatures'/><title type='text'>Tree Canonization Simplified</title><content type='html'>While debugging the methods to make canonical signatures, I learned something about tree isomorphism from various sources, including &lt;a href="http://www.lsi.upc.edu/~valiente/"&gt;Prof. Valiente&lt;/a&gt;'s excellent looking book &lt;a href="http://books.google.com/books?id=NSfIWxqPlbcC"&gt;on trees&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;One way of checking isomorphism is canonisation, since two trees are only isomorphic if they have the same canonical form. For simple labelled trees, it looks like there is an almost trivial way to get a canonical string representation. 
Say we have two trees:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SlXWFYBucGI/AAAAAAAAASc/muTB2afVfQw/s1600-h/tree_grab.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 122px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SlXWFYBucGI/AAAAAAAAASc/muTB2afVfQw/s320/tree_grab.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5356422719664451682" /&gt;&lt;/a&gt;&lt;br /&gt;They are rooted, labelled trees. So the conversion to a canonised string proceeds as follows: for each node, lexicographically sort the string forms of its children, and return the node's label followed by the concatenated result. In Python this looks like:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SlXWtaGg9JI/AAAAAAAAASk/YWhv4rz4ZQw/s1600-h/code_grab.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 50px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SlXWtaGg9JI/AAAAAAAAASk/YWhv4rz4ZQw/s320/code_grab.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5356423407416177810" /&gt;&lt;/a&gt;&lt;br /&gt;which...is unreadable. hmmm. Wish there was a better way to get marked-up code into blog posts. Perhaps there is one, and I don't know of it. 
Anyway, the point is that it is very short.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;edit : the code is...&lt;/div&gt;&lt;br /&gt;&lt;pre class="prettyprint"&gt;&lt;br /&gt;def printSorted(node):&lt;br /&gt; if len(node.children) &gt; 0:&lt;br /&gt;   childStrings = [printSorted(child) for child in node.children]&lt;br /&gt;   return node.label + "("+ "".join(sorted(childStrings)) + ")"&lt;br /&gt; else:&lt;br /&gt;   return node.label&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6110535736267835488?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6110535736267835488/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6110535736267835488' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6110535736267835488'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6110535736267835488'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/tree-canonization-simplified.html' title='Tree Canonization Simplified'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SlXWFYBucGI/AAAAAAAAASc/muTB2afVfQw/s72-c/tree_grab.png' height='72' 
width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5666155231118127875</id><published>2009-07-06T21:01:00.006+01:00</published><updated>2009-07-07T01:18:36.503+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='signatures'/><title type='text'>Square Grids, Cylinders, Spheres, and Toruses</title><content type='html'>This is straying from the point; but if any graph can be described by canonized trees made from its subgraphs, then what are the properties of very large (regular) graphs? A grid, for example?&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Starting with a square, and fusing squares together results in this situation:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SlJ7l3LtQTI/AAAAAAAAAR0/gby73FW--L4/s1600-h/fused_rings.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 238px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SlJ7l3LtQTI/AAAAAAAAAR0/gby73FW--L4/s320/fused_rings.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5355478797295370546" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;two and three fused squares are similar, at the height-two tree level. With four rings, a new type &lt;b&gt;c &lt;/b&gt;appears (&lt;b&gt;b&lt;/b&gt; becomes &lt;b&gt;b' &lt;/b&gt;with three rings). 
Beyond this point, any number of rings fused in a row like this has the same structure.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Moving to a second dimension of growth, a square grid (G&lt;sub&gt;nn&lt;/sub&gt;) has the following structure:&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SlKFmPc88RI/AAAAAAAAASE/mhsGh1W0KPA/s1600-h/grid_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 230px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SlKFmPc88RI/AAAAAAAAASE/mhsGh1W0KPA/s320/grid_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5355489798926430482" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;On the left are 'snapshots' of the advancing wave of the tree. Looks a lot like a breadth-first search, I suppose. On the right are the trees for each snapshot, with increasing heights. Although this is not shown, 8 of the 20 leaves of the third tree are duplicates. This should be clear from the third snapshot, which has only 12 (filled) circles on the 'wavefront'.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These trees can even be extended to grids wrapped around three-dimensional objects. Or, to put it another way, grids wrapped up as surfaces:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SlKToarVhgI/AAAAAAAAASM/RIbUMwWG99w/s1600-h/surfaces.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 222px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SlKToarVhgI/AAAAAAAAASM/RIbUMwWG99w/s320/surfaces.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5355505229462078978" /&gt;&lt;/a&gt;The diamond shapes with lines radiating out are the expanding trees. 
To the left of each surface is a very rough sketch of what the spanning tree would be like. The dashed lines indicate cutpoints where the wavefronts meet - these are duplicated on the trees. This is similar to the idea of 'gluing' surfaces in topology.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5666155231118127875?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5666155231118127875/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5666155231118127875' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5666155231118127875'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5666155231118127875'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/square-grids-cylinders-spheres-and.html' title='Square Grids, Cylinders, Spheres, and Toruses'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SlJ7l3LtQTI/AAAAAAAAAR0/gby73FW--L4/s72-c/fused_rings.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7292402582987889646</id><published>2009-07-02T00:11:00.003+01:00</published><updated>2009-07-02T00:20:11.790+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='signatures'/><title 
type='text'>Warning : Abstraction!</title><content type='html'>This is a throwaway mathematical point, that I am not qualified to make, but it looks like three of the previous examples (diamantane, twistane, and cuneane) have a very abstract connection when colored by signature:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkvtiVjblQI/AAAAAAAAARs/vqozicwShqE/s1600-h/class_connections.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 249px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkvtiVjblQI/AAAAAAAAARs/vqozicwShqE/s320/class_connections.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5353633756217054466" /&gt;&lt;/a&gt;what I mean by this diagram is that diamantane has atoms colored by (a) connected to both other (a) atoms, and to (b) atoms. Its (c) atoms are only connected to (b)s; the arrows could well be double-headed, by the way.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The most complex situation is cuneane, where each 'type' of atom is connected to another in its type and to two in another type. 
Adamantane would just look like : (a)-(b).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Interesting, but it doesn't get the signature canonization methods debugged any faster...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7292402582987889646?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7292402582987889646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7292402582987889646' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7292402582987889646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7292402582987889646'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/warning-abstraction.html' title='Warning : Abstraction!'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SkvtiVjblQI/AAAAAAAAARs/vqozicwShqE/s72-c/class_connections.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1108234589018858944</id><published>2009-07-01T19:36:00.003+01:00</published><updated>2009-07-01T22:40:23.799+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='signatures'/><title type='text'>Adamantane, Diamantane, Twistane</title><content type='html'>After cubane, the thought occurred to 
look at other regular hydrocarbons. If only there was some sort of classification of chemicals that I could use to look up similar structures. &lt;a href="http://www.ebi.ac.uk/chebi/"&gt;Oh wate, there is&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;div&gt;Anyway, &lt;a href="http://www.ebi.ac.uk/chebi/advancedSearchFT.do?searchString=adamantane"&gt;adamantane&lt;/a&gt; is not as regular as cubane, but it is highly symmetrical, looking like three cyclohexanes fused together. The vertices fall into two different types when colored by signature: &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SkvIewpM0NI/AAAAAAAAAQ0/53t_ZUnOQ-Y/s1600-h/adamantane_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 228px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SkvIewpM0NI/AAAAAAAAAQ0/53t_ZUnOQ-Y/s320/adamantane_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5353593012839305426" /&gt;&lt;/a&gt;The carbons with three carbon neighbours (degree-3, in the simple graph) have signature (a) and the degree-2 carbons have signature (b). Atoms of one type are only connected to atoms of the other - the graph is &lt;a href="http://en.wikipedia.org/wiki/Bipartite_graph"&gt;bipartite&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Adamantane connects together to form &lt;a href="http://en.wikipedia.org/wiki/Diamondoid"&gt;diamondoids&lt;/a&gt; (or, rather, this class has adamantane as a repeating subunit). 
One such is &lt;a href="http://www.ebi.ac.uk/chebi/advancedSearchFT.do?searchString=diamantane"&gt;diamantane&lt;/a&gt;, which is no longer bipartite when colored by signature:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkvXpvzyxEI/AAAAAAAAARc/x2jRhhft9mo/s1600-h/diamantane_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 229px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkvXpvzyxEI/AAAAAAAAARc/x2jRhhft9mo/s320/diamantane_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5353609694268277826" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;It has three classes of vertex in the simple graph (a, b, and c), as the set with degree-3 has been split in two. The tree for signature (c) is not shown. The graph is still bipartite according to the degree of each vertex.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A different interesting case is a structure with the excellent name of &lt;a href="http://www.ebi.ac.uk/chebi/advancedSearchFT.do?searchString=twistane"&gt;twistane&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SkvIv6z3EJI/AAAAAAAAARE/oRfx9y69f04/s1600-h/twistane_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 218px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SkvIv6z3EJI/AAAAAAAAARE/oRfx9y69f04/s320/twistane_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5353593307626147986" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;This structure is also 3-colored by signatures (if that's the right terminology...) but is not bipartite with respect to degree. Or, to put it more simply, there are carbons with two carbon neighbours connected together. 
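&lt;br /&gt;&lt;br /&gt;The 'bipartite with respect to degree' remark can be checked mechanically. Here is a small Python sketch, assuming carbon skeletons read off the von Baeyer names tricyclo[3.3.1.1&lt;sup&gt;3,7&lt;/sup&gt;]decane (adamantane) and tricyclo[4.4.0.0&lt;sup&gt;3,8&lt;/sup&gt;]decane (twistane). The atom numbering and the helper names (ring, degrees, same_degree_bonds) are made up for this illustration and need not match the figures:&lt;pre class="prettyprint"&gt;
```python
def ring(n):
    """Bonds of a simple n-membered ring numbered 1..n."""
    return [(i, i % n + 1) for i in range(1, n + 1)]


# Assumed skeletons (hydrogens ignored), numbered as in the von Baeyer names.
ADAMANTANE = ring(8) + [(1, 9), (9, 5), (3, 10), (10, 7)]
TWISTANE = ring(10) + [(1, 6), (3, 8)]


def degrees(bonds):
    """Count the carbon neighbours of every atom."""
    count = {}
    for a, b in bonds:
        count[a] = count.get(a, 0) + 1
        count[b] = count.get(b, 0) + 1
    return count


def same_degree_bonds(bonds):
    """Return the bonds whose two ends have the same degree."""
    count = degrees(bonds)
    return [(a, b) for a, b in bonds if count[a] == count[b]]


for name, bonds in [("adamantane", ADAMANTANE), ("twistane", TWISTANE)]:
    print(name, same_degree_bonds(bonds))
```
&lt;/pre&gt;With these edge lists adamantane gives an empty list, while twistane gives four same-degree bonds: 4-5 and 9-10 join degree-2 carbons, and 1-6 and 3-8 join degree-3 carbons.&lt;br /&gt;&lt;br /&gt;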
It seems quite clear from looking at the colored structure on the right that atoms with the same color are similar in the context of the structure.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1108234589018858944?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1108234589018858944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1108234589018858944' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1108234589018858944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1108234589018858944'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/07/adamantane-diamantane-twistane.html' title='Adamantane, Diamantane, Twistane'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SkvIewpM0NI/AAAAAAAAAQ0/53t_ZUnOQ-Y/s72-c/adamantane_sigs.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7006258835435752540</id><published>2009-06-28T15:41:00.005+01:00</published><updated>2009-07-01T23:54:18.748+01:00</updated><title type='text'>Cuneane</title><content type='html'>While taking a look at the &lt;a href="http://www.iupac.org/inchi/release102.html"&gt;InChI&lt;/a&gt; discussion mail archives, I came&lt;a 
href="https://sourceforge.net/mailarchive/forum.php?thread_name=4C925D0C-5800-4B8B-B4E2-B4656EA4D984%40dalkescientific.com&amp;amp;forum_name=inchi-discuss"&gt; across a discussion &lt;/a&gt;on graphs that are difficult to canonize (so-called 'isospectral' graphs - I've heard the term before, but I haven't worked with eigenvalues of adjacency matrices, so did not pay them much attention).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, one such was this structure, &lt;a href="http://en.wikipedia.org/wiki/Cuneane"&gt;cuneane&lt;/a&gt;:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/Skvooziw2CI/AAAAAAAAARk/97Qd9tUbGKc/s1600-h/cuneane_sigs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 224px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/Skvooziw2CI/AAAAAAAAARk/97Qd9tUbGKc/s320/cuneane_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5353628369788393506" /&gt;&lt;/a&gt;&lt;br /&gt;which shows 3D and 2D representations of the molecule (reproduced from the wiki page), and an arbitrary numbering in the center. The three signatures A, B, and C are shown labelled by this numbering.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's interesting about this particular example is the number of times that the same atom is represented in the signatures. For the signature called 'B', which is rooted at atom 2, the atom numbered 5 appears four times in the lowest layer of the tree. This naturally follows from the large number of rings of different sizes that make up cuneane - two five-membered rings, two four-membered rings, and two three-membered rings. 
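&lt;br /&gt;&lt;br /&gt;The repetition is easy to reproduce with a rough sketch in Python. This is only an illustration under assumptions of my own: the bond list is read off the von Baeyer name pentacyclo[3.3.0.0&lt;sup&gt;2,4&lt;/sup&gt;.0&lt;sup&gt;3,7&lt;/sup&gt;.0&lt;sup&gt;6,8&lt;/sup&gt;]octane, so its numbering need not match the one in the figure, and the made-up helper signature_layers uses the simplified rule 'never step straight back to the parent' rather than the full signature construction:&lt;pre class="prettyprint"&gt;
```python
from collections import Counter

# Hypothetical numbering: carbon-carbon bonds of cuneane, read off the
# von Baeyer name (ring 1..8, bond 1-5, extra bonds 2-4, 3-7, 6-8).
CUNEANE_BONDS = [
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 1),
    (1, 5), (2, 4), (3, 7), (6, 8),
]


def neighbours(bonds):
    """Build an adjacency table {atom: sorted list of bonded atoms}."""
    table = {}
    for a, b in bonds:
        table.setdefault(a, []).append(b)
        table.setdefault(b, []).append(a)
    return {atom: sorted(others) for atom, others in table.items()}


def signature_layers(bonds, root, height):
    """List the atoms in each layer of a tree grown from root.

    Simplifying assumption: the children of a node are all of its
    neighbours except its own parent.
    """
    table = neighbours(bonds)
    frontier = [(root, None)]  # (atom, parent) pairs
    layers = [[root]]
    for _ in range(height):
        frontier = [(n, atom) for atom, parent in frontier
                    for n in table[atom] if n != parent]
        layers.append([atom for atom, _ in frontier])
    return layers


layers = signature_layers(CUNEANE_BONDS, root=2, height=3)
for depth, atoms in enumerate(layers):
    # Show how often each atom occurs in this layer of the tree.
    print(depth, sorted(Counter(atoms).items()))
```
&lt;/pre&gt;Rooted at atom 2 in this numbering, the twelve nodes of the third layer cover only seven distinct atoms, and atom 6 turns up four times.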
&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7006258835435752540?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7006258835435752540/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7006258835435752540' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7006258835435752540'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7006258835435752540'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/cuneane.html' title='Cuneane'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/Skvooziw2CI/AAAAAAAAARk/97Qd9tUbGKc/s72-c/cuneane_sigs.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7417558634317929523</id><published>2009-06-26T09:42:00.007+01:00</published><updated>2009-06-26T14:04:25.767+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>Two pass rendering</title><content type='html'>So, there was a question on the cdk-devel mailing list about bounding boxes, reactions, and text. An unfortunate consequence of the new design is that the renderer will not calculate bounding boxes that can fully contain the text. 
Concretely, this is what it would look like (not made in JCP!)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkSL8oZ9b-I/AAAAAAAAAPk/Zt-3LdXtqGM/s1600-h/text_and_bounds.png"&gt;&lt;img style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; cursor: pointer; width: 320px; height: 240px; text-align: center; " src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkSL8oZ9b-I/AAAAAAAAAPk/Zt-3LdXtqGM/s320/text_and_bounds.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5351556130977705954" /&gt;&lt;/a&gt;&lt;br /&gt;The blue box is the bounds that would be created, which is minimal with respect to the atom centers. The black box is the bounds that should be created, if we respected the text size. The problem is, the size of text is not known until the point it is drawn. Or, more precisely, until we have some sort of GraphicsContext to ask about the width in a particular font.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, a two-pass system was suggested. When this was mentioned before, &lt;a href="http://pele.farmbio.uu.se/cgi-bin/bugzilla/show_bug.cgi?id=636"&gt;I was dismissive&lt;/a&gt; - perhaps even rude. Sorry about that Egon, Sam. I still think it is better avoided; in the case of transparency, I don't know why alpha values can't be used for fill colours. 
I understand there were some SWT problems..?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, here is a sketch of a possible two-pass system, that would allow some of these adjustments to be made:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SkTHQ7YHWrI/AAAAAAAAAQU/HNEJmcizNXU/s1600-h/two_pass_design.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 210px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SkTHQ7YHWrI/AAAAAAAAAQU/HNEJmcizNXU/s320/two_pass_design.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5351621350853663410" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;That's unreadably small in thumbnail - click for bigger, as usual. The basic idea would be to have one element tree with model-space values, and one with screen-space values. I've made the distinction between double and integer, but Java2D will draw with doubles, so that is not important.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7417558634317929523?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7417558634317929523/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7417558634317929523' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7417558634317929523'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7417558634317929523'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/two-pass-rendering.html' title='Two 
pass rendering'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SkSL8oZ9b-I/AAAAAAAAAPk/Zt-3LdXtqGM/s72-c/text_and_bounds.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6709435830128625152</id><published>2009-06-23T14:58:00.006+01:00</published><updated>2009-06-23T16:27:36.260+01:00</updated><title type='text'>More chemical signature example</title><content type='html'>A neat and tidy example of generating a structure from atoms and signatures occurred to me : adenine. It &lt;a href="http://www.nature.com/nature/journal/v191/n4794/abs/1911193a0.html"&gt;is known that&lt;/a&gt; this molecule forms readily from HCN under conditions similar to those thought to exist on the early Earth.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, the point is that the structure is almost exactly 6 C-N units (ignoring hydrogens and multiple bonds). 
A C-N is very similar to a height-1 signature, and the height-1 signatures for adenine are shown in the figure below:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkDwFXGW8fI/AAAAAAAAAPc/DZd-nRI0XWY/s1600-h/adenine_signature_example.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 222px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkDwFXGW8fI/AAAAAAAAAPc/DZd-nRI0XWY/s320/adenine_signature_example.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5350540332206846450" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Each signature (&lt;b&gt;a&lt;/b&gt;-&lt;b&gt;e&lt;/b&gt;) is mapped to an atom in the original structure, and then a table is shown of the maximum number of compatible bonds possible between each pair of signatures. Note that the table is not symmetric, as the compatibility operation is not commutative. Also, notice that the diagonal is not all 0 - bonds can form between atoms that both have &lt;b&gt;b&lt;/b&gt; as a target signature.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Although these target signatures are smaller than the diameter of the adenine molecular graph, they still seem to define it completely. However, they are not guaranteed to produce only adenine as a solution - there may be other molecules with the same signature of (3&lt;b&gt;a&lt;/b&gt; + 2&lt;b&gt;b&lt;/b&gt;+3&lt;b&gt;c&lt;/b&gt;+&lt;b&gt;d&lt;/b&gt;+&lt;b&gt;e&lt;/b&gt;).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Edit: Hmmm. That's not actually adenine, is it? In fact, it doesn't seem to be in &lt;a href="http://www.ebi.ac.uk/chebi/init.do"&gt;ChEBI&lt;/a&gt; either. 
I probably should check my examples before posting them to the world.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6709435830128625152?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6709435830128625152/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6709435830128625152' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6709435830128625152'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6709435830128625152'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/more-chemical-signature-example.html' title='More chemical signature example'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SkDwFXGW8fI/AAAAAAAAAPc/DZd-nRI0XWY/s72-c/adenine_signature_example.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6560317663628648963</id><published>2009-06-23T10:37:00.004+01:00</published><updated>2009-06-23T12:33:42.033+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Signature bond compatibility</title><content type='html'>So, given my previous posts on what Faulon's signatures are, here is an explanation of how they are used in the structure 
enumeration algorithm that I am &lt;a href="http://github.com/gilleain/generation/tree/master"&gt;almost finished implementing&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The core test in &lt;a href="http://pubs.acs.org/doi/abs/10.1021/ci020346o"&gt;this algorithm&lt;/a&gt; is for &lt;i&gt;compatible&lt;/i&gt; bonds. Two atoms are only joined if: a) they have compatible &lt;i&gt;target&lt;/i&gt; signatures and b) each has fewer than its target number of bonds already. A target signature here is just a signature that is set on the atom for it to match, like a pattern.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The first of these tests is illustrated here:&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkCliryt7EI/AAAAAAAAAPM/rUV0GCEWs0Q/s1600-h/bond_signature_compatible.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 234px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SkCliryt7EI/AAAAAAAAAPM/rUV0GCEWs0Q/s320/bond_signature_compatible.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5350458372605733954" /&gt;&lt;/a&gt;&lt;/div&gt;Another (overly) complex diagram! But the formula here is a bit difficult to interpret otherwise. In the top left corner is a graph &lt;b&gt;G&lt;/b&gt; (slightly resembling hexane without the hydrogens) which is, by convention, composed of vertices (&lt;b&gt;V&lt;/b&gt;) and edges (&lt;b&gt;E&lt;/b&gt;).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The equation to the right of the graph defines part of the condition for a compatible bond. The tau terms are just target signatures, as shown on the upper right. 
The tricky term is &lt;sup&gt;&lt;b&gt;h-1&lt;/b&gt;&lt;/sup&gt;&lt;b&gt;σ&lt;/b&gt;&lt;sub&gt;&lt;b&gt;τ(y)&lt;/b&gt;&lt;/sub&gt;&lt;b&gt;(z)&lt;/b&gt; which means 'a signature starting from the neighbour &lt;b&gt;z&lt;/b&gt; of &lt;b&gt;y&lt;/b&gt; in the subgraph defined by&lt;b&gt; τ(y)&lt;/b&gt;'. This requires using a target signature (&lt;span class="Apple-style-span" style="font-weight: bold; "&gt;τ(y)&lt;/span&gt;) as if it was a subgraph - shown on the bottom left for the target &lt;b&gt;b&lt;/b&gt; and the neighbour &lt;b&gt;n&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The same process is repeated on the bottom right of the figure for &lt;b&gt;b&lt;/b&gt; and &lt;b&gt;m&lt;/b&gt; - which matches the height - 1 target signature for &lt;b&gt;c&lt;/b&gt;. This should make sense, since the atoms labelled with &lt;b&gt;b&lt;/b&gt; in the graph are attached to both &lt;b&gt;a&lt;/b&gt; and &lt;b&gt;c&lt;/b&gt; - so the signature for &lt;b&gt;b&lt;/b&gt; must be compatible with both. 
It is easy to check that &lt;b&gt;a&lt;/b&gt; and &lt;b&gt;c&lt;/b&gt; are not compatible, and cannot therefore be bonded.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6560317663628648963?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6560317663628648963/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6560317663628648963' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6560317663628648963'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6560317663628648963'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/signature-bond-compatibility.html' title='Signature bond compatibility'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SkCliryt7EI/AAAAAAAAAPM/rUV0GCEWs0Q/s72-c/bond_signature_compatible.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8510080415321051883</id><published>2009-06-18T01:24:00.002+01:00</published><updated>2009-06-18T01:28:21.946+01:00</updated><title type='text'>The Voynich blog?</title><content type='html'>I apologise to any readers. I do tend to make complex diagrams. 
Perhaps some distant historian will find them in an archive and a new &lt;a href="http://www.voynich.nu/"&gt;voynich-style mystery&lt;/a&gt; will result..&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SjmJps6DdxI/AAAAAAAAAO8/w7ofy6n22AI/s1600-h/voynich.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 300px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SjmJps6DdxI/AAAAAAAAAO8/w7ofy6n22AI/s320/voynich.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5348457382001473298" /&gt;&lt;/a&gt;This does mean something, but you would have to read the previous post to discover what, exactly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8510080415321051883?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8510080415321051883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8510080415321051883' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8510080415321051883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8510080415321051883'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/voynich-blog.html' title='The Voynich blog?'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SjmJps6DdxI/AAAAAAAAAO8/w7ofy6n22AI/s72-c/voynich.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1631394818606922173</id><published>2009-06-17T23:11:00.012+01:00</published><updated>2009-06-18T02:18:37.572+01:00</updated><title type='text'>Faulon's Signatures : A Possible Interpretation</title><content type='html'>Several recent papers by Faulon concern an idea he calls 'signatures'. This post is just a record of what I understood them to be.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Firstly, a signature is a subgraph of a molecular graph. There is a distinction between an atomic signature - which is a tree rooted at a particular atom - and a molecular signature, which is the set of atomic signatures of all the atoms in a molecule.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A tree is a graph with no cycles, so an atomic signature is not just a subgraph. Like a path, a signature has a length - or rather a &lt;i&gt;height&lt;/i&gt;. Here is a picture of signatures of heights 1-4 for a fused ring structure:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SjlvYCN8t0I/AAAAAAAAAN0/luFSS2U06cU/s1600-h/signature_heights.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 229px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SjlvYCN8t0I/AAAAAAAAAN0/luFSS2U06cU/s320/signature_heights.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5348428491182094146" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The graph &lt;b&gt;G&lt;/b&gt; on the left has one of its atoms labelled (a), and each of the trees in the center is a signature rooted at that atom. 
On the right is the simple string form of the tree, as a nested list. I should point out that the signatures in these images may not be canonical, as I worked them out by hand (I have not yet fully implemented the canonization algorithm).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Signatures of the same height may be different for the atoms in a molecule. At a height of zero, a signature is simply the atom itself. A signature of height one is an atom plus its neighbours. For &lt;b&gt;G&lt;/b&gt;, above, there are two distinct height-1 signatures. For greater heights in &lt;b&gt;G&lt;/b&gt;, there are more:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/Sjl2TKaXB4I/AAAAAAAAAOU/cCmx9yq3t9U/s1600-h/subgraphs_to_sigs.png" style="text-decoration: none;"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 233px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/Sjl2TKaXB4I/AAAAAAAAAOU/cCmx9yq3t9U/s320/subgraphs_to_sigs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5348436104063682434" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These are three subgraphs (&lt;b&gt;S&lt;/b&gt;&lt;sub&gt;&lt;b&gt;G&lt;/b&gt;&lt;/sub&gt;) of &lt;b&gt;G&lt;/b&gt;, rooted at three different atoms (a, b, c). Each one corresponds to a signature tree, and each tree corresponds to a different signature string (not shown). The trees have been given square nodes, instead of circular ones, just to make them look different. From the symmetry of &lt;b&gt;G&lt;/b&gt;, it may be clear that the other atoms also have one of these same height-2 signatures.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Finally, there are some odd properties of the trees created from the subgraphs that become noticeable in height-3 signatures of &lt;b&gt;G&lt;/b&gt;. 
As mentioned above, a tree cannot have cycles, so when the paths radiating out from the root atom meet on the same atom, it will appear in the tree twice. Further, when paths cross the same bond - &lt;i&gt;at the same time&lt;/i&gt; - both atoms in the bond will appear in both orders across two layers:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SjmVZjJgMlI/AAAAAAAAAPE/i0Tm-L6UBDI/s1600-h/even_odd_signatures.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 233px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SjmVZjJgMlI/AAAAAAAAAPE/i0Tm-L6UBDI/s320/even_odd_signatures.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5348470298643542610" /&gt;&lt;/a&gt;&lt;br /&gt;The subgraph &lt;b&gt;S&lt;/b&gt;&lt;sub&gt;&lt;b&gt;G&lt;/b&gt; &lt;/sub&gt;shows the former case, by putting two new atoms corresponding to the duplicate visit to the bridging atom in &lt;b&gt;G&lt;/b&gt;. For the subgraph &lt;b&gt;S&lt;/b&gt;&lt;sub&gt;&lt;b&gt;H&lt;/b&gt;&lt;/sub&gt; of the pentagon &lt;b&gt;H&lt;/b&gt; the whole of the last bond visited is duplicated, and the signature tree has a pair of duplicate bonds at the leaves. 
The tree construction process forbids duplication of bonds except in these two ways.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1631394818606922173?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1631394818606922173/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1631394818606922173' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1631394818606922173'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1631394818606922173'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/faulons-signatures-possible.html' title='Faulon&apos;s Signatures : A Possible Interpretation'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/SjlvYCN8t0I/AAAAAAAAAN0/luFSS2U06cU/s72-c/signature_heights.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5263826167469376560</id><published>2009-06-16T18:08:00.007+01:00</published><updated>2009-06-16T18:58:20.438+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><category scheme='http://www.blogger.com/atom/ns#' term='group theory'/><title type='text'>Automorphism groups and fragment graphs</title><content type='html'>Structure 
generation involves not just graph theory, but &lt;i&gt;group&lt;/i&gt; theory. Or, I should say, it does in some of the papers I have read. For example, in &lt;a href="http://pubs.acs.org/doi/abs/10.1021/ci020346o"&gt;this paper&lt;/a&gt; by J. L. Faulon, there is the sentence:&lt;blockquote&gt;"The two main steps are to compute the orbits of the automorphism group of G and to saturate all the atoms of a chosen orbit"&lt;br /&gt;&lt;/blockquote&gt;which may well be incomprehensible to many readers, unless the reader is a mathematician.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I am no mathematician, but thanks to some books on groups, I now understand both what an &lt;a href="http://mathworld.wolfram.com/AutomorphismGroup.html"&gt;automorphism group&lt;/a&gt; is and what an &lt;a href="http://mathworld.wolfram.com/GroupOrbit.html"&gt;orbit&lt;/a&gt; is. On the other hand, I also believe that this description of how the algorithm works is overly complex. A simpler term might just be "fragment sets", which is fairly clear, if not mathematically exact. So, for the fragment graph [CH&lt;sub&gt;3&lt;/sub&gt;, CH&lt;sub&gt;3&lt;/sub&gt;, CH&lt;sub&gt;2&lt;/sub&gt;, CH&lt;sub&gt;2&lt;/sub&gt;, CH, CH] the fragment set is [CH&lt;sub&gt;3&lt;/sub&gt;, CH&lt;sub&gt;2&lt;/sub&gt;, CH].&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, here is a short analysis of the automorphism group of the fragment graph [CH&lt;sub&gt;2&lt;/sub&gt;, CH&lt;sub&gt;2&lt;/sub&gt;]. 
This first image shows the tiny group of permutations that swaps the two fragments:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SjfWFp7zSUI/AAAAAAAAANU/TWbwBen3Df4/s1600-h/automorph_swap.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 258px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SjfWFp7zSUI/AAAAAAAAANU/TWbwBen3Df4/s320/automorph_swap.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5347978475170122050" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;The notation is taken from an excellent book called "&lt;a href="http://web.bentley.edu/empl/c/ncarter/vgt/"&gt;Visual Group Theory&lt;/a&gt;" that is also associated with some software called &lt;a href="http://groupexplorer.sourceforge.net/"&gt;group explorer&lt;/a&gt; on sourceforge. It might be quite general, I suppose (and I hope I'm using it right), but it shows the permutation that swaps the fragments as a circled s. 
This is an automorphism with respect to the edges - in other words, after the swap, there are still bonds between [1, 2], [2, 3], [4, 5], and [5, 6].&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another part of the automorphism group is a 'flip' like:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SjfXp27yU3I/AAAAAAAAANc/n-QJIbc6JEM/s1600-h/automorph_flip.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 257px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SjfXp27yU3I/AAAAAAAAANc/n-QJIbc6JEM/s320/automorph_flip.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5347980196646638450" /&gt;&lt;/a&gt;which is a little more complex, but shows how 'flipping' each fragment separately combines to form four possible permutations. If this does not seem particularly tricky, consider what happens if you take the &lt;a href="http://mathworld.wolfram.com/GroupDirectProduct.html"&gt;direct product&lt;/a&gt; of these two groups:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SjfcBAcO7XI/AAAAAAAAANk/WQwY1I2ZjZk/s1600-h/automorph_product.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 239px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SjfcBAcO7XI/AAAAAAAAANk/WQwY1I2ZjZk/s320/automorph_product.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5347984992382152050" /&gt;&lt;/a&gt;Assuming I have done it right, this should show most (all?) of the automorphisms of the fragment graph. 
It does look pretty cool, but I don't think that it gets me any closer to implementing the cursed algorithm :)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5263826167469376560?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5263826167469376560/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5263826167469376560' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5263826167469376560'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5263826167469376560'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/06/automorphism-groups-and-fragment-graphs.html' title='Automorphism groups and fragment graphs'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SjfWFp7zSUI/AAAAAAAAANU/TWbwBen3Df4/s72-c/automorph_swap.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2537887690980279879</id><published>2009-05-23T11:05:00.003+01:00</published><updated>2009-05-23T11:09:18.520+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Simplest example : 1-methylcyclobutane</title><content type='html'>This is the simplest possible hydrocarbon example (there being none 
with four carbons) of multiple isomorphic solutions:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShfLBqDKpJI/AAAAAAAAANM/1ptpur1rwsc/s1600-h/simple_counter_example.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 315px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShfLBqDKpJI/AAAAAAAAANM/1ptpur1rwsc/s320/simple_counter_example.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5338959112598889618" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;The only difference in numbering is the swap of 2 and 4. Oh well.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2537887690980279879?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2537887690980279879/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2537887690980279879' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2537887690980279879'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2537887690980279879'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/simplest-example-1-methylcyclobutane.html' title='Simplest example : 1-methylcyclobutane'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/ShfLBqDKpJI/AAAAAAAAANM/1ptpur1rwsc/s72-c/simple_counter_example.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1321461264008689791</id><published>2009-05-23T00:25:00.002+01:00</published><updated>2009-05-23T00:31:44.717+01:00</updated><title type='text'>Doesn't quite work</title><content type='html'>Sadly, using the degree sequence as a kind of 'mask' on canonically numbered matrices isn't enough. I should have guessed, it was too simple to be true :(&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The smiles for all matrices in the (2, 2, 2, 2, 2, 3, 3) partition are:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CC2CC(C1)C2&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CC2CC(C1)C2&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1C2CC12.C1CC1&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CC2CCC1C2&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CC2CCC1C2&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CC1CC2CC2&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CCC2CC2(C1)&lt;/p&gt; &lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;C1CCC2CC2(C1)&lt;/p&gt;&lt;p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco"&gt;&lt;span class="Apple-style-span"   style="font-family:Georgia;font-size:130%;"&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;&lt;span class="Apple-style-span"   style="font-family:Monaco;font-size:100%;"&gt;&lt;span class="Apple-style-span" style="font-size: 11px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;and (ignoring 
the third one, which is disconnected) there are only four distinct simple graphs in the list; the rest are duplicates.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Perhaps I would have known all this if I knew what automorphisms were, or orbits of elements in sets, or any of that...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1321461264008689791?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1321461264008689791/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1321461264008689791' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1321461264008689791'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1321461264008689791'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/doesnt-quite-work.html' title='Doesn&apos;t quite work'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3948882026686919155</id><published>2009-05-22T18:16:00.003+01:00</published><updated>2009-05-22T18:21:10.221+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Bridged Cyclohexanes - same partition, different PVR sequences</title><content type='html'>I wanted to find an example of two non-isomorphic molecules with the same degree sequence, but different PVR sequences. 
The simplest I could think of was this pair:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/ShbeWk0KjCI/AAAAAAAAANE/apoXgbKACVc/s1600-h/bridged_cyclohexanes.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 293px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/ShbeWk0KjCI/AAAAAAAAANE/apoXgbKACVc/s320/bridged_cyclohexanes.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5338698887715261474" /&gt;&lt;/a&gt;&lt;br /&gt;although there might be a similar pair of bridged 5-membered rings, I suppose.&lt;br /&gt;&lt;br /&gt;The meaning of this is quite simple - for a particular degree sequence (partition), there are multiple (different) molecules with a valid PVR sequence. I thought that this must be true, but there are none in the G4 set.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3948882026686919155?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3948882026686919155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3948882026686919155' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3948882026686919155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3948882026686919155'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/bridged-cyclohexanes-same-partition.html' title='Bridged Cyclohexanes - same partition, different PVR 
sequences'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/ShbeWk0KjCI/AAAAAAAAANE/apoXgbKACVc/s72-c/bridged_cyclohexanes.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8228526409010956123</id><published>2009-05-21T17:12:00.003+01:00</published><updated>2009-05-21T17:27:21.151+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Rows and Columns</title><content type='html'>Aha! So, a comment by &lt;a href="http://chem-bla-ics.blogspot.com/"&gt;Egon&lt;/a&gt; (on my &lt;a href="http://gilleain.blogspot.com/2009/05/pvr-numbering-scheme-not-solution-to.html"&gt;last post&lt;/a&gt;) showed the benefits of showing people what you do. He suggested summing the columns - not converting from binary to integers, as with the rows - to remove the last traces of redundancy. So this seems to work for graphs with 4 vertices:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShV-qUf1IBI/AAAAAAAAAM8/7yev_MQ5RIw/s1600-h/n4.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 114px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShV-qUf1IBI/AAAAAAAAAM8/7yev_MQ5RIw/s320/n4.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5338312198839410706" /&gt;&lt;/a&gt;&lt;br /&gt;Sorry for the extremely detailed diagram, but it is necessary to show my point. These matrix/graph pairs are all the PVR numbered adjacency matrices for n=4. 
There are isomorphic structures here, but note the column sums along the bottom of each matrix.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These column sums form another sequence - which can be used to select only one of the isomorphs. Arbitrarily, we choose sequences that are non-decreasing - i.e. no number in the sequence is less than the previous number in the sequence. This seems to work...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8228526409010956123?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8228526409010956123/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8228526409010956123' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8228526409010956123'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8228526409010956123'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/rows-and-columns.html' title='Rows and Columns'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/ShV-qUf1IBI/AAAAAAAAAM8/7yev_MQ5RIw/s72-c/n4.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8151275827694302</id><published>2009-05-21T14:02:00.004+01:00</published><updated>2009-05-21T14:28:38.130+01:00</updated><title type='text'>PVR numbering scheme not solution to all woes : film at 11</title><content type='html'>On a whim, I decided to try generating all adjacency matrices with the property that they are &lt;a href="http://gilleain.blogspot.com/2008/10/on-canonical-numberings.html"&gt;PVR numbered&lt;/a&gt;. The short summary of that link is that a matrix can be expressed as a sequence of positive integers by considering each row of the matrix as a binary number.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The point of doing this (I thought) was that you can number a molecule in such a way that the adjacency matrix is PVR-numbered, and that this is canonical. So my cunning plan was to generate all sequences of n numbers that are partially ordered, choosing them from [1, 2&lt;sup&gt;n&lt;/sup&gt;] to give all non-redundant (simple) graphs with n vertices.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Unfortunately, it seems like this can't work:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShVVj2fj53I/AAAAAAAAAM0/ruTY6oGRRiM/s1600-h/redundancy.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 239px; height: 320px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShVVj2fj53I/AAAAAAAAAM0/ruTY6oGRRiM/s320/redundancy.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5338267007729264498" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This image shows all adjacency matrices for n = 3 which are PVR-numbered. 
They are made by backtracking through all sequences of integers with a partial order, pruning the solutions using the symmetry of the matrix as a constraint.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, the point is that the first two graphs are clearly isomorphic! More simply, they both represent propane. Maybe this is well known, but it's a surprise to me...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8151275827694302?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8151275827694302/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8151275827694302' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8151275827694302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8151275827694302'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/pvr-numbering-scheme-not-solution-to.html' title='PVR numbering scheme not solution to all woes : film at 11'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/ShVVj2fj53I/AAAAAAAAAM0/ruTY6oGRRiM/s72-c/redundancy.png' height='72' 
width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5566426005301062752</id><published>2009-05-18T17:07:00.004+01:00</published><updated>2009-05-18T17:17:56.169+01:00</updated><title type='text'>Generation : Overview</title><content type='html'>To sum up the previous post flood: generation of constitutional isomers from the elemental formula can be done by generating all partitions of the total 'free' valence of the heavy atoms. The overall scheme is shown here:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/ShGI9G4QxsI/AAAAAAAAAMs/cdSdBCC0HOI/s1600-h/partition_generation_overview.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 122px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/ShGI9G4QxsI/AAAAAAAAAMs/cdSdBCC0HOI/s320/partition_generation_overview.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5337197616810739394" /&gt;&lt;/a&gt;&lt;br /&gt;(click for a bigger version, as usual). So, for each formula, multiple partitions can be made; each partition yields multiple sub-partitions, and each sub-partition corresponds to one or more molecules.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, I won't pretend that any of this is particularly novel. I am no doubt re-expressing the problem of generating all possible molecules in a slightly different way. Having tried (and failed) to implement published methods, this was the best I could come up with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I suspect that there are many improvements that could be made to the algorithm, and the implementation of it. 
Getting something that works, even in a limited way, seems like progress, however :)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5566426005301062752?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5566426005301062752/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5566426005301062752' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5566426005301062752'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5566426005301062752'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/generation-overview.html' title='Generation : Overview'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/ShGI9G4QxsI/AAAAAAAAAMs/cdSdBCC0HOI/s72-c/partition_generation_overview.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7990831075826386927</id><published>2009-05-18T13:28:00.004+01:00</published><updated>2009-05-18T13:36:17.986+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Final step : nested partition thingies to actual molecules</title><content type='html'>So, the last step:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} 
catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShFVaYOFyEI/AAAAAAAAAMc/PBH5p-A-N1I/s1600-h/nested_to_molecule.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 238px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShFVaYOFyEI/AAAAAAAAAMc/PBH5p-A-N1I/s320/nested_to_molecule.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5337140945077258306" /&gt;&lt;/a&gt;&lt;br /&gt;One thing to note about this is that the algorithm again has to backtrack to get all the molecules for any sub-partition list (another name for the things like [[3, 1], [3], [1, 1], [1]]).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, on the right hand side is the final (only) molecule made for this nested partition. It has the [4, 3, 2, 1] partition structure, naturally, and the correct constitutional formula.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now what would be nice, would be to combine the second and third steps, so that only those nested partitions that produce valid molecules were tried. 
However, as the saying goes : "First, make it work, then make it work fast".&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7990831075826386927?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7990831075826386927/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7990831075826386927' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7990831075826386927'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7990831075826386927'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/final-step-nested-partition-thingies-to.html' title='Final step : nested partition thingies to actual molecules'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/ShFVaYOFyEI/AAAAAAAAAMc/PBH5p-A-N1I/s72-c/nested_to_molecule.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-567378167895281244</id><published>2009-05-18T12:48:00.003+01:00</published><updated>2009-05-18T12:57:54.924+01:00</updated><title type='text'>Partitions to er..'nested partitions'</title><content type='html'>So I don't have a good name for the objects that I create half-way through this process : the code uses 'solution', which is confusing. 
Anyway, here is the process:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShFLf2dnVXI/AAAAAAAAAMM/9aqhmqqoEMc/s1600-h/partitions_to_nested.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 227px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/ShFLf2dnVXI/AAAAAAAAAMM/9aqhmqqoEMc/s320/partitions_to_nested.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5337130043978503538" /&gt;&lt;/a&gt;The left hand side is clear enough, I think, and follows on from the image in the previous post. Conceptually, the process is similar to 'gathering' the attachment points into half-bonds in all possible ways. So, the 4 attachment points on the bare carbon fragment can become a triple (half) bond and a single half bond. This is shown as [3, 1].&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Of course, there are other possibilities, and each combination at each atom fragment has to be paired with each other possibility at each other fragment! If this sounds like a backtracking problem, then you might understand why I used backtracking in the code.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What would be nice would be to prune the solutions - for example, [[3, 1], [2, 1], [1, 1], [1]] is generated, but is clearly impossible. Neither the triple half-bond nor the double half-bond has a partner. 
Pruning these would be easy enough; but looking more than one position ahead seems tricky.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-567378167895281244?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/567378167895281244/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=567378167895281244' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/567378167895281244'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/567378167895281244'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/partitions-to-ernested-partitions.html' title='Partitions to er..&apos;nested partitions&apos;'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/ShFLf2dnVXI/AAAAAAAAAMM/9aqhmqqoEMc/s72-c/partitions_to_nested.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2804524793493543837</id><published>2009-05-18T11:51:00.004+01:00</published><updated>2009-05-18T12:39:07.329+01:00</updated><title type='text'>Formula to Partitions</title><content type='html'>&lt;div style="text-align: left;"&gt;For the benefit of my (2) readers, here is the process of going from the chemical formula to the partitions, and what this actually 
means.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 234px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/ShE-REGrDRI/AAAAAAAAAME/N7wQNisUixI/s320/formula_to_partition.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5337115496291175698" /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;So, the list of numbers at the bottom (the partition) is the simplest possible representation of the attachment points for each atom fragment.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Oh, and the python code for generating partitions for &lt;m,&gt; &lt;a href="http://gist.github.com/113422"&gt;is here&lt;/a&gt; - it is basically a straight copy of the code from the book "&lt;a href="http://www.math.mtu.edu/~kreher/cages.html"&gt;Combinatorial Algorithms : Generation, Enumeration, and Search&lt;/a&gt;", which I highly recommend - a good balance of maths and computer science.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2804524793493543837?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2804524793493543837/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2804524793493543837' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2804524793493543837'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2804524793493543837'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/formula-to-partitions.html' title='Formula to 
Partitions'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/ShE-REGrDRI/AAAAAAAAAME/N7wQNisUixI/s72-c/formula_to_partition.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5749583647240336978</id><published>2009-05-17T21:01:00.003+01:00</published><updated>2009-05-17T21:06:21.285+01:00</updated><title type='text'>Uniqified</title><content type='html'>&lt;div style="text-align: left;"&gt;Hmm. Not my favourite solution, but I think that the isomorphism checks can be done in batches, resulting in actual isomorph spaces, like this (for C5H10):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 146px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/ShBthJcA8EI/AAAAAAAAAL8/XODWszgboJY/s320/uniqued.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5336885974670569538" /&gt;&lt;/div&gt;I should probably check that these are right...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5749583647240336978?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5749583647240336978/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5749583647240336978' title='4 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default/5749583647240336978'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5749583647240336978'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/uniqified.html' title='Uniqified'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/ShBthJcA8EI/AAAAAAAAAL8/XODWszgboJY/s72-c/uniqued.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4225419382232481108</id><published>2009-05-17T17:11:00.003+01:00</published><updated>2009-05-17T21:05:47.060+01:00</updated><title type='text'>Step one : catch them all</title><content type='html'>&lt;div&gt;Well, this is the C4H6 space again. Or &lt;10,4&gt; as I have taken to calling it, for no obvious reason.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/ShA3gkCEMYI/AAAAAAAAAL0/MB2vwPXGvHA/s1600-h/latest.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 166px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/ShA3gkCEMYI/AAAAAAAAAL0/MB2vwPXGvHA/s320/latest.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5336826591001719170" /&gt;&lt;/a&gt;There are duplicates, yes. 
But there are nine (unique) structures here, which is good.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4225419382232481108?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4225419382232481108/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4225419382232481108' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4225419382232481108'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4225419382232481108'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/05/step-one-catch-them-all.html' title='Step one : catch them all'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/ShA3gkCEMYI/AAAAAAAAAL0/MB2vwPXGvHA/s72-c/latest.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1541891184605846393</id><published>2009-04-30T14:37:00.007+01:00</published><updated>2009-04-30T16:20:43.302+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><category scheme='http://www.blogger.com/atom/ns#' term='2D molecule layout'/><category scheme='http://www.blogger.com/atom/ns#' term='bioclipse'/><title type='text'>Exploring the wild beasts of the layout jungle</title><content type='html'>There was a bug submitted to 
the CDK SourceForge tracker (bug number 2783741) with a list of molecules that are laid out badly. I had a look at some of them with the help of bioclipse. For example, this calix[4]arene:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SfmrT0KSddI/AAAAAAAAALM/T7MFXOEjnUE/s1600-h/calixarene.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 223px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SfmrT0KSddI/AAAAAAAAALM/T7MFXOEjnUE/s320/calixarene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5330479990877353426" /&gt;&lt;/a&gt;or this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SfmrxLRYIRI/AAAAAAAAALU/gLD1pGXGnfk/s1600-h/circulene.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 258px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SfmrxLRYIRI/AAAAAAAAALU/gLD1pGXGnfk/s320/circulene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5330480495297306898" /&gt;&lt;/a&gt;which is a clearer case of something going wrong. More difficult are structures that are fully 3D, like:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SfmsGeMBQzI/AAAAAAAAALc/OxM6spb4EPc/s1600-h/para2d.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 221px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SfmsGeMBQzI/AAAAAAAAALc/OxM6spb4EPc/s320/para2d.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5330480861152363314" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;Can you guess what it is? 
:) Try the 3D version (also made with bioclipse, using the CDK 3D layout):&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SfmsuN5DOYI/AAAAAAAAALk/ayREiUg19vU/s1600-h/para3d.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 214px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SfmsuN5DOYI/AAAAAAAAALk/ayREiUg19vU/s320/para3d.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5330481543972600194" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's a paracyclophane! The phenyl rings are lost in the 2D layout because there is a bigger 'ring'. Perhaps a chemist would look at the 3D structure and think that those chains are linkers, not parts of a ring, but the algorithm doesn't know this.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I think that it is difficult to have general rules for this. Of course, any fully 3D structure will be difficult to lay out in 2D (if it is not embeddable in the plane then it is impossible) so things like this:&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SfnBlWQeX8I/AAAAAAAAALs/r1lEyv-c0YM/s1600-h/example1.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 263px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SfnBlWQeX8I/AAAAAAAAALs/r1lEyv-c0YM/s320/example1.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5330504481343692738" /&gt;&lt;/a&gt;&lt;br /&gt;are truly awful.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1541891184605846393?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://gilleain.blogspot.com/feeds/1541891184605846393/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1541891184605846393' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1541891184605846393'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1541891184605846393'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/exploring-wild-beasts-of-layout-jungle.html' title='Exploring the wild beasts of the layout jungle'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SfmrT0KSddI/AAAAAAAAALM/T7MFXOEjnUE/s72-c/calixarene.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3942719872136208325</id><published>2009-04-27T23:51:00.004+01:00</published><updated>2009-04-28T00:03:50.183+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='bioclipse'/><title type='text'>Bioclipse : Safe When Used As Directed</title><content type='html'>&lt;div style="text-align: left;"&gt;Finally used bioclipse for a real purpose, and to good effect, too:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 199px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SfY3VnhMDRI/AAAAAAAAALE/iytJj169qrc/s320/usage.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5329508053564525842" 
/&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;What this shows (the images do get larger if you click on them! :) is the following basic workflow:&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;1) Exploring manager functions in the js console (bottom).&lt;/div&gt;&lt;div style="text-align: left;"&gt;2) Writing a script in the js editor (top left).&lt;/div&gt;&lt;div style="text-align: left;"&gt;3) Running and getting feedback in the rhino console (far right).&lt;/div&gt;&lt;div style="text-align: left;"&gt;4) Viewing the results in the sdf viewer (top right).&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;What I was doing was searching through an sdf file (C10H16_filtered.sdf) for all structures with a cyclohexane ring as a substructure, then writing those out to a file.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;It probably could have been done five other ways, but, well, it was more fun this way.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Oh, and the script is &lt;a href="http://gist.github.com/102810"&gt;a gist here&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3942719872136208325?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3942719872136208325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3942719872136208325' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3942719872136208325'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3942719872136208325'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/bioclipse-safe-when-used-as-directed.html' title='Bioclipse : Safe When Used As Directed'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SfY3VnhMDRI/AAAAAAAAALE/iytJj169qrc/s72-c/usage.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6554030950177521597</id><published>2009-04-23T12:08:00.003+01:00</published><updated>2009-04-23T12:11:24.194+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ChEBI'/><category scheme='http://www.blogger.com/atom/ns#' term='bioclipse'/><title type='text'>ChEBI in Bioclipse</title><content type='html'>&lt;div style="text-align: left;"&gt;Says it all, really.&lt;/div&gt;&lt;div&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 205px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SfBMljzNmOI/AAAAAAAAAK8/o0agwWr7j9c/s320/chebi_in_bioclipse.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5327842567328209122" /&gt;&lt;/div&gt;&lt;div&gt;The scrolling can hang if you move too fast, which might be due to garbage collection? 
It is a very large file....&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6554030950177521597?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6554030950177521597/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6554030950177521597' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6554030950177521597'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6554030950177521597'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/chebi-in-bioclipse.html' title='ChEBI in Bioclipse'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SfBMljzNmOI/AAAAAAAAAK8/o0agwWr7j9c/s72-c/chebi_in_bioclipse.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-511026530159678124</id><published>2009-04-17T01:07:00.002+01:00</published><updated>2009-04-17T01:08:32.407+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Portable whiteboard : deployed</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SefIcHbF1XI/AAAAAAAAAK0/kO2PP3uQT8k/s1600-h/Photo+9.jpg"&gt;&lt;img 
style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SefIcHbF1XI/AAAAAAAAAK0/kO2PP3uQT8k/s320/Photo+9.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5325445469743469938" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-511026530159678124?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/511026530159678124/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=511026530159678124' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/511026530159678124'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/511026530159678124'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/portable-whiteboard-deployed.html' title='Portable whiteboard : deployed'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SefIcHbF1XI/AAAAAAAAAK0/kO2PP3uQT8k/s72-c/Photo+9.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7087590034266122958</id><published>2009-04-15T17:25:00.003+01:00</published><updated>2009-04-15T18:41:19.261+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdkws2009'/><title type='text'>Proposed 
CDK changes related to PDBReader and BioPolymer</title><content type='html'>This is expanding on one of the points that Rajarshi made &lt;a href="http://blog.rguha.net/?p=249"&gt;in his blog&lt;/a&gt; (which he followed up &lt;a href="http://blog.rguha.net/?p=257"&gt;here&lt;/a&gt;) on the PDB file handling capabilities of the CDK. There are two related topics: the reading of PDB format files (the ancient, fixed column-width ATOM files) and the model that these are read into.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;PDBReader&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The old PDB format is being replaced with mmCIF and/or PDBML formats. But lots of programs still write out this format, so it makes sense to keep supporting it for a while at least.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, it is quite a nasty format, in some ways. Not so much the fixed column width, but the fact that crystallographers abuse the file format in all sorts of ways. Even simple expectations, such as atom numbers always increasing, may not hold.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So it is not easy to make a good reader for PDB files. The current CDK one won't read a file with just ATOM records, for example. Think that's reasonable? Well, tough luck for people who wrote programs that produce simple files like this.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A more serious problem is the fact that you can't get properly connected ligands from a PDB file. Or easily get at the disordered regions. Or get at the waters. Well, sort of - I suppose that many of these things can be done after reading, with CDK classes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;BioPolymer&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In some ways - so long as the atoms are read in - anything can be done to the model post-reading. 
However, the point of having data model classes like Polymer and BioPolymer is to capture some of the complexity of the macromolecule's organisation.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;BioPolymer and the PDBReader do a good job of reading and storing the information in the header files (except that it is not always right!). Apart from calling chains 'Strands', some things are done reasonably well. I don't think that ligand atoms should be stored 'loose' as they are, but probably in referenced atom containers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The real difficulty with modelling proteins lies with representing the hierarchy. One way is 'PMCA' - protein, model, chain, atom. This misses out secondary structure, but it may be too literal to have objects for every concept; the CDK stores the secondary structure given in the header file as IPDBStructure objects - with insertion codes, which is good to see.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, there are more secondary structures than helix, turn, and strand. I'm never quite sure what the best way to model the more flexible situation is, though.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Integration with biojava&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It seems a shame that several open source Java projects have very little in the way of integration (that I can see). At least on this topic. Of course, Egon (among others) has done work on both the CDK and Jmol, but it concerns me that incompatible ways of doing things lead to projects that should work together drifting apart.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, &lt;a href="http://gist.github.com/95758"&gt;here is a class I wrote today&lt;/a&gt; (for someone else) that uses both biojava and the CDK. 
It is very much a hack, but the key point is the method &lt;span class="Apple-style-span" style="font-family: 'Bitstream Vera Sans Mono'; color: rgb(153, 0, 0); font-size: 12px; font-weight: bold; line-height: 17px; white-space: pre; "&gt;makeLigandsFromGroups &lt;span class="Apple-style-span" style="color: rgb(0, 0, 0); font-family: Georgia; font-size: 16px; font-weight: normal; line-height: normal; white-space: normal; "&gt; that takes both a List of biojava.Group objects and a List of IMolecules, along the way converting biojava Atoms to CDK Atoms.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Bitstream Vera Sans Mono'; color: rgb(153, 0, 0); font-size: 12px; font-weight: bold; line-height: 17px; white-space: pre; "&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 0); font-family: Georgia; font-size: 16px; font-weight: normal; line-height: normal; white-space: normal; "&gt;Clearly, biojava has a better interface to its model as you can get a List of the hetatm groups. 
The CDK, on the other hand, has better support for determining atom types and setting properties on them.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7087590034266122958?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7087590034266122958/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7087590034266122958' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7087590034266122958'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7087590034266122958'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/proposed-cdk-changes-related-to.html' title='Proposed CDK changes related to PDBReader and BioPolymer'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2424992790966796570</id><published>2009-04-09T16:47:00.001+01:00</published><updated>2009-04-09T16:49:50.240+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Generated Wallpaper, anyone?</title><content type='html'>Heh.&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://4.bp.blogspot.com/_eL4nhrOF5R4/Sd4YxnR1SbI/AAAAAAAAAKs/OctZK8Ug2Tw/s1600-h/wallpaper.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 225px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/Sd4YxnR1SbI/AAAAAAAAAKs/OctZK8Ug2Tw/s320/wallpaper.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5322719050234939826" /&gt;&lt;/a&gt;This is exhaustive generation of all C4Hn (where n=any number of hydrogens), badly smashed together onto one canvas.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2424992790966796570?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2424992790966796570/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2424992790966796570' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2424992790966796570'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2424992790966796570'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/generated-wallpaper-anyone.html' title='Generated Wallpaper, anyone?'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/Sd4YxnR1SbI/AAAAAAAAAKs/OctZK8Ug2Tw/s72-c/wallpaper.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5298297785359871638</id><published>2009-04-08T17:43:00.004+01:00</published><updated>2009-04-08T20:08:48.353+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Structure zoo</title><content type='html'>So, I'm getting better at generating structures:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SdzWAU2hCQI/AAAAAAAAAKU/jtezeZikWwo/s1600-h/zoo.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 149px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SdzWAU2hCQI/AAAAAAAAAKU/jtezeZikWwo/s320/zoo.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5322364160730794242" /&gt;&lt;/a&gt;but not quite there yet. Oh, and the [1.1.1] propellane is drawn oddly (with one of its carbons at 0,0) due to a Mac-specific bug in the layout. Irritating, but difficult to fix.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;edit: Oh, and for interest's sake, here is an auto-generated tree of graphs:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SdzenPonhPI/AAAAAAAAAKc/LSLxvlY8GhI/s1600-h/graph_tree.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 222px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SdzenPonhPI/AAAAAAAAAKc/LSLxvlY8GhI/s320/graph_tree.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5322373625438242034" /&gt;&lt;/a&gt;Even more complex looking is this! 
:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/Sdz14LtnACI/AAAAAAAAAKk/MbRwNXR641Q/s1600-h/graph_molecules.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 164px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/Sdz14LtnACI/AAAAAAAAAKk/MbRwNXR641Q/s320/graph_molecules.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5322399205210652706" /&gt;&lt;/a&gt;which is a set of structures produced by descending to depth 6 for 5-carbon graphs.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5298297785359871638?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5298297785359871638/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5298297785359871638' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5298297785359871638'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5298297785359871638'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/structure-zoo.html' title='Structure zoo'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SdzWAU2hCQI/AAAAAAAAAKU/jtezeZikWwo/s72-c/zoo.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-6548653717830027081</id><published>2009-04-03T14:42:00.003+01:00</published><updated>2009-04-03T15:18:01.496+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Numbering atoms, numbering vertices</title><content type='html'>&lt;div&gt;Further to the similarities between numbering atoms in a structure, and generating unique graphs here is this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SdYS3W7MD6I/AAAAAAAAAKM/PTVWyFLlQno/s1600-h/numbering_linear_graphs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 206px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SdYS3W7MD6I/AAAAAAAAAKM/PTVWyFLlQno/s320/numbering_linear_graphs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5320460752040759202" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;which shows the same molecule with two different numberings on the left, and the resulting graphs on the right. The double bond is not shown on the graphs; but it would probably have to be a labelling of the edge, rather than an actual multiple edge, to still be a simple graph.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, this quickly shows how - if you start with the vertices of the graph and connect 'all possible ways' - you get molecules that are isomorphic, but numbered differently. 
Therefore (perhaps) the numbering of the vertices and edges is one of the keys to not creating all the isomorphs and then having to expensively check them all.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-6548653717830027081?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/6548653717830027081/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=6548653717830027081' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6548653717830027081'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/6548653717830027081'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/numbering-atoms-numbering-vertices.html' title='Numbering atoms, numbering vertices'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SdYS3W7MD6I/AAAAAAAAAKM/PTVWyFLlQno/s72-c/numbering_linear_graphs.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-969885736917119331</id><published>2009-04-03T14:05:00.003+01:00</published><updated>2009-04-03T14:39:17.863+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Step 3 : Extend all possible ways</title><content type='html'>The title of this post 
refers to the tendency of algorithms in papers to have detailed explanations of every step except the most crucial one. Right now I am rediscovering this peculiar pleasure in structure generation.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an example - or more as visual decoration - have this image:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SdYLK98VmyI/AAAAAAAAAKE/ibnG-an8D9U/s1600-h/tetra_graphs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 242px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SdYLK98VmyI/AAAAAAAAAKE/ibnG-an8D9U/s320/tetra_graphs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5320452292839054114" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;which looks nice, but needs some explanation. The diagrams in boxes that look like parachutes (as one of my colleagues put it :) are simple representations of atoms connected by bonds. Each point is an atom, and a curved line connecting two points is a bond.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These are grouped together by the structure they correspond to, which is shown on the right of each set of diagrams. What this shows, then, is the redundancy you get from a simple generator. If you connect all atoms like this:&lt;/div&gt;&lt;pre&gt;  for atomA in atoms:&lt;br /&gt;    for atomB in atoms greater than atomA:&lt;br /&gt;      connect(atomA, atomB)&lt;/pre&gt;&lt;div&gt;you quickly get a very large number of isomorphic structures. And then your process runs out of memory, in my experience.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Oh, and the numbers below each graph are the sum of the indegree/outdegree (that is, the degree) of each vertex - or the number of bonds the atom is in, in other words. 
This seems similar to the ideas behind Morgan numbers, which I now understand a bit better.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-969885736917119331?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/969885736917119331/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=969885736917119331' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/969885736917119331'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/969885736917119331'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/04/step-3-extend-all-possible-ways.html' title='Step 3 : Extend all possible ways'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SdYLK98VmyI/AAAAAAAAAKE/ibnG-an8D9U/s72-c/tetra_graphs.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8169466174100443742</id><published>2009-03-12T14:50:00.003Z</published><updated>2009-03-12T15:08:03.217Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='structure generation'/><title type='text'>Symmetric Generations</title><content type='html'>I've been trying to work on an implementation of a structure generator algorithm due to Faulon (it's a paper in &lt;a 
href="http://www.cs.sandia.gov/~jfaulon/publication.html"&gt;this list&lt;/a&gt; somewhere). One problem I have difficulty with is reduction of the number of isomers generated. For example:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SbkiYvb5DLI/AAAAAAAAAJ8/bvXAQFHevnI/s1600-h/cyclo_butene.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 233px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SbkiYvb5DLI/AAAAAAAAAJ8/bvXAQFHevnI/s320/cyclo_butene.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5312315043905866930" /&gt;&lt;/a&gt;It may be hard to read, but the idea is that the full tree of possible ways to attach {Br, Cl, F, I} to c1ccc1 is 4 * 3 * 2 * 1 (you can attach Bromine to all four of the carbons, leaving 3 places to attach chlorine, and so on).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Of course c1(Br)c(Cl)cc1 is the same as c1(Cl)c(Br)cc1 [never mind that C1(Br)=C(Cl)C=C1 is not the same as C1=C(Br)C(Cl)=C1]. 
Or, in other words, there is a high degree of symmetry in the tree.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There is a way to solve this, perhaps even described in the paper - if I could just understand them...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8169466174100443742?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8169466174100443742/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8169466174100443742' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8169466174100443742'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8169466174100443742'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/03/symmetric-generations.html' title='Symmetric Generations'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SbkiYvb5DLI/AAAAAAAAAJ8/bvXAQFHevnI/s72-c/cyclo_butene.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5534296533803273521</id><published>2009-02-25T18:43:00.005Z</published><updated>2009-02-25T19:36:54.799Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='isomerspaces'/><category scheme='http://www.blogger.com/atom/ns#' term='JChemPaint'/><title 
type='text'>Multi-view</title><content type='html'>A quick example of a mini-application made using the new JChempaint renderer:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SaWRioqkB3I/AAAAAAAAAJk/pWb9Wl-_MIE/s1600-h/multiview.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 225px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SaWRioqkB3I/AAAAAAAAAJk/pWb9Wl-_MIE/s320/multiview.png" alt="" id="BLOGGER_PHOTO_ID_5306807760143517554" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;the picture is of an isomer space (C&lt;sub&gt;3&lt;/sub&gt;H&lt;sub&gt;7&lt;/sub&gt;NO), with compact mode on and atoms rendered as circles. The code is here, for now:&lt;br /&gt;&lt;br /&gt;http://gist.github.com/70342&lt;br /&gt;&lt;br /&gt;which is a kind of dumb way to achieve this, as it creates an instance of a renderer for each molecule, instead of adding them to a molecule set, and then laying that out...&lt;br /&gt;&lt;br /&gt;edit: just realised; it's rendering the hydrogens as compact black circles :(&lt;br /&gt;&lt;br /&gt;edit2: Ahhh. that's better:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SaWV7Rg7JpI/AAAAAAAAAJs/3TAnR_iPxIk/s1600-h/multiview.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 222px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SaWV7Rg7JpI/AAAAAAAAAJs/3TAnR_iPxIk/s320/multiview.png" alt="" id="BLOGGER_PHOTO_ID_5306812581472315026" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;edit3: Ha! 
This was a mistake, but it looks kind of cool:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SaWdo6FpwpI/AAAAAAAAAJ0/xVm9tTfV8M0/s1600-h/superimpose.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 202px; height: 320px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SaWdo6FpwpI/AAAAAAAAAJ0/xVm9tTfV8M0/s320/superimpose.png" alt="" id="BLOGGER_PHOTO_ID_5306821062039290514" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5534296533803273521?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5534296533803273521/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5534296533803273521' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5534296533803273521'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5534296533803273521'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/02/multi-view.html' title='Multi-view'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SaWRioqkB3I/AAAAAAAAAJk/pWb9Wl-_MIE/s72-c/multiview.png' height='72' 
width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-314272781067821851</id><published>2009-02-14T19:06:00.007Z</published><updated>2009-02-14T20:32:52.094Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JChemPaint'/><title type='text'>lone pair rendering</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SZcpCEd2A2I/AAAAAAAAAJM/C-5kbIvThlY/s1600-h/lp_plus_radical.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 208px; height: 320px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SZcpCEd2A2I/AAAAAAAAAJM/C-5kbIvThlY/s320/lp_plus_radical.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5302752201787442018" /&gt;&lt;/a&gt;JCP now has minimal lone pair display. I would prefer the layout to be at the corners of a square, rather than on the edges. Videlicet, they are currently only at N, W, S, E; I think that NW, SW, SE, and NE would be better.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Strangely, lone pairs don't seem to appear in CML files when written out, but Egon says he will look at this. Perhaps I should file a bug report...&lt;/div&gt;&lt;div&gt;Oh, and radicals are implemented too, &lt;s&gt;but I don't have a picture of that &lt;/s&gt;(see right). 
They are in different generators, but I guess a single 'DotGenerator' could do both :)&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SZcpjV0004I/AAAAAAAAAJc/vWxdJGLW39M/s1600-h/lp.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 233px; height: 320px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SZcpjV0004I/AAAAAAAAAJc/vWxdJGLW39M/s320/lp.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5302752773382919042" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-314272781067821851?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/314272781067821851/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=314272781067821851' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/314272781067821851'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/314272781067821851'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/02/lone-pair-rendering.html' title='lone pair rendering'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/SZcpCEd2A2I/AAAAAAAAAJM/C-5kbIvThlY/s72-c/lp_plus_radical.png' height='72' 
width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8866128039573941920</id><published>2009-01-29T20:40:00.005Z</published><updated>2009-01-30T14:15:55.885Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='NMR spectrum comparison'/><title type='text'>Spectral Full House</title><content type='html'>So, all of the isomers of C4H11O are in &lt;a href="http://nmrshiftdb.ice.mpg.de/"&gt;NMRShiftDB&lt;/a&gt; and here are all the experimental and predicted carbon spectra:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SYIXymc-ZoI/AAAAAAAAAIs/BWKjmGqFz-w/s1600-h/C4H11O_spectra.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 268px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SYIXymc-ZoI/AAAAAAAAAIs/BWKjmGqFz-w/s320/C4H11O_spectra.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5296822269824099970" /&gt;&lt;/a&gt;It's not obvious from this picture, but not all of the predicted spectra are unique matches for their experimental partners. In other words, you could not pick out the right molecule by comparing the predicted and experimental spectra.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The situation is more difficult still for larger isomer spaces, where the predicted spectra may be exactly the same for sub-sets of the isomers. 
There are still many with unique predictions, but the rest follow a sort of power-law distribution of spectrally equivalent sets.&lt;/div&gt;&lt;br /&gt;EDIT: As per a suggestion by Egon, here is a table of top hits (a yellow square indicates the top match):&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SYMLV902rzI/AAAAAAAAAI0/oulZFqoSKgs/s1600-h/match_grid.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 239px; height: 320px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SYMLV902rzI/AAAAAAAAAI0/oulZFqoSKgs/s320/match_grid.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5297090058719244082" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8866128039573941920?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8866128039573941920/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8866128039573941920' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8866128039573941920'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8866128039573941920'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/01/spectral-full-house.html' title='Spectral Full House'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SYIXymc-ZoI/AAAAAAAAAIs/BWKjmGqFz-w/s72-c/C4H11O_spectra.png' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1283872278301527336</id><published>2009-01-16T18:38:00.004Z</published><updated>2009-01-16T18:41:00.502Z</updated><title type='text'>C4H11N network, labelled with pictures</title><content type='html'>A really tiny isomer set, with a particularly regular network:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SXDUUPq-GwI/AAAAAAAAAIc/fKFB1U9Wo6g/s1600-h/c4h11n_network.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 213px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SXDUUPq-GwI/AAAAAAAAAIc/fKFB1U9Wo6g/s320/c4h11n_network.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5291963006429567746" /&gt;&lt;/a&gt;&lt;br /&gt;Nicely enough, all 8 of these structures are in NMRShiftDB, so it will be possible to compare all against all.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1283872278301527336?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1283872278301527336/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1283872278301527336' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1283872278301527336'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1283872278301527336'/><link rel='alternate' type='text/html' 
href='http://gilleain.blogspot.com/2009/01/c4h11n-network-labelled-with-pictures.html' title='C4H11N network, labelled with pictures'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SXDUUPq-GwI/AAAAAAAAAIc/fKFB1U9Wo6g/s72-c/c4h11n_network.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-908708449316258515</id><published>2009-01-14T15:12:00.004Z</published><updated>2009-01-14T15:14:46.536Z</updated><title type='text'>One with smiles strings on</title><content type='html'>Slightly more useful/comprehensible.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SW4A9TA_XrI/AAAAAAAAAIU/qtOfkXuF1sI/s1600-h/c3h7no.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 202px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SW4A9TA_XrI/AAAAAAAAAIU/qtOfkXuF1sI/s320/c3h7no.png" alt="" id="BLOGGER_PHOTO_ID_5291167665283358386" border="0" /&gt;&lt;/a&gt;Also, this is a more interesting isomer space (C3H7ON) which is more fragmented at a 50% similarity cutoff.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-908708449316258515?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/908708449316258515/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=908708449316258515' 
title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/908708449316258515'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/908708449316258515'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/01/one-with-smiles-strings-on.html' title='One with smiles strings on'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SW4A9TA_XrI/AAAAAAAAAIU/qtOfkXuF1sI/s72-c/c3h7no.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3873055273218380265</id><published>2009-01-12T20:12:00.003Z</published><updated>2009-01-12T20:14:49.788Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JUNG'/><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><title type='text'>just one more</title><content type='html'>C8H16 this time. The central bridge has split into two parts, it seems. The clustering is somewhat artificial, I suppose...&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SWukQeJ0SVI/AAAAAAAAAIM/vPhbg4QyYaQ/s1600-h/c8h16.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 185px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SWukQeJ0SVI/AAAAAAAAAIM/vPhbg4QyYaQ/s320/c8h16.png" alt="" id="BLOGGER_PHOTO_ID_5290502790155880786" border="0" /&gt;&lt;/a&gt;An example from one of these is C=C(C(C)C)C(C)C, and from the other is C=CC(CC)CCC. 
Hmmm.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3873055273218380265?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3873055273218380265/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3873055273218380265' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3873055273218380265'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3873055273218380265'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/01/just-one-more.html' title='just one more'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SWukQeJ0SVI/AAAAAAAAAIM/vPhbg4QyYaQ/s72-c/c8h16.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4123376209558426401</id><published>2009-01-12T19:30:00.003Z</published><updated>2009-01-12T19:35:41.157Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JUNG'/><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><title type='text'>another network</title><content type='html'>Another one (C7H14 this time):&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SWuapHCUpnI/AAAAAAAAAIE/fv35Z2s43jk/s1600-h/c7h14.png"&gt;&lt;img 
style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 178px;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SWuapHCUpnI/AAAAAAAAAIE/fv35Z2s43jk/s320/c7h14.png" alt="" id="BLOGGER_PHOTO_ID_5290492218330883698" border="0" /&gt;&lt;/a&gt;It's a bit cumbersome, but I managed to get SMILES strings to show as tooltip text. This tells me that the three vertices in the center of this picture (the bridging ones) are C=C(C)CCCC, C=C(CC)C(C)C, and C=C(CC)CCC. Not sure why.&lt;br /&gt;&lt;br /&gt;Oh, and the ones in the 'lobe' on the left seem to be (all?) non-cyclic, while the ones on the right seem to be cyclic, which makes sense.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4123376209558426401?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4123376209558426401/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4123376209558426401' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4123376209558426401'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4123376209558426401'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/01/another-network.html' title='another network'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SWuapHCUpnI/AAAAAAAAAIE/fv35Z2s43jk/s72-c/c7h14.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8555714733550013762</id><published>2009-01-12T17:05:00.004Z</published><updated>2009-01-12T17:14:10.840Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JUNG'/><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><category scheme='http://www.blogger.com/atom/ns#' term='molgen'/><title type='text'>isomer networks</title><content type='html'>Hmmm. Been a month. Oh well. Here is a picture of an 'isomer network':&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SWt4sIM_YCI/AAAAAAAAAH8/wdmcrFGVqAM/s1600-h/c6h12.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 182px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SWt4sIM_YCI/AAAAAAAAAH8/wdmcrFGVqAM/s320/c6h12.png" alt="" id="BLOGGER_PHOTO_ID_5290454886788325410" border="0" /&gt;&lt;/a&gt;for the isomers of C6H12. I generated them using &lt;a href="http://www.molgen.de/"&gt;molgen&lt;/a&gt;, compared the structures all-v-all with the &lt;a href="http://sourceforge.net/projects/cdk/"&gt;CDK&lt;/a&gt;, and visualized the connectivity graph with the help of &lt;a href="http://jung.sourceforge.net/"&gt;JUNG&lt;/a&gt;. So hardly any work actually by me...&lt;br /&gt;&lt;br /&gt;Each vertex in the graph is a structure, and each edge joins two structures whose fingerprints have a Tanimoto similarity greater than 0.5. The cutoff is fairly arbitrary, but I just wanted to see if it worked. 
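&lt;br /&gt;&lt;br /&gt;A minimal sketch of that edge rule, assuming the fingerprints are available as plain java.util.BitSet objects. The class and method names are hypothetical, and this is not the code that produced the picture:&lt;br /&gt;

```java
import java.util.BitSet;

// Hypothetical helper, for illustration only.
class IsomerEdges {

    // Tanimoto coefficient: bits set in both fingerprints / bits set in either.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        if (either.isEmpty()) {
            return 0.0; // assumption: two empty fingerprints count as dissimilar
        }
        return (double) both.cardinality() / either.cardinality();
    }

    // Two structures are joined by an edge when their similarity exceeds the cutoff.
    static boolean hasEdge(BitSet a, BitSet b, double cutoff) {
        return tanimoto(a, b) > cutoff;
    }
}
```

With a cutoff of 0.5, a pair that shares exactly half of its set bits is not connected, since the comparison is strict.&lt;br /&gt;&lt;br /&gt;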
The next step is to use predicted spectrum similarity instead of molecular-fingerprint similarity.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8555714733550013762?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8555714733550013762/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8555714733550013762' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8555714733550013762'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8555714733550013762'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2009/01/isomer-networks.html' title='isomer networks'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SWt4sIM_YCI/AAAAAAAAAH8/wdmcrFGVqAM/s72-c/c6h12.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1954192222492491572</id><published>2008-12-09T17:47:00.003Z</published><updated>2008-12-09T18:25:50.465Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='eclipse'/><title type='text'>Why you can't make custom progress monitors in eclipse</title><content type='html'>I wanted to make a custom progress monitor with a temperature graph in Eclipse, but it does not seem to be possible. 
Here's why:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/ST6vxz7pCTI/AAAAAAAAAHU/0Fmv5zexEn0/s1600-h/why_you_cant_have_custom_monitors.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 209px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/ST6vxz7pCTI/AAAAAAAAAHU/0Fmv5zexEn0/s320/why_you_cant_have_custom_monitors.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5277849083613612338" /&gt;&lt;/a&gt;There is a ProgressProvider class that has an abstract &lt;span class="Apple-style-span" style="font-style: italic;"&gt;createMonitor(Job job)&lt;/span&gt; method that is called in the JobManager when creating jobs. The standard concrete implementing class is the ProgressManager, shown above in a dashed box labelled "Discouraged Access". So you can't just override this class.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The JobMonitor is an inner class of ProgressManager, and it is not possible to provide an alternative. 
Once it has been created and passed to the Job, it has already been displayed as a Dialog, so it can't be wrapped with another Dialog.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Seems impossible, really.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1954192222492491572?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1954192222492491572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1954192222492491572' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1954192222492491572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1954192222492491572'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/12/why-you-cant-make-custom-progress.html' title='Why you can&apos;t make custom progress monitors in eclipse'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/ST6vxz7pCTI/AAAAAAAAAHU/0Fmv5zexEn0/s72-c/why_you_cant_have_custom_monitors.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-2393001674825526621</id><published>2008-12-07T17:56:00.003Z</published><updated>2008-12-07T17:59:30.694Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>Intermediate vs direct 
rendering</title><content type='html'>Just a small, informal diagram of how the new rendering system works, to balance out all the speculative UML sketches I made...&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/STwOwXwdhhI/AAAAAAAAAHM/V4A4m_zvNw8/s1600-h/new_vs_original.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 176px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/STwOwXwdhhI/AAAAAAAAAHM/V4A4m_zvNw8/s320/new_vs_original.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5277109087545165330" /&gt;&lt;/a&gt;I have not shown the elements inside the element groups, but they are things like line elements, oval elements, and so on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-2393001674825526621?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/2393001674825526621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=2393001674825526621' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2393001674825526621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/2393001674825526621'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/12/intermediate-vs-direct-rendering.html' title='Intermediate vs direct rendering'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/STwOwXwdhhI/AAAAAAAAAHM/V4A4m_zvNw8/s72-c/new_vs_original.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3493127994733911542</id><published>2008-12-05T14:24:00.004Z</published><updated>2008-12-05T14:51:49.955Z</updated><title type='text'>Creating new molecule editing programs with cdk</title><content type='html'>After all the work that has gone into making a flexible core for JChemPaint (the controller and renderer modules) I finally have the opportunity to use it for what I wanted: a small custom program for editing molecules from scratch and displaying the spectrum as you go.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/STk-8FGNWvI/AAAAAAAAAG8/iDMnr5U78Bc/s1600-h/spectral_1.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 160px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/STk-8FGNWvI/AAAAAAAAAG8/iDMnr5U78Bc/s320/spectral_1.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5276317640322538226" /&gt;&lt;/a&gt;This may not look like much, but compare it to this next screenshot:&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/STk_iScMxiI/AAAAAAAAAHE/RFU_9TgFx7k/s1600-h/spectral_2.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 160px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/STk_iScMxiI/AAAAAAAAAHE/RFU_9TgFx7k/s320/spectral_2.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5276318296739464738" /&gt;&lt;/a&gt;&lt;br /&gt;Obviously this is a fairly unlikely molecule, but (if you believe me :), the new ring has caused the program to 
re-predict a spectrum (using NMRShiftDB). The point of this was to teach me more about the connection between molecules and spectra. &lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3493127994733911542?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3493127994733911542/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3493127994733911542' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3493127994733911542'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3493127994733911542'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/12/creating-new-molecule-editing-programs.html' title='Creating new molecule editing programs with cdk'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/STk-8FGNWvI/AAAAAAAAAG8/iDMnr5U78Bc/s72-c/spectral_1.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1001405204808367591</id><published>2008-11-25T17:51:00.003Z</published><updated>2008-11-25T18:05:08.756Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><title type='text'>CDK-City</title><content type='html'>On the mean streets of CDK-city, one massive tower block dominates the skyline in the renderer district - the 
Renderer2DModel tower...&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SSw7k4UGfjI/AAAAAAAAAG0/UmVELAel2xY/s1600-h/cdk_city.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 284px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SSw7k4UGfjI/AAAAAAAAAG0/UmVELAel2xY/s320/cdk_city.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5272654768522034738" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;Masak pointed me to &lt;a href="http://www.inf.unisi.ch/phd/wettel/codecity.html"&gt;this&lt;/a&gt; site by Richard Wettel, which is about a static-analysis tool for making so-called 'code cities'. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Basically, the parking lots (like SMARTSParserConstants) have lots of attributes (in fact, this class &lt;span class="Apple-style-span" style="font-style: italic;"&gt;only&lt;/span&gt; has attributes) while skyscrapers (like in the debug package) have many more methods than attributes (most of the debug classes have no attributes except inherited ones).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's an eclipse plugin from &lt;a href="http://moose.unibe.ch/tools/moosebrewer?_s=seHwwKGGuLpieQnG&amp;amp;_k=WXPOQcPJ&amp;amp;_n&amp;amp;47"&gt;here&lt;/a&gt; called MooseBrewer, and I think there are other tools there I haven't explored. 
It's great to have a map to this busy code metropolis, as I wondered what my workplace looked like :)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1001405204808367591?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1001405204808367591/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1001405204808367591' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1001405204808367591'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1001405204808367591'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/11/cdk-city.html' title='CDK-City'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/SSw7k4UGfjI/AAAAAAAAAG0/UmVELAel2xY/s72-c/cdk_city.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7474638364861417138</id><published>2008-11-23T15:33:00.004Z</published><updated>2008-11-23T15:54:03.398Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>chalky</title><content type='html'>I wanted to show something that hints at the things that the new architecture can afford us:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SSl4KBV5oSI/AAAAAAAAAGk/WX5CWUJwpHo/s1600-h/chalky.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 303px; height: 320px;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SSl4KBV5oSI/AAAAAAAAAGk/WX5CWUJwpHo/s320/chalky.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5271876952368193826" /&gt;&lt;/a&gt;This is using a Java2D graphics Paint object to make it look like chalk... kind of. It's a very simplistic way of doing it by making a small image with a random number of white, gray, lightgray, and black pixels.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;edit: it doesn't look so good at small scales&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SSl8YsDV8kI/AAAAAAAAAGs/qLIKqj5bxKA/s1600-h/chalky_small.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 193px; height: 185px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SSl8YsDV8kI/AAAAAAAAAGs/qLIKqj5bxKA/s320/chalky_small.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5271881602397762114" /&gt;&lt;/a&gt;&lt;/div&gt;Some tweaking of stroke widths and so on is essential.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7474638364861417138?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7474638364861417138/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7474638364861417138' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7474638364861417138'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default/7474638364861417138'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/11/chalky.html' title='chalky'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_eL4nhrOF5R4/SSl4KBV5oSI/AAAAAAAAAGk/WX5CWUJwpHo/s72-c/chalky.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7783082152286941716</id><published>2008-11-20T23:27:00.005Z</published><updated>2008-11-20T23:36:11.393Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>Seeing Double?</title><content type='html'>Comes from too much screen-time:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SSXzJkxWJaI/AAAAAAAAAGc/9cXD9ddcYTc/s1600-h/seeing_double.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 189px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SSXzJkxWJaI/AAAAAAAAAGc/9cXD9ddcYTc/s320/seeing_double.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5270886284722054562" /&gt;&lt;/a&gt;although refactoring often seems like running on the spot (you get nowhere fast), things happen behind the scenes..&lt;br /&gt;&lt;a href="http://gist.github.com/27276"&gt;A short code snippet.&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7783082152286941716?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://gilleain.blogspot.com/feeds/7783082152286941716/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7783082152286941716' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7783082152286941716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7783082152286941716'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/11/seeing-double.html' title='Seeing Double?'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/SSXzJkxWJaI/AAAAAAAAAGc/9cXD9ddcYTc/s72-c/seeing_double.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4093714771754854986</id><published>2008-11-08T10:59:00.004Z</published><updated>2008-11-08T11:11:46.779Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='annealing'/><category scheme='http://www.blogger.com/atom/ns#' term='Seneca'/><title type='text'>More Annealing</title><content type='html'>Some more detail here:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SRVxkqn2QgI/AAAAAAAAAGM/YQaKa2xYmIc/s1600-h/cembrane_anneal_2000.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 285px;" 
src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SRVxkqn2QgI/AAAAAAAAAGM/YQaKa2xYmIc/s320/cembrane_anneal_2000.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5266240214010315266" /&gt;&lt;/a&gt;Here is a larger molecule, cembrane (&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=15702&amp;amp;loc=ec_rcs"&gt;pubchem link&lt;/a&gt;) which has 20 carbons. The run is longer, too, with 2,000 steps. It finds a spectrum with 100% match within 300 steps, in this case.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;An even larger example is lanosterol (&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=246983&amp;amp;loc=ec_rcs"&gt;pubchem link&lt;/a&gt;) which has 30 carbons. This actually seems to be too large for C-NMR prediction using NMRShiftDB. It doesn't get the answer within 2,000 steps, and does not look like it is on course to do so:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SRVzYSCepeI/AAAAAAAAAGU/Ec-f6gnyBL8/s1600-h/lanosterol_anneal_2000.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 285px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SRVzYSCepeI/AAAAAAAAAGU/Ec-f6gnyBL8/s320/lanosterol_anneal_2000.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5266242200275953122" /&gt;&lt;/a&gt;The highlighted step (1928) is shown as a molecule in the box marked "final", but the score graph has levelled off by about the 400th step.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4093714771754854986?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4093714771754854986/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4093714771754854986' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4093714771754854986'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4093714771754854986'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/11/more-annealing.html' title='More Annealing'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/SRVxkqn2QgI/AAAAAAAAAGM/YQaKa2xYmIc/s72-c/cembrane_anneal_2000.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-622674252865245628</id><published>2008-11-07T14:09:00.003Z</published><updated>2008-11-07T14:14:11.761Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Seneca'/><title type='text'>Adaptive Annealing Engine Test Screenshot</title><content type='html'>&lt;div&gt;So, finally, some work that I am meant to be doing (click for bigger, as usual):&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SRRMhS-3brI/AAAAAAAAAGE/wFSkomVYxig/s1600-h/annealing_screenshot.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 282px;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SRRMhS-3brI/AAAAAAAAAGE/wFSkomVYxig/s320/annealing_screenshot.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5265917999217864370" /&gt;&lt;/a&gt;It's crude, but it 
is starting to work. The screenshot shows only the first 100 steps of a run, but clearly you can get the right answer (almost by chance, actually) for such a small molecule when you are essentially just permuting the atoms.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now to test more fully...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-622674252865245628?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/622674252865245628/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=622674252865245628' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/622674252865245628'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/622674252865245628'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/11/adaptive-annealing-engine-test.html' title='Adaptive Annealing Engine Test Screenshot'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SRRMhS-3brI/AAAAAAAAAGE/wFSkomVYxig/s72-c/annealing_screenshot.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4117556173725319198</id><published>2008-10-23T23:10:00.004+01:00</published><updated>2008-10-23T23:15:50.636+01:00</updated><title type='text'>Shadows</title><content 
type='html'>Not the most important use of rendering code..&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SQD3VhJPbKI/AAAAAAAAAF8/09HY_yVFZc8/s1600-h/drop_molecule.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 276px; height: 270px;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SQD3VhJPbKI/AAAAAAAAAF8/09HY_yVFZc8/s320/drop_molecule.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5260476313815182498" /&gt;&lt;/a&gt;&lt;br /&gt;:)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4117556173725319198?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4117556173725319198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4117556173725319198' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4117556173725319198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4117556173725319198'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/shadows.html' title='Shadows'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_eL4nhrOF5R4/SQD3VhJPbKI/AAAAAAAAAF8/09HY_yVFZc8/s72-c/drop_molecule.png' height='72' 
width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-3707633093432784752</id><published>2008-10-20T06:39:00.004+01:00</published><updated>2008-10-20T07:09:58.608+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>Symbols and Elements</title><content type='html'>I have made attempts over the years to draw triangles and circles like, for example, this:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPwa6w0xlcI/AAAAAAAAAFs/AQJoX1-aeHI/s1600-h/1qgk.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPwa6w0xlcI/AAAAAAAAAFs/AQJoX1-aeHI/s320/1qgk.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5259108061702886850" /&gt;&lt;/a&gt;(in fact, this was drawn by David Westhead's code). There is a small design decision to be made between representing drawing elements as basic geometry or as &lt;span class="Apple-style-span" style="font-style: italic;"&gt;symbols&lt;/span&gt;. A symbol may be the same as a geometric element (like a circle), but it might be a combination of them. Consider the 'N' and 'C' in boxes, above. Or this image:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPwdz5bBYCI/AAAAAAAAAF0/av8svbKl2xc/s1600-h/elements_vs_symbols.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPwdz5bBYCI/AAAAAAAAAF0/av8svbKl2xc/s320/elements_vs_symbols.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5259111242286587938" /&gt;&lt;/a&gt;where the mass number and charge are separate text elements on the left, and part of the whole symbol on the right. 
Neither is a 'better' way of doing things; they each have their advantages - and disadvantages.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It can be a lot clearer to use symbols, as each model object (atom, bond, helix, gene, cell) has a corresponding representation, and then a diagram composes the symbols into a manageable whole. On the other hand, you can re-use elements in different combinations for diagrams. A general vector drawing package would have Line, Text, Rectangle, Triangle, and so on.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So my vote would be to use symbols for the CDK/JCP renderer. The JCP application is not intended to be a full vector drawing package, but a specialised chemical editor.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-3707633093432784752?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/3707633093432784752/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=3707633093432784752' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3707633093432784752'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/3707633093432784752'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/symbols-and-elements.html' title='Symbols and Elements'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SPwa6w0xlcI/AAAAAAAAAFs/AQJoX1-aeHI/s72-c/1qgk.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-8668229466650116631</id><published>2008-10-17T16:43:00.005+01:00</published><updated>2008-10-17T17:15:27.366+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>The Trouble with Tribble Visitors</title><content type='html'>So, I'm now partly sold on the power of a Visitor approach to rendering. Consider this snippet:&lt;div&gt;&lt;/div&gt;&lt;pre&gt;&lt;br /&gt;          if (clicked) diagram.accept(new DropShadowVisitor(g, 5, -5));&lt;br /&gt;          else         diagram.accept(new DrawVisitor(g));&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;This draws the diagram normally unless the mouse is clicked, in which case it draws it with a drop shadow. I see that the beauty of this is the ability to manipulate functionality as a block (just as in languages where you can pass around functions...).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, I should point out that the approach has its tricky rapids as well as such smooth sailing. The image below is a spot-the-difference (click for bigger):&lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPi0u7pVgvI/AAAAAAAAAFk/3N2Rphv37t8/s1600-h/drop_shadows_visitor.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPi0u7pVgvI/AAAAAAAAAFk/3N2Rphv37t8/s320/drop_shadows_visitor.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5258151283333104370" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;On the left is the version drawn by a naive first try at the drop visitor. 
Its methods look like this one, visitText(Text text):&lt;/div&gt;&lt;br /&gt;&lt;pre&gt;    g.setColor(Color.LIGHT_GRAY);&lt;br /&gt;    g.drawString(text.text, text.x, text.y);&lt;br /&gt;    g.setColor(Color.BLACK);&lt;br /&gt;    g.drawString(text.text, text.x + dx, text.y + dy);&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;The problem with this code is subtle: each element is visited only once, so its shadow is rendered at the same time as the element itself, and the shadow of a later element can be painted over an element that was drawn earlier. The fix was to add a boolean "drawingShadow" and to visit the elements &lt;span class="Apple-style-span" style="font-style: italic;"&gt;twice&lt;/span&gt;:&lt;/div&gt;&lt;pre&gt;    this.drawingShadow = true;&lt;br /&gt;    for (DiagramElement element : diagram.children) { element.accept(this); }&lt;br /&gt;    this.drawingShadow = false;&lt;br /&gt;    for (DiagramElement element : diagram.children) { element.accept(this); }&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;The shadow has to be drawn first, BUT first for the whole diagram, not just first for each element. 
So the new text method is:&lt;br /&gt;&lt;/div&gt;&lt;pre&gt;&lt;br /&gt;    if (this.drawingShadow) {&lt;br /&gt;        g.setColor(Color.LIGHT_GRAY);&lt;br /&gt;        g.drawString(text.text, text.x, text.y);&lt;br /&gt;    } else {&lt;br /&gt;        g.setColor(Color.BLACK);&lt;br /&gt;        g.drawString(text.text, text.x + dx, text.y + dy);&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-8668229466650116631?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/8668229466650116631/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=8668229466650116631' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8668229466650116631'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/8668229466650116631'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/trouble-with-tribble-visitors.html' title='The Trouble with Tribble Visitors'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SPi0u7pVgvI/AAAAAAAAAFk/3N2Rphv37t8/s72-c/drop_shadows_visitor.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-7018993421458969699</id><published>2008-10-16T17:10:00.005+01:00</published><updated>2008-10-16T19:28:06.231+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='UML'/><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>Arvid's Renderer Design</title><content type='html'>This is a sketch of my understanding of Arvid's renderer design:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPdpPQwDj7I/AAAAAAAAAFc/fCDGUNUuX_o/s1600-h/arvid_design.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPdpPQwDj7I/AAAAAAAAAFc/fCDGUNUuX_o/s320/arvid_design.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5257786800893824946" /&gt;&lt;/a&gt;&lt;br /&gt;I say 'my understanding' with good reason - a diagram can be biased, even a formal(ish) one like this, so I don't claim that this is definitive!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It is interesting, anyway, as it seems to be a combination of a Composite and a Visitor pattern. The RenderingModel implements IRenderingElement (and Iterable of IRenderingElements), which is Composite.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Modules are Elements in the Visitor pattern, and the Elements are Visitors.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;edit: What nonsense I talk! The IRenderingElements are Elements and the  IRenderingModules are Visitors. 
That's better.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-7018993421458969699?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/7018993421458969699/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=7018993421458969699' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7018993421458969699'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/7018993421458969699'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/arvids-renderer-design.html' title='Arvid&apos;s Renderer Design'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SPdpPQwDj7I/AAAAAAAAAFc/fCDGUNUuX_o/s72-c/arvid_design.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-1052812695178743613</id><published>2008-10-15T23:09:00.004+01:00</published><updated>2008-10-15T23:56:26.286+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='JCP'/><title type='text'>Font Management and GlyphVectors</title><content type='html'>Fonts and text are always a pain when drawing vector graphics!&lt;div&gt;&lt;br /&gt;&lt;/div&gt;After hitting myself over the head over a stupid bug in my zoom tests (like this:)&lt;blockquote&gt;JButton inButton = 
new JButton("IN");&lt;br /&gt;inButton.setActionCommand("IN");&lt;br /&gt;JButton outButton = new JButton("OUT");&lt;br /&gt;inButton.setActionCommand("OUT");&lt;/blockquote&gt;I got center scaling implemented in my branch, which lets me test a different approach to managing fonts. I had tried to be clever and do something like:&lt;div&gt;&lt;blockquote&gt;// g is assumed to be the Graphics2D being drawn on&lt;br /&gt;GlyphVector glyphs = font.createGlyphVector(g.getFontRenderContext(), "N");&lt;br /&gt;ArrayList&amp;lt;Shape&amp;gt; shapes = new ArrayList&amp;lt;Shape&amp;gt;();&lt;br /&gt;for (int i = 0; i &amp;lt; glyphs.getNumGlyphs(); i++) {&lt;br /&gt; shapes.add(affineTransform.createTransformedShape(glyphs.getGlyphOutline(i)));&lt;br /&gt;}&lt;br /&gt;&lt;/blockquote&gt;but letter shapes fill horribly with lots of missing pixels. Anyway, I eventually settled on just storing the Glyphs themselves. Less pure, but it works, and it allows the size of TextSymbols to be computed and used by the class in between drawing.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, for fonts there had to be a way in my architecture to change the font size when the whole molecule is scaled. The nicer approach is the obvious one - just to transform the Graphics object. However, it turns out there are advantages to the more cumbersome approach. Here's an image:&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPZ0Bqp7pdI/AAAAAAAAAFM/L-fdTiKnQQ0/s1600-h/font_sizes.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPZ0Bqp7pdI/AAAAAAAAAFM/L-fdTiKnQQ0/s320/font_sizes.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5257517186980488658" /&gt;&lt;/a&gt;which shows rings at various sizes, with the numbers at their centers showing the font size at that scale. An important one is the 'size' (9-5), which is 4, but no font is readable at that size. 
However, the FontManager class keeps track of how far the scale has gone below the minimum (or above the maximum) font size, and then returns to that size at the appropriate point.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-1052812695178743613?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/1052812695178743613/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=1052812695178743613' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1052812695178743613'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/1052812695178743613'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/font-management-and-glyphvectors.html' title='Font Management and GlyphVectors'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SPZ0Bqp7pdI/AAAAAAAAAFM/L-fdTiKnQQ0/s72-c/font_sizes.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-4586148551439703373</id><published>2008-10-14T15:47:00.005+01:00</published><updated>2008-10-14T15:51:36.169+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><title type='text'>A correction, and a simpler example</title><content type='html'>There was of course 
a deliberate mistake in the previous post:&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPSxOq2mdoI/AAAAAAAAAE8/Fp1xuBmKUm4/s1600-h/canon_pvr_mistake.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_eL4nhrOF5R4/SPSxOq2mdoI/AAAAAAAAAE8/Fp1xuBmKUm4/s320/canon_pvr_mistake.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5257021530627798658" /&gt;&lt;/a&gt;as the highlighting shows, the matrix for cubane numbered PVR style is neatly divided. A simpler example is cyclobutane:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPSxn07ATcI/AAAAAAAAAFE/z-b9h8hyxrA/s1600-h/canon_pvr_cyclobutane.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPSxn07ATcI/AAAAAAAAAFE/z-b9h8hyxrA/s320/canon_pvr_cyclobutane.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5257021962827353538" /&gt;&lt;/a&gt;which also shows the important point that, even if the sum of the row-numbers is the same as for other possible numberings, the PVR numbering is still the only ordered one.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-4586148551439703373?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/4586148551439703373/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=4586148551439703373' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default/4586148551439703373'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/4586148551439703373'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/correction-and-simpler-example.html' title='A correction, and a simpler example'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_eL4nhrOF5R4/SPSxOq2mdoI/AAAAAAAAAE8/Fp1xuBmKUm4/s72-c/canon_pvr_mistake.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-5810851722020932695</id><published>2008-10-12T18:18:00.007+01:00</published><updated>2008-10-12T20:17:08.333+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><title type='text'>On Canonical Numberings</title><content type='html'>So, after reading* this (2005) paper : "On Canonical Numbering of Carbon Atoms in Fullerenes : C60 Buckminsterfullerene" (&lt;a href="http://hrcak.srce.hr/index.php?lang=en&amp;amp;show=clanak&amp;amp;id_clanak_jezik=4223"&gt;link&lt;/a&gt;) I made some pictures to illustrate the difference between it and the numbering scheme used for SMILES (as described &lt;a href="http://dblp.uni-trier.de/rec/bibtex/journals/jcisd/WeiningerWW89"&gt;here&lt;/a&gt;). Er, which is used in the CDK.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, the point is that the scheme used by Plavšić, Vukičević, and Randić (or PVR as I will refer to them, I hope they don't mind!) numbers the atoms in a way that produces an adjacency matrix with a particular property. 
If you consider the rows of the matrix to be binary numbers, then the set of numbers is the smallest possible. So, for example:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPJDs8VSPgI/AAAAAAAAAEs/EUYaJf7iaOE/s1600-h/canon_pvr.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_eL4nhrOF5R4/SPJDs8VSPgI/AAAAAAAAAEs/EUYaJf7iaOE/s320/canon_pvr.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5256338154483498498" /&gt;&lt;/a&gt;&lt;div&gt;&lt;br /&gt;The structure on the left is cubane, with its adjacency matrix on the right. The column on the far right shows the rows of the matrix in base 10. They are clearly in order. Now what happens for the SMILES numbering? Well:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_eL4nhrOF5R4/SPI5fv8FovI/AAAAAAAAAEk/q1Ot0doAaIU/s1600-h/canon_smiles.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_eL4nhrOF5R4/SPI5fv8FovI/AAAAAAAAAEk/q1Ot0doAaIU/s320/canon_smiles.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5256326932702012146" /&gt;&lt;/a&gt;Here, the rows are neither in order (I'm not sure from their paper whether the ordering is an expected outcome for all structures, nor have I checked...) nor is their sum less than for the PVR scheme: 765 vs 753.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Of course, the PVR labelling would be useless for generating SMILES for cubane since there is no way to get a path from it. 
Indeed, the labels are designed to be maximally unfriendly by pairing the highest with the lowest.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Furthermore their scheme goes on to label bonds and rings:&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eL4nhrOF5R4/SPJKW2KRd3I/AAAAAAAAAE0/HK4d51GZ9H8/s1600-h/cubane_bonds_rings_PVR.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_eL4nhrOF5R4/SPJKW2KRd3I/AAAAAAAAAE0/HK4d51GZ9H8/s320/cubane_bonds_rings_PVR.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5256345471450969970" /&gt;&lt;/a&gt;Which also look quite random; or, as they say :&lt;blockquote&gt;"... we admit that the final labels ... do not appear »orderly« but one has to recognise that there is no »simple« labelling in [fullerenes] that will appear simple" &lt;/blockquote&gt;&lt;/div&gt;which makes sense. Obviously, not for cubane here, but for C60/C70 and so on it does.&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;*(probably because the name follows the "On X" paper naming scheme)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-5810851722020932695?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/5810851722020932695/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=5810851722020932695' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/5810851722020932695'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/123313693388384663/posts/default/5810851722020932695'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/on-canonical-numberings.html' title='On Canonical Numberings'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_eL4nhrOF5R4/SPJDs8VSPgI/AAAAAAAAAEs/EUYaJf7iaOE/s72-c/canon_pvr.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-95492555536520960</id><published>2008-10-10T17:59:00.002+01:00</published><updated>2008-10-10T18:07:56.001+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><title type='text'>KDTree in SymbolTree</title><content type='html'>So, the new Symbol tree base class now has a working implementation of a KDTree. (Or as close as the test can tell). What this means is that, should you ever need to edit (not just display) a very large 2D structure, you would do something like:&lt;br /&gt;&lt;blockquote&gt;SymbolTree tree = createTree();       // get a tree somehow&lt;br /&gt;tree.setHitDistance(minDistance);     // the minimum distance from mouse to symbol&lt;br /&gt;tree.useKDTree(true);                         // essential&lt;br /&gt;tree.highlightClosestSymbol(point);  // will now do a fast search for the closest&lt;/blockquote&gt;Of course, this is really only useful for a) large trees and b) when doing many "closest symbol" operations. For example, highlighting when moving the mouse.&lt;br /&gt;&lt;br /&gt;The best situation would be to dynamically call the "useKDTree" method when the size was above some threshold (100 atoms?). 
The interface is the same, either way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-95492555536520960?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/95492555536520960/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=95492555536520960' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/95492555536520960'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/95492555536520960'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/kdtree-in-symboltree.html' title='KDTree in SymbolTree'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-123313693388384663.post-9128857501885796471</id><published>2008-10-08T12:06:00.003+01:00</published><updated>2008-10-08T12:27:10.491+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cdk'/><category scheme='http://www.blogger.com/atom/ns#' term='bioclipse'/><title type='text'>Steps to move Bioclipse to the new CDK plugins</title><content type='html'>Thanks to &lt;a href="http://chem-bla-ics.blogspot.com/"&gt;egon&lt;/a&gt; the monolithic cdk plugin is now about 40 small plugins. 
Here's how to migrate...&lt;br /&gt;&lt;br /&gt;In no particular order:&lt;br /&gt;&lt;br /&gt;Step 1: Delete plugin org.openscience.cdk from the workspace&lt;br /&gt;Step 2: Download the third-party library plugins from bioclipse2/trunk/cdk-externals/&lt;br /&gt;Step 3: Download the (many) new cdk plugins from bioclipse2/trunk/cdk-externals/trunk/&lt;br /&gt;Step 4: Update those plugins that complain about missing org.openscience.cdk dependency&lt;br /&gt;(Step 5: Possibly change the Require-bundle for org.apache.log4j : &lt;a href="http://sourceforge.net/tracker2/?func=detail&amp;amp;aid=2152883&amp;amp;group_id=150681&amp;amp;atid=778609"&gt;as in this bug report&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;It should now all compile...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/123313693388384663-9128857501885796471?l=gilleain.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://gilleain.blogspot.com/feeds/9128857501885796471/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=123313693388384663&amp;postID=9128857501885796471' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/9128857501885796471'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/123313693388384663/posts/default/9128857501885796471'/><link rel='alternate' type='text/html' href='http://gilleain.blogspot.com/2008/10/steps-to-move-bioclipse-to-new-cdk.html' title='Steps to move Bioclipse to the new CDK plugins'/><author><name>gilleain</name><uri>http://www.blogger.com/profile/14491887080861584059</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
