Some Stuff

Posts

Tailor : Descriptions

Tailor (https://github.com/gilleain/tailor) is a project that grew out of my attempts to search for catmats and niches with Prof. Milner-White. The goal of the project was to allow users to define protein structural patterns (called 'descriptions') along with a set of associated measures. More on measures later, but first what is a description? Here is a very simple example: The lines don't have arrowheads, but this is implicitly a tree/DAG rather than a general graph. There is a root ProteinDescription and the leaves are AtomDescriptions - the DistanceCondition references the two atoms. Basically, this just defines a pattern of two amino acids (GLY, ALA) with a distance of less than 3Å between the N and O atoms. There are still a lot of details to be worked out here. Can the groups be separated along the chain? If they can, should that require the description to be explicit as to the relationship between sibling nodes? How do we define any number of mat...
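As a rough sketch of how such a description tree could be held in code: this is not Tailor's actual API. Only the names ProteinDescription, AtomDescription and DistanceCondition come from the post; `GroupDescription`, all the field names, the `holds` method, and the choice of GLY owning the N and ALA owning the O are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AtomDescription:
    residue: str  # assumed field: residue name of the parent group
    name: str     # atom name, e.g. "N" or "O"


@dataclass
class GroupDescription:  # hypothetical intermediate node between root and leaves
    residue: str
    atoms: list = field(default_factory=list)


@dataclass
class DistanceCondition:
    atom_a: AtomDescription
    atom_b: AtomDescription
    max_distance: float  # in angstroms

    def holds(self, coords):
        """coords maps each AtomDescription to an (x, y, z) tuple."""
        return math.dist(coords[self.atom_a], coords[self.atom_b]) < self.max_distance


@dataclass
class ProteinDescription:
    groups: list = field(default_factory=list)
    conditions: list = field(default_factory=list)


gly_n = AtomDescription("GLY", "N")
ala_o = AtomDescription("ALA", "O")
description = ProteinDescription(
    groups=[GroupDescription("GLY", [gly_n]), GroupDescription("ALA", [ala_o])],
    conditions=[DistanceCondition(gly_n, ala_o, 3.0)],
)

# Made-up coordinates, 2.9 angstroms apart along the x axis.
coords = {gly_n: (0.0, 0.0, 0.0), ala_o: (2.9, 0.0, 0.0)}
matches = all(condition.holds(coords) for condition in description.conditions)
```

The condition holds references to the leaf atoms rather than sitting inside the tree, which is one way to keep the structure a tree while still relating two leaves.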

WTF is a Number Bond?

Not chemistry, as it happens. I was searching for similar images to one of my line drawings (always fun) and came across these 'number bond' things: The one on the left - hilariously - is just "1 + 1 = 2". Ok, so that's a deliberately jokey example; real ones have larger numbers and one of the three numbers is for the student to fill in. On the right is a more complex example, drawn as a DAG (directed acyclic graph) although at least one of the examples I saw had a node at the bottom with three parents! In any case, what these things really are representing is partitions of numbers - which are usually drawn as Ferrers diagrams (or Young tableaux) which I'll refer to as "Ferrers-Young diagrams". These have a superior feature as shown here: So one FY-diagram can represent two different number bonds. Note that I've made the crazy leap of making number bonds with more than two parts (or 'addends'). Clearly 1+1+3+4 = 9 = ...
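One way to see how a single diagram can carry two number bonds is to read it by rows and then by columns, which gives a partition and its conjugate. That reading is an assumption about what the missing picture shows; the function name `conjugate` is just illustrative.

```python
def conjugate(partition):
    """Transpose the Ferrers diagram of a partition given in non-increasing order.

    Entry `row` of the result counts how many parts are longer than `row`,
    which is the length of that column in the original diagram.
    """
    if not partition:
        return []
    return [sum(1 for part in partition if part > row) for row in range(partition[0])]


bond = [4, 3, 1, 1]       # the parts 1 + 1 + 3 + 4 from the text, largest first
flipped = conjugate(bond)  # the same diagram read along its columns
```

Both lists sum to the same total, and conjugating twice returns the original partition.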

Submultisets and Graphs

The previous post mentioned the restricted weak composition (RWC), but didn't expand on it at all! Basically, I found this excellent paper: "Generalized Algorithm for Restricted Weak Composition Generation" by Daniel R. Page. It even gave some Java code in an appendix - good stuff :) Anyway, an RWC is a composition (which is a partition where order matters) that is weak - has zeros in it - and the parts are restricted. So [1, 0, 1, 1] is a weak composition of 3 into 4 parts and let's say we have restricted the parts to {0, 1, 2}. Here is an overview of the scheme: where we take a degree sequence, convert to a multiset and use an RWC to get a particular sub-multiset. This allows us to take some count of some subset of elements from the multiset. Doing this for all sub-multisets at each round should then allow us to list graphs - although not without redundant examples: This shows all starting points for 3 -> [3, 2, 2, 1, 1] and ...
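To make the definition concrete, here is a minimal sketch that lists restricted weak compositions by plain recursion. It is not the algorithm from Page's paper, just a brute-force enumeration; the function name and parameters are made up for this example.

```python
def restricted_weak_compositions(total, parts, allowed):
    """Yield every ordered list of `parts` values drawn from `allowed` that sums to `total`."""
    allowed = sorted(set(allowed))

    def extend(remaining, slots):
        if slots == 0:
            if remaining == 0:
                yield []
            return
        for value in allowed:
            if value > remaining:
                break  # allowed is sorted, so no larger value fits either
            for rest in extend(remaining - value, slots - 1):
                yield [value] + rest

    yield from extend(total, parts)


# The example from the text: 3 into 4 parts, each part restricted to {0, 1, 2}.
rwcs = list(restricted_weak_compositions(3, 4, {0, 1, 2}))
```

Each composition can then be read as "how many copies of each distinct element to take", which is how a sub-multiset gets picked out.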

Restricted Weak Compositions, Labelled Partitions, and Trees

So in the last post about listing trees I outlined a slightly cumbersome method to list trees from degree sequences. Thinking about it a bit more, it would probably be far easier to just list all trees on some number of vertices and filter out by degree sequence. I talked a little about the WROM algorithm in this old post which is a constant time generator of 'free' (unlabelled) trees. Anyway, that's boring so I was trying out the more complicated approach. It looks like generating a single tree from a degree sequence is as simple as the Havel-Hakimi method. Connect the largest degree (dn) to the dn next largest degrees. Also maintain a list of vertices that have already been connected to, and then at the next step connect only to those not already connected to. So, for [3, 3, 2, 1, 1, 1, 1] we get: You might notice that trees a) and c) are isomorphic. Below the trees labelled by degree are the same trees labelled by DFS discovery order, and below that th...
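The "connect only to vertices not already connected to" idea can be sketched as below. This is one reading of the method, not the code behind the post: vertices are numbered by position in the sorted degree sequence, and the function name is hypothetical.

```python
def tree_from_degrees(degrees):
    """Build one tree (as an edge list) realising a tree degree sequence.

    Vertices are numbered 0..n-1 in order of decreasing degree. Each vertex
    is joined to the next vertices that have no edge yet, so no cycle can form.
    """
    n = len(degrees)
    if n < 2 or min(degrees) < 1 or sum(degrees) != 2 * (n - 1):
        raise ValueError("not the degree sequence of a tree on two or more vertices")
    seq = sorted(degrees, reverse=True)
    edges = []
    next_free = 1  # smallest vertex that has not been connected to yet
    for v in range(n):
        # The root needs all its edges; every other vertex already has one (to its parent).
        children = seq[v] if v == 0 else seq[v] - 1
        for _ in range(children):
            edges.append((v, next_free))
            next_free += 1
    return edges


edges = tree_from_degrees([3, 3, 2, 1, 1, 1, 1])
```

This only produces a single tree per degree sequence; listing all of them still needs the more involved machinery.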

Listing Degree Restricted Trees

Although Stack Overflow is generally just an endless source of questions on the lines of "HALP plz give CODES!? ... NOT homeWORK!! - don't close :(" occasionally you get more interesting ones. For example this one that asks about degree-restricted trees. Also there's some stuff about vertex labelling, but I think I've slightly missed something there. In any case, let's look at the simpler problem: listing non-isomorphic trees with max degree 3. It's a nice small example of a general approach that I've been thinking about. The idea is to: (1) given N vertices, partition 2(N - 1) into N parts of at most 3 -> D = {d0, d1, ... }; (2) for each d_i in D, connect the degrees in all possible ways that make trees; (3) filter out duplicates within each set generated by some d_i. Hmm. Sure would be nice to have maths formatting on Blogger.... Anyway, look at this example for partitioning 12 into 7 parts: At the top are the partitions, in the middle the trees...
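The first step, partitioning 2(N - 1) into N parts of at most 3, can be sketched like this. The function name and its parameters are assumptions for illustration, and each part is taken to be at least 1 since every vertex of a tree on two or more vertices has an edge.

```python
def bounded_partitions(total, parts, max_part, min_part=1):
    """Yield partitions of `total` into exactly `parts` parts, largest first,
    with every part between `min_part` and `max_part`."""

    def extend(remaining, slots, cap):
        if slots == 0:
            if remaining == 0:
                yield []
            return
        # The first part must leave at least min_part for each later slot...
        high = min(cap, remaining - min_part * (slots - 1))
        # ...and be at least the average, because later parts cannot be larger.
        low = max(min_part, -(-remaining // slots))
        for first in range(high, low - 1, -1):
            for rest in extend(remaining - first, slots - 1, first):
                yield [first] + rest

    yield from extend(total, parts, max_part)


# N = 7 vertices: partition 2 * (7 - 1) = 12 into 7 parts of at most 3.
degree_sequences = list(bounded_partitions(12, 7, 3))
```

Each resulting list is a candidate degree sequence to feed into the tree-building step.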

Equitable Partition Refinement with List Invariants

So the bug with C19H14 and C10H16 formulae seems to have been due to the partition refinement not correctly labelling structures with particular arrangements of multiple bonds. The underlying problem is in the equitable partition refinement process. This is a short note about the problem. Equitable refinement of a partition for a graph is the formation of a vertex partition where each element of each block of the partition has an equal number of neighbours in the other blocks. This is a little difficult to imagine, but it is - roughly - a generalisation of the Morgan number algorithm which attempts to find labels for sets of vertices which are stable with respect to splitting them by the labels of the neighbours. For example: This image shows a cub-2-ene like molecule (or a cube graph with two of the edges colored). Clearly the orange and green vertices are in 'different' sets in some sense. Precisely, they are in different blocks of the equitable partition ([0,2,5,7|1...
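For reference, a bare-bones version of equitable refinement on a simple graph might look like the sketch below. It ignores bond orders and edge colours entirely (which is exactly where the bug described here lives), and the function name and adjacency-dict input are assumptions for this example.

```python
def equitable_refinement(adjacency, partition):
    """Split blocks until every vertex in a block has the same number of
    neighbours in each block of the partition.

    adjacency maps each vertex to an iterable of its neighbours;
    partition is a list of lists of vertices.
    """
    partition = [sorted(block) for block in partition]
    while True:
        block_of = {v: i for i, block in enumerate(partition) for v in block}
        refined = []
        for block in partition:
            groups = {}
            for v in block:
                counts = [0] * len(partition)
                for w in adjacency[v]:
                    counts[block_of[w]] += 1
                groups.setdefault(tuple(counts), []).append(v)
            # Sort by neighbour-count signature so the output order is deterministic.
            for signature in sorted(groups):
                refined.append(groups[signature])
        if len(refined) == len(partition):
            return refined  # nothing split, so the partition is equitable
        partition = refined


# A path 0-1-2-3: the end vertices separate from the middle ones.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
blocks = equitable_refinement(path, [[0, 1, 2, 3]])
```

Handling multiple bonds would mean replacing the plain neighbour counts with something richer per block, such as a list of bond orders.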