So, after some discussion on CDK-devel (devel archive) about the PDBReader in CDK versus the BioJava PDBReader, I realised that I don't actually know how these different frameworks model biopolymers.
This is my attempt to understand them:
It shows a partial hierarchy for each framework, without the detail of which are classes and which are interfaces. Notably, BioJava has an interface and an implementing class for each of the things shown.
One thing I should point out is that "Strand" in CDK should really (really) be called "Chain", as in BioJava and Jmol (and probably every other known framework). A kind of spelling mistake, I think. I also don't understand what PhosphorusMonomer and PhosphorusPolymer are in Jmol.
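To make the "interface plus implementing class" idea concrete, here is a minimal Java sketch of a chain/group/atom hierarchy. Every name in it (BiopolymerSketch, Chain, ChainImpl, Group, GroupImpl, Atom, AtomImpl, countAtoms) is made up for illustration; this is not the actual API of BioJava, CDK, or Jmol, just the general shape of the pattern as I understand it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout: a sketch of the interface-plus-implementation
// pattern, not the real BioJava, CDK, or Jmol classes.
public class BiopolymerSketch {

    // Lowest level: a single atom, identified by a name such as "CA".
    public interface Atom {
        String getName();
    }

    // A monomer-level unit (amino acid or nucleotide) that owns atoms.
    public interface Group {
        String getName();
        List<Atom> getAtoms();
        void addAtom(Atom atom);
    }

    // A chain (what CDK calls a "Strand") that owns groups in sequence order.
    public interface Chain {
        String getId();
        List<Group> getGroups();
        void addGroup(Group group);
    }

    public static class AtomImpl implements Atom {
        private final String name;
        public AtomImpl(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    public static class GroupImpl implements Group {
        private final String name;
        private final List<Atom> atoms = new ArrayList<>();
        public GroupImpl(String name) { this.name = name; }
        @Override public String getName() { return name; }
        @Override public List<Atom> getAtoms() { return atoms; }
        @Override public void addAtom(Atom atom) { atoms.add(atom); }
    }

    public static class ChainImpl implements Chain {
        private final String id;
        private final List<Group> groups = new ArrayList<>();
        public ChainImpl(String id) { this.id = id; }
        @Override public String getId() { return id; }
        @Override public List<Group> getGroups() { return groups; }
        @Override public void addGroup(Group group) { groups.add(group); }
    }

    // Walks the hierarchy top-down and counts every atom in the chain.
    public static int countAtoms(Chain chain) {
        int total = 0;
        for (Group group : chain.getGroups()) {
            total += group.getAtoms().size();
        }
        return total;
    }

    public static void main(String[] args) {
        Chain chain = new ChainImpl("A");
        Group gly = new GroupImpl("GLY");
        gly.addAtom(new AtomImpl("N"));
        gly.addAtom(new AtomImpl("CA"));
        chain.addGroup(gly);
        System.out.println(countAtoms(chain)); // prints 2
    }
}
```

Code that only walks the hierarchy (like countAtoms here) depends on the interfaces alone, so a different implementing class could be swapped in without touching it.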
Comments
Just to quickly add: BioJava can also represent multiple models within PDB files.
There are different iterators and utility classes that let you access arrays of atoms, iterate over all amino acids, do calculations, etc. It can also deal with ATOM vs SEQRES issues.
Since all objects in the data model are JavaBeans, it is also possible to serialize a structure back to a database using Hibernate.
For more documentation on what you can do with the BioJava structure modules, see here:
http://biojava.org/wiki/BioJava:CookBook#Protein_Structure
Andreas
The ATOM<->SEQRES stuff is also helpful.
I'm not so fussed about the Hibernate aspect. I suppose it could be good, but I don't think that should be the driver behind a design.
I guess that there's more than one way to skin a protein...