
4.6 Perl Molecule - A Class Library for Preparing Ligand Binding Studies

4.6.3 Class Hierarchy and Ontology of Perl Molecule

Currently, Perl Molecule consists of 34 classes and about 10000 lines of code, so it is significantly more complex than QMPB with its 6 classes and about 2000 lines of code. This complexity naturally limits the level of detail at which the classes can be described here. I review the classes and discuss the most important methods in somewhat more detail.

Figure 4.19. The composition hierarchy diagram shows the taxonomic dependence of classes in the ontology of Perl Molecule. The arrows with a lozenge point towards the aggregating class, and the other end points to the components. The numbers give the order of association. The lines indicate important associations that are not compositions. The classes are color coded in this section.

Perl Molecule implements an ontology², i.e., a number of concepts are described by classes.

As already pointed out in the introduction to this section, biomolecular complexes (here simply called molecules) can be viewed hierarchically. A molecule is described by a set of conformers, each conformer consists of chains, each chain consists of residues, and each residue consists of atoms. Such a relationship between complex objects can be modeled by component objects, where each object is usually part of another object (composition). Many operations (methods) act with cascading semantics, i.e., deleting a residue also deletes all atoms that are part of the residue.

² An ontology is “A systematic arrangement of all of the important categories of objects or concepts which exist in some field of discourse, showing the relations between them. When complete, an ontology is a categorization of all of the concepts in some field of knowledge, including the objects and all of the properties, relations, and functions needed to define the objects and specify their actions. A simplified ontology may contain only a hierarchical classification (a taxonomy) showing the type subsumption relations between concepts in the field of discourse. An ontology may be visualized as an abstract graph with nodes and labeled arcs representing the objects and relations.

Note: The concepts included in an ontology and the hierarchical ordering will be to a certain extent arbitrary, depending upon the purpose for which the ontology is created. This arises from the fact that objects are of varying importance for different purposes, and different properties of objects may be chosen as the criteria by which objects are classified. In addition, different degrees of aggregation of concepts may be used, and distinctions of importance for one purpose may be of no concern for a different purpose.” (From The Collaborative International Dictionary of English v.0.48)

The Container Class

To realize a composite pattern [135], a general abstract class container (Fig. 4.20) was written. It implements the methods to manage the set of container objects it stores. Each container object holds one reference to the superior³ container object it is part of. The methods add and delete call the constructor and destructor internally, but also call methods of the superior object to add or delete the current object. There are methods for moving an object from one container to another and for joining the components of two containers into one. The members are generally referred to by a name (label), but unlike hashes in Perl, the container class keeps track of their order. By default, the order is the order in which the components were added, but there are also methods for sorting the components. There are various accessor methods for the components (e.g., by label, by index in the current order, or by an index number specific to a particular object type). There is a foreach iterator method, which allows operations such as “foreach atom in chain A...” instead of requiring a method in the chain class that iterates over all residues and all atoms in each residue.
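The behaviour described for the container class can be illustrated with a minimal Python sketch. It is not the Perl implementation: the class and method names (Container, by_label, by_index, foreach) are hypothetical stand-ins chosen for this example, and only labelled, ordered storage, the reference to the superior object, cascading deletion and type-based iteration are modeled.

```python
from typing import Dict, Iterator, Optional


class Container:
    """Ordered, labelled collection of child containers (composite pattern)."""

    def __init__(self, label: str, parent: Optional["Container"] = None):
        self.label = label
        self.parent = parent  # reference to the superior container
        self._members: Dict[str, "Container"] = {}  # dicts keep insertion order
        if parent is not None:
            parent._members[label] = self

    def add(self, label: str, cls: Optional[type] = None) -> "Container":
        """Construct a child of type cls and register it under label."""
        if label in self._members:
            raise KeyError(f"duplicate label: {label}")
        return (cls or Container)(label, parent=self)

    def delete(self, label: str) -> None:
        """Remove a child; its descendants become unreachable with it."""
        child = self._members.pop(label)
        child.parent = None

    def by_label(self, label: str) -> "Container":
        return self._members[label]

    def by_index(self, index: int) -> "Container":
        return list(self._members.values())[index]

    def foreach(self, cls: type) -> Iterator["Container"]:
        """Yield every descendant of type cls, depth-first, in stored order."""
        for member in self._members.values():
            if isinstance(member, cls):
                yield member
            yield from member.foreach(cls)


class Chain(Container): pass
class Residue(Container): pass
class Atom(Container): pass


# "foreach atom in chain A" without a dedicated method in the chain class
chain = Chain("A")
ala = chain.add("ALA1", Residue)
for name in ("N", "CA", "C", "O", "CB"):
    ala.add(name, Atom)
gly = chain.add("GLY2", Residue)
for name in ("N", "CA", "C", "O"):
    gly.add(name, Atom)

labels = [atom.label for atom in chain.foreach(Atom)]

chain.delete("ALA1")  # cascades: the atoms of ALA1 are gone as well
remaining = [atom.label for atom in chain.foreach(Atom)]
```

Because iteration is implemented once in the base class, any level of the hierarchy can ask for any descendant type, which is the design benefit the text attributes to the foreach iterator.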

The container class is general and can be used for many applications. For example, it would make sense to derive the class site master in QMPB from this class, because a major function of that class is to manage site objects. (This requires that the site class is also derived from the container class, even though it contains no further objects, as is also the case, e.g., for the coordinate and charge classes in Perl Molecule.)

The MMcontainer Class

The MMcontainer class is derived from the container class (Fig. 4.20) and adds some methods useful to all molecular modeling child classes. For example, the method printstatistics is implemented here, which iterates over the object composition to report how many atoms are in a residue, how many residues with how many atoms in total are in a chain, etc. It also provides a general write method, which allows writing all atoms contained (indirectly) in an object to a file in various pdb or pqr formats.
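The kind of summary gathered by such a statistics method can be sketched in Python as follows. The nested dictionary layout (chain, then residue, then list of atom names) and the function name are assumptions made for this illustration only; the thesis code walks the object composition instead.

```python
def composition_statistics(chains):
    """Count residues and atoms per chain for a nested chain/residue/atom dict."""
    report = {}
    for chain_id, residues in chains.items():
        atoms_per_residue = {label: len(atoms) for label, atoms in residues.items()}
        report[chain_id] = {
            "residues": len(atoms_per_residue),
            "atoms": sum(atoms_per_residue.values()),
            "atoms_per_residue": atoms_per_residue,
        }
    return report


# Hypothetical two-residue chain (heavy atoms only)
molecule = {
    "A": {
        "ALA1": ["N", "CA", "C", "O", "CB"],
        "GLY2": ["N", "CA", "C", "O"],
    }
}
stats = composition_statistics(molecule)
```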

The Molecule Class

The root of the ontological tree of Perl Molecule is the molecule class (Fig. 4.21). It describes a biomolecular complex by a number of discrete conformers. In Perl Molecule, it provides an interface to the objects it is built up from by cascading or delegating method calls. A number of methods were already described in the previous section. In the next section, some more advanced operations are shown, which require calling methods of other classes as well. Of the 47 methods of this class, only a few can be discussed briefly here. The methods read pdb and read pqr aim to be very flexible reading routines for the two file formats. They should be able to handle many of the various versions of the file formats that different programs produce.

Internally, a hash of the data is generated from each ATOM or HETATM line and passed to the method setup (defined in the MMcontainer class), which generates appropriate objects at each level of the Perl Molecule ontological hierarchy. The methods write pdb and write pqr call the method write for each atom with parameters appropriate for the two file formats.

³ In the code, the instance variable inferior is used, looking from the leaves down to the roots of the tree. Consistently, the abstract method superior type returns the classname of the inferior class for all derived classes. It seems that the biology oriented study left its traces...

Figure 4.20. Most Perl Molecule classes are derived from the abstract container class. The Perl module Clone is used by the copy method. The abstract MMcontainer class is derived from the container class.

Figure 4.21. The molecule class is derived from the MMcontainer class. It contains objects of the conformer class. Classes which are imported are linked by an arrow originating at the class name.
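Turning an ATOM or HETATM line into a hash of fields, and writing it back, can be sketched in Python as below. The column positions follow the standard fixed-column PDB layout; the function names and dictionary keys are hypothetical, and the tolerant handling of format variants that the reading routines above aim for is not reproduced.

```python
from typing import Optional


def _optional_float(field: str, default: float) -> float:
    field = field.strip()
    return float(field) if field else default


def parse_atom_line(line: str) -> Optional[dict]:
    """Return the fields of one ATOM/HETATM record, or None for other lines."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    return {
        "record": record,
        "serial": int(line[6:11]),
        "atom_name": line[12:16].strip(),
        "alt_loc": line[16:17].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21:22].strip(),
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": _optional_float(line[54:60], 1.0),
        "bfactor": _optional_float(line[60:66], 0.0),
    }


def format_atom_line(atom: dict) -> str:
    """Write the fields back in fixed-column form (atom names up to 3 characters)."""
    return "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        atom["record"], atom["serial"], " " + atom["atom_name"], atom["alt_loc"],
        atom["res_name"], atom["chain_id"], atom["res_seq"],
        atom["x"], atom["y"], atom["z"], atom["occupancy"], atom["bfactor"],
    )


sample = format_atom_line({
    "record": "ATOM", "serial": 1, "atom_name": "N", "alt_loc": "",
    "res_name": "ALA", "chain_id": "A", "res_seq": 1,
    "x": 11.104, "y": 6.134, "z": -6.504, "occupancy": 1.0, "bfactor": 0.0,
})
parsed = parse_atom_line(sample)
```

In the design described above, a dictionary of this kind would be handed to a setup routine that creates the chain, residue and atom objects it names if they do not exist yet.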

The method add hydrogen includes the code for splitting the molecule into parts that can be processed by Hwire and for joining the parts into a single molecular structure after the external program has run. The method add radii constructs an object of the class bondi to read radii from a file and then iterates over all atoms, calling their method set radius with the method get element of the bondi object as parameter. The method read rotamers creates rotamer library objects of the dunbrack class. Using this class, the backbone-dependent and backbone-independent rotamer libraries of Dunbrack [101, 102] can be read and used to generate sidechain rotamers. The method setup topology is called by the methods setup graph and setup charge groups to construct an object of the topology class, which contains a method following the pattern read charmm or read amber for each supported force field topology format.

The Conformer Class

The conformer class describes a discrete global conformation of the molecule (Fig. 4.22). In the generalized ligand binding theory, all instances of all sites have to be recomputed for each conformer. In contrast, for rotamers only additional instances need to be calculated, which requires a significantly lower computational effort. However, due to the approximations made for rotamers, they are only valid for small local structural changes.

Each conformer can contain components of different classes. For the molecular description, it contains one or more objects of the chain class. The ligand binding sites of the molecule are stored in objects of the site class in the conformer object. Coordinate and charge sets are stored in objects of the coordinate set and charge set classes, and charge groups are stored in objects of the charge group class. Since a site, coordinate set, charge set or charge group can extend over multiple chains but is always associated with a particular conformer, the branching of the ontological graph at this point is reasonable. Most functionality of the conformer class is inherited from the container and MMcontainer classes. It provides some methods for cascading calls to components. Perhaps the most important method is conformer::write qmpb⁴, which is called by the method molecule::write qmpb. It checks the elements of the %qmpb hash for completeness and writes them into the general section of a QMPB input file in a subdirectory named after the conformer. For each instance of each site, instance::write qmpb is called.

The Chain Class

The chain class represents a particular polymer chain in a certain conformation (Fig. 4.22). It only contains objects of the residue class. All its functionality is inherited from the container and MMcontainer classes.

The Residue Class

The residue class represents a particular residue in a polymer chain (Fig. 4.22). It only contains objects of the atom class. Most of its functionality is inherited from the container and MMcontainer classes. The residue class has a number of methods dealing with dihedral angles and rotamers of the side chain. The method dihedral by atomlabels calculates the dihedral angle between four atoms of this residue. If an atom has multiple coordinates, only the first coordinate is taken into account. The label “-C” refers to the carbonyl carbon of the previous residue, which is resolved by the topology graph. The dihedral is computed by the method vector::get torsion. The method gen sidechain recursively generates a hash with sidechain atom names as keys and references to the objects as values. The method starts at a given atom and includes in the hash all atoms that are bound to this atom, unless they are either in a hash of excluded atoms or already in the hash of found sidechain atoms. For example, starting at the atom “CB” and excluding the atom “CA”, the method would find the sidechain of each amino acid except glycine and proline based on topological information. The method set dihedral by atomlabels changes the dihedral angle between two atoms given by their atom labels. The method takes a hash of sidechain atoms as generated by gen sidechain and the target dihedral angle as further parameters. First, the rotation matrix is calculated by vector::get rotation matrix. Then the torsion matrix is applied to each atom in the hash of sidechain atoms using vector::my change tor matrix.

⁴ I will use the shorthand notation class::method to refer to a method of a particular class. If no class is specified, I refer to the class discussed in the current paragraph.

Figure 4.22. The conformer, chain and residue classes are derived from the MMcontainer class. The three classes contain objects of the chain, residue and atom class, respectively, and other classes. Classes which are imported are linked by an arrow originating at the class name.
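Measuring a dihedral angle from four positions and then rotating the movable atoms about the central bond to reach a target angle can be sketched in Python as follows. This is a generic textbook formulation (atan2 form of the torsion angle, Rodrigues' rotation formula) and not the code of the vector class; all function names are hypothetical, and the coordinates are made-up test points rather than real atom positions.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for the point sequence p0-p1-p2-p3."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def rotate_about_bond(point, origin, axis_end, angle_deg):
    """Rotate point about the axis origin -> axis_end (Rodrigues' formula)."""
    axis = sub(axis_end, origin)
    norm = math.sqrt(dot(axis, axis))
    k = tuple(c / norm for c in axis)
    v = sub(point, origin)
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    kxv, kv = cross(k, v), dot(k, v)
    return tuple(origin[i] + v[i] * c + kxv[i] * s + k[i] * kv * (1 - c)
                 for i in range(3))


def set_dihedral(p0, p1, p2, movable, target_deg):
    """Rotate all movable points so that p0-p1-p2-movable[0] reaches target_deg."""
    delta = target_deg - dihedral(p0, p1, p2, movable[0])
    return [rotate_about_bond(p, p1, p2, delta) for p in movable]


# Made-up geometry: the last two points play the role of the sidechain atoms
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
sidechain = [(1.0, 0.0, 1.0), (1.0, 0.0, 2.0)]

before = dihedral(p0, p1, p2, sidechain[0])
moved = set_dihedral(p0, p1, p2, sidechain, 60.0)
after = dihedral(p0, p1, p2, moved[0])
```

Rotating every collected sidechain atom by the same angle about the same bond keeps the internal geometry of the sidechain intact, which is why the list of movable atoms is passed in as a whole.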

The method generate rotamers is called by the method of the same name of the molecule class. It allows generating all rotamers in a rotamer library using the methods discussed for this class.
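The recursive sidechain collection described for gen sidechain can be sketched in Python as follows. The sketch returns a set of atom names instead of a hash of object references, and the bond lists are simplified heavy-atom topologies written out by hand for this example; both are assumptions, not data from the thesis.

```python
def build_bond_graph(bonds):
    """Turn a list of (atom, atom) bonds into an adjacency dictionary."""
    graph = {}
    for a, b in bonds:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def gen_sidechain(graph, start, excluded, found=None):
    """Collect all atoms reachable from start without passing excluded atoms."""
    if found is None:
        found = set()
    if start in excluded or start in found:
        return found
    found.add(start)
    for neighbour in graph.get(start, ()):
        gen_sidechain(graph, neighbour, excluded, found)
    return found


# Leucine: the sidechain is a tree hanging off CA
leucine = build_bond_graph([
    ("N", "CA"), ("CA", "C"), ("C", "O"),
    ("CA", "CB"), ("CB", "CG"), ("CG", "CD1"), ("CG", "CD2"),
])
leu_sidechain = gen_sidechain(leucine, "CB", excluded={"CA"})

# Proline: the ring bond CD-N leads back into the backbone
proline = build_bond_graph([
    ("N", "CA"), ("CA", "C"), ("C", "O"),
    ("CA", "CB"), ("CB", "CG"), ("CG", "CD"), ("CD", "N"),
])
pro_sidechain = gen_sidechain(proline, "CB", excluded={"CA"})
```

In this sketch the proline traversal also picks up the backbone nitrogen through the ring closure, which is presumably why proline is listed as an exception above; glycine has no CB atom to start from.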

The Atom Class

The atom class represents a particular atom in a residue (Fig. 4.23). It contains objects of the coordinate and charge class. The atom radius is assumed to be fixed for a given atom and not to vary between instances. Some of its functionality is inherited from the container and MMcontainer classes, but a substantial amount of functionality is added by over thirty additional methods. The setup method is overridden, because a number of instance variables are set, an object of the coordinate class is generated for each coordinate of the atom, and an object of the charge class is generated (via the add charge method) for each charge. The write method is called by the methods molecule::write pdb and molecule::write pqr as well as by many other methods that need to write parts of the structure in pdb or pqr format. The write method itself is mainly concerned with printing a line for each instance the atom is associated with. The actual formatted printing is done by the method write pdb.

The method setup bonded creates a hash of atoms bonded to this atom. This information is used by methods iterating through the topology of the molecule. The method setup charge groups, called by the method molecule::setup charge groups, constructs a new object of the chargegroup class, if it does not yet exist, as a component of the conformer object to which the atom object belongs. It calls the method chargegroup::setup members for the charge group defined in the topology object and adds the current atom by the method chargegroup::add atom.

The method atom::setup sites is called by the method molecule::setup sites to create objects of the site class from rotamer forms and charge forms. A new object of the site class is constructed if the site of the atom is undefined but a charge set or coordinate set is defined for the atom. Each atom contained in the charge set or coordinate set is added to the new site by the method site::add atom. If an atom already belongs to another site, the two sites are joined into a larger site. Finally, the method site::setup is called.

The setup rotamers method adds the atom to a coordinate set and its coordinate objects to a rotamer form object. If an atom does not yet belong to a coordinate set object but has more than one object of the coordinate class as component, the method checks whether any bound atom or any atom bound to a bound atom (angle relationship) already belongs to a coordinate set. If so, the atom is added to that coordinate set by the add atom method. If no coordinate set is found, a new coordinate set object is constructed as a component of the conformer object to which the atom belongs, and the atom is added via add atom as before. For each coordinate object of the atom, it is checked whether a rotamer form with the same occupancy tag already exists for the coordinate set. If not, it is created by the method coordinate set::add form. The coordinate is added by the method rotamer form::add coordinate. A rotamer energy is either computed from the probability given by the occupancy, as provided for crystal structures, or looked up from pre-calculated force-field energies according to a calculated torsion angle, and is assigned to the rotamer form by the method set energy.

Figure 4.23. The atom class is derived from the MMcontainer class. It contains objects of the coordinate and charge class. Classes which are imported are linked by an arrow originating at the class name. An atom object uses the attributes site, charge set, coord set and chargegroup to store references to the objects of the respective classes to which it belongs.

Figure 4.24. The coordinate and charge classes are derived from the container class. Classes which are imported are linked by an arrow originating at the class name. The most important attributes are, for the coordinate class, the three cartesian coordinates x, y and z and, for the charge class, the charge q.
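The joining of sites described for atom::setup sites can be sketched in Python as follows. The sketch works on whole charge sets and coordinate sets at once rather than atom by atom, represents each set as a plain collection of atom names, and uses a hypothetical function name; it only illustrates the rule that overlapping sets end up in one larger site.

```python
def assign_sites(groups):
    """Merge overlapping atom groups (charge or coordinate sets) into sites."""
    site_of = {}  # atom name -> the site (a shared set object) it belongs to
    for group in groups:
        site = set()
        for atom in group:
            other = site_of.get(atom)
            if other is not None and other is not site:
                site |= other  # the atom already has a site: join the two
            site.add(atom)
        for atom in site:
            site_of[atom] = site
    # Collect each distinct site object once
    sites = []
    for site in site_of.values():
        if not any(site is known for known in sites):
            sites.append(site)
    return sites


# Hypothetical input: a charge set and a coordinate set that share two atoms,
# plus an unrelated charge set
groups = [
    ["OD1", "OD2", "HD2"],       # charge set
    ["CB", "CG", "OD1", "OD2"],  # coordinate set overlapping the charge set
    ["NE2", "HE2"],              # separate charge set
]
sites = assign_sites(groups)
site_atoms = sorted(sorted(site) for site in sites)
```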

The Coordinate Class

The coordinate class represents a particular coordinate of an atom (Fig. 4.24). Functionality is inherited from the container class, even though it is not meant to contain any components. However, it uses the methods of the parent class for adding itself to and deleting itself from the superior atom object. The most important attributes are the three cartesian coordinates x, y and z, but the occupancy and bfactor are also stored as they were given in the pdb-file. The method get energy by occupancy calculates an energy by inverting the equation of the Boltzmann probability (see section 2.4.2). Most other methods are simple accessor methods, which can be generated automatically by the Perl module Class::MethodMaker, as is also done for attributes of other classes of Perl Molecule.

Figure 4.25. The coordinate set and rotamer form classes are derived from the container class. Classes which are imported are linked by an arrow originating at the class name.
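Inverting the Boltzmann probability can be sketched in Python as follows. The sketch assumes the standard form p_i = exp(-E_i/RT)/Z, so that E_i = -RT ln p_i up to the constant -RT ln Z, which is dropped here; the exact expression in section 2.4.2 is not reproduced in this section, and the units (kcal/mol), the temperature of 300 K and the function names are assumptions made for this example.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol K)


def energy_by_occupancy(occupancy, temperature=300.0):
    """Energy corresponding to an occupancy, E = -RT ln p (constant offset dropped)."""
    if not 0.0 < occupancy <= 1.0:
        raise ValueError("occupancy must be in (0, 1]")
    return -GAS_CONSTANT * temperature * math.log(occupancy)


def occupancies_from_energies(energies, temperature=300.0):
    """Forward direction: normalized Boltzmann probabilities from energies."""
    rt = GAS_CONSTANT * temperature
    weights = [math.exp(-energy / rt) for energy in energies]
    total = sum(weights)
    return [weight / total for weight in weights]


# Two alternate positions of one atom with occupancies 0.7 and 0.3
energies = [energy_by_occupancy(p) for p in (0.7, 0.3)]
recovered = occupancies_from_energies(energies)
```

Only energy differences between the forms of one coordinate set matter for the relative populations, which is why dropping the common offset is harmless in this sketch.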

The Charge Class

The charge class represents a particular charge of an atom (Fig. 4.24). Functionality is inherited from the container class, even though it is not meant to contain any components. However, it uses the methods of the parent class for adding itself to and deleting itself from the superior atom object. The most important attribute is the charge q, for which accessor methods are automatically generated by Class::MethodMaker.

The Coordinate Set Class

The coordinate set class represents a particular coordinate set in a conformer (Fig. 4.25). It contains objects of the rotamer form and atom class. Most of its functionality is inherited from the container class. The methods add atom and add form add components which are objects of the atom and rotamer form class, respectively. The method delete removes atom components and also takes care that the coordinate components of the rotamer form objects are removed. Usually, all permutations between charge forms and rotamer forms are generated. However, Perl Molecule also allows the definition of associations between charge sets and coordinate sets, i.e., a certain rotamer form only occurs with a particular charge form. Such cases can be described by fst-files in Perl Molecule scripts. The coordinate set class has the attribute associated charge set with appropriate accessor functions to refer to the associated charge set.
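The difference between generating all pairings of charge forms and rotamer forms and restricting them by an association can be sketched in Python as follows. The form labels and the representation of an association as a set of allowed pairs are hypothetical; the fst-file format itself is not shown in this section and is not modeled here.

```python
from itertools import product


def site_instances(charge_forms, rotamer_forms, allowed=None):
    """Pair charge forms with rotamer forms, optionally restricted to allowed pairs."""
    pairs = product(charge_forms, rotamer_forms)
    if allowed is None:
        return list(pairs)  # default: every charge form with every rotamer form
    return [pair for pair in pairs if pair in allowed]


charge_forms = ["protonated", "deprotonated"]
rotamer_forms = ["rotamer1", "rotamer2"]

all_instances = site_instances(charge_forms, rotamer_forms)

# Association: each charge form occurs with exactly one rotamer form
association = {("protonated", "rotamer1"), ("deprotonated", "rotamer2")}
associated_instances = site_instances(charge_forms, rotamer_forms, association)
```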

Figure 4.26. The charge set and charge form classes are derived from the container class. Classes which are imported are linked by an arrow originating at the class name.

The Rotamer Form Class

The rotamer form class represents a particular rotamer form in a coordinate set (Fig. 4.25). It contains objects of the coordinate class. Most of its functionality is inherited from the container class. The attribute energy contains the rotamer energy of the form. The method add coordinate adds objects of the coordinate class as components.

The Charge Set Class

The charge set class represents a particular charge set in a conformer (Fig. 4.26). It contains objects of the charge form and atom class. Most of its functionality is inherited from the container class. Like the coordinate set class, it has the add atom and add form methods to add its components. It has the attribute associated coord set and accessor functions to refer to the associated coordinate set.

The Charge Form Class

The charge form class represents a particular charge form in a charge set (Fig. 4.26). It
