--- trunk/tengDissertation/Appendix.tex 2006/06/07 02:24:27 2811 +++ trunk/tengDissertation/Appendix.tex 2006/06/22 22:19:02 2880 @@ -1,20 +1,76 @@ \appendix -\chapter{\label{chapt:appendix}APPENDIX} +\chapter{\label{chapt:oopse}Object-Oriented Parallel Simulation Engine} -Designing object-oriented software is hard, and designing reusable -object-oriented scientific software is even harder. Absence of -applying modern software development practices is the bottleneck of -Scientific Computing community\cite{Wilson}. For instance, in the -last 20 years , there are quite a few MD packages that were -developed to solve common MD problems and perform robust simulations -. However, many of the codes are legacy programs that are either -poorly organized or extremely complex. Usually, these packages were -contributed by scientists without official computer science -training. The development of most MD applications are lack of strong -coordination to enforce design and programming guidelines. Moreover, -most MD programs also suffer from missing design and implement -documents which is crucial to the maintenance and extensibility. +The absence of modern software development practices is a +bottleneck for the scientific computing community\cite{Wilson2006}. In +the last 20 years, quite a few MD +packages\cite{Brooks1983, Vincent1995, Kale1999} have been developed +to solve common MD problems and perform robust simulations. +Unfortunately, most of them are commercial programs that are either +poorly written or extremely complicated. Consequently, researchers +cannot easily reuse or extend those packages to do cutting-edge +research effectively. While studying structural and +dynamic processes in condensed phase systems like biological +membranes and nanoparticles, we developed an open source +Object-Oriented Parallel Simulation Engine ({\sc OOPSE}). 
This new +molecular dynamics package has some unique features: +\begin{enumerate} + \item {\sc OOPSE} performs Molecular Dynamics (MD) simulations on non-standard +atom types (transition metals, point dipoles, sticky potentials, +Gay-Berne ellipsoids, or other ``lumpy'' atoms with orientational +degrees of freedom), as well as rigid bodies. + \item {\sc OOPSE} uses a force-based decomposition algorithm with MPI to obtain +efficient parallelism on inexpensive Beowulf clusters. + \item {\sc OOPSE} integrates the equations of motion using advanced methods for +orientational dynamics in NVE, NVT, NPT, NPAT, and NP$\gamma$T +ensembles. + \item {\sc OOPSE} can carry out simulations on metallic systems using the +Embedded Atom Method (EAM) as well as the Sutton-Chen potential. + \item {\sc OOPSE} can perform simulations on Gay-Berne liquid crystals. + \item {\sc OOPSE} can simulate systems containing the extremely efficient +extended-Soft Sticky Dipole (SSD/E) model for water. +\end{enumerate} +\section{\label{appendixSection:architecture }Architecture} + +Written mainly in \texttt{C/C++} and \texttt{Fortran90}, {\sc OOPSE} +uses the C++ Standard Template Library (STL) and Fortran modules as its +foundation. As an extension of the STL and Fortran90 modules, +{\sc Base Classes} provide generic implementations of mathematical +objects (e.g., matrices, vectors, polynomials, random number +generators) and advanced data structures and algorithms (e.g., tuple, +bitset, generic data, string manipulation). The molecular data +structures for the representation of atoms, bonds, bends, torsions, +rigid bodies, molecules, \textit{etc.} are contained in the {\sc +Kernel}, which is implemented with {\sc Base Classes} and is +carefully designed to provide maximum extensibility and flexibility. +The functionality required for applications is provided by the third +layer, which contains the Input/Output, Molecular Mechanics and Structure +modules. 
The Input/Output module not only implements general methods for +file handling, but also defines a generic force field interface. +Another important component of the Input/Output module is the meta-data +file parser, which has been rewritten using ANother Tool for Language +Recognition (ANTLR)\cite{Parr1995, Schaps1999} syntax. The Molecular +Mechanics module consists of energy minimization and a wide +variety of integration methods (see Chap.~\ref{chapt:methodology}). +The Structure module contains a flexible and powerful selection +library whose syntax is described in +Sec.~\ref{appendixSection:syntax}. The top layer is made of the main +program of the package, \texttt{oopse}, and its corresponding parallel +version \texttt{oopse\_MPI}, as well as other useful utilities, such +as \texttt{StaticProps} (see Sec.~\ref{appendixSection:StaticProps}), +\texttt{DynamicProps} (see Sec.~\ref{appendixSection:DynamicProps}), +\texttt{Dump2XYZ} (see Sec.~\ref{appendixSection:Dump2XYZ}) and +\texttt{Hydro} (see Sec.~\ref{appendixSection:hydrodynamics}). + + +\begin{figure} +\centering +\includegraphics[width=\linewidth]{architecture.eps} +\caption[The architecture of {\sc OOPSE}] {Overview of the structure +of {\sc OOPSE}} \label{appendixFig:architecture} +\end{figure} + \section{\label{appendixSection:desginPattern}Design Pattern} Design patterns are optimal solutions to commonly-occurring problems @@ -28,70 +84,253 @@ solutions succinctly. reusable. They provide a ready-made solution that can be adapted to different problems as necessary. Patterns are expressive: they provide a common vocabulary of solutions that can express large -solutions succinctly. +solutions succinctly. As one of the latest advanced techniques +to emerge from the object-oriented community, design patterns have been applied +in some modern scientific software applications, such as +JMol, {\sc OOPSE}\cite{Meineke2005} and +PROTOMOL\cite{Matthey2004}. 
The following sections enumerate some of the patterns +used in {\sc OOPSE}. -Patterns are usually described using a format that includes the -following information: -\begin{enumerate} - \item The \emph{name} that is commonly used for the pattern. Good pattern names form a vocabulary for - discussing conceptual abstractions. a pattern may have more than one commonly used or recognizable name - in the literature. In this case it is common practice to document these nicknames or synonyms under - the heading of \emph{Aliases} or \emph{Also Known As}. - \item The \emph{motivation} or \emph{context} that this pattern applies - to. Sometimes, it will include some prerequisites that should be satisfied before deciding to use a pattern - \item The \emph{solution} to the problem that the pattern - addresses. It describes how to construct the necessary work products. The description may include - pictures, diagrams and prose which identify the pattern's structure, its participants, and their - collaborations, to show how the problem is solved. - \item The \emph{consequences} of using the given solution to solve a - problem, both positive and negative. -\end{enumerate} +\subsection{\label{appendixSection:singleton}Singleton} -As one of the latest advanced techniques emerged from -object-oriented community, design patterns were applied in some of -the modern scientific software applications, such as JMol, OOPSE -\cite{Meineke05} and PROTOMOL \cite{Matthey05} \textit{etc}. +The Singleton pattern not only provides a mechanism to restrict +instantiation of a class to one object, but also provides a global +point of access to the object. Currently implemented as a global +variable, the logging utility which reports error and warning +messages to the console in {\sc OOPSE} is a good candidate for +applying the Singleton pattern to avoid global namespace +pollution. 
Although the Singleton pattern can be implemented in +various ways to account for different aspects of the software +design, such as lifespan control, we only use the +static data approach in {\sc OOPSE}. The declaration and +implementation of the IntegratorFactory class are given in +Scheme.~\ref{appendixScheme:singletonDeclaration} and +Scheme.~\ref{appendixScheme:singletonImplementation}, respectively. +Since the constructor is declared as protected, a client cannot +instantiate IntegratorFactory directly. Moreover, since the member +function getInstance serves as the only point of access to +IntegratorFactory, this approach fulfills the basic requirement, a +single instance. Another consequence of this approach is that no +explicit cleanup is needed: the instance persists until program +termination, although its destructor is not invoked automatically. +\begin{lstlisting}[float,caption={[A classic implementation of the Singleton design pattern (I)] The declaration of a simple Singleton pattern.},label={appendixScheme:singletonDeclaration}] -\subsection{\label{appendixSection:singleton}Singleton} -The Singleton pattern ensures that only one instance of a class is -created. All objects that use an instance of that class use the same -instance. 
+class IntegratorFactory { +public: + static IntegratorFactory* + getInstance(); +protected: + IntegratorFactory(); +private: + static IntegratorFactory* instance_; +}; +\end{lstlisting} + +\begin{lstlisting}[float,caption={[A classic implementation of the Singleton design pattern (II)] The implementation of a simple Singleton pattern.},label={appendixScheme:singletonImplementation}] + +IntegratorFactory* IntegratorFactory::instance_ = NULL; + +IntegratorFactory* IntegratorFactory::getInstance() { + if (instance_ == NULL){ + instance_ = new IntegratorFactory; + } + return instance_; +} + +\end{lstlisting} + + \subsection{\label{appendixSection:factoryMethod}Factory Method} -The Factory Method pattern is a creational pattern which deals with -the problem of creating objects without specifying the exact class -of object that will be created. Factory Method solves this problem -by defining a separate method for creating the objects, which -subclasses can then override to specify the derived type of product -that will be created. +Categorized as a creational pattern, the Factory Method pattern deals +with the problem of creating objects without specifying the exact +class of object that will be created. Factory Method is typically +implemented by delegating the creation operation to the subclasses. +{\sc OOPSE} uses the Parameterized Factory variant, in which the factory method (the +createIntegrator member function) creates products based on an +identifier (see Scheme.~\ref{appendixScheme:factoryDeclaration}). If +the identifier has already been registered, the factory method will +invoke the corresponding creator (see +Scheme.~\ref{appendixScheme:integratorCreator}), which utilizes +C++ templates to avoid excess subclassing. -\subsection{\label{appendixSection:visitorPattern}Visitor} -The purpose of the Visitor Pattern is to encapsulate an operation -that you want to perform on the elements of a data structure. 
In -this way, you can change the operation being performed on a -structure without the need of changing the classes of the elements -that you are operating on. +\begin{lstlisting}[float,caption={[The implementation of Parameterized Factory pattern (I)]Source code of IntegratorFactory class.},label={appendixScheme:factoryDeclaration}] +class IntegratorFactory { +public: + typedef std::map<string, IntegratorCreator*> CreatorMapType; -\subsection{\label{appendixSection:templateMethod}Template Method} + bool registerIntegrator(IntegratorCreator* creator) { + return creatorMap_.insert(CreatorMapType::value_type(creator->getIdent(), creator)).second; + } + Integrator* createIntegrator(const string& id, SimInfo* info) { + Integrator* result = NULL; + CreatorMapType::iterator i = creatorMap_.find(id); + if (i != creatorMap_.end()) { + result = (i->second)->create(info); + } + return result; + } + +private: + CreatorMapType creatorMap_; +}; +\end{lstlisting} + +\begin{lstlisting}[float,caption={[The implementation of Parameterized Factory pattern (II)]Source code of creator classes.},label={appendixScheme:integratorCreator}] + +class IntegratorCreator { +public: + IntegratorCreator(const string& ident) : ident_(ident) {} + virtual ~IntegratorCreator() {} + + const string& getIdent() const { return ident_; } + + virtual Integrator* create(SimInfo* info) const = 0; + +private: + string ident_; +}; + +template<class ConcreteIntegrator> +class IntegratorBuilder : public IntegratorCreator { +public: + IntegratorBuilder(const string& ident) + : IntegratorCreator(ident) {} + virtual Integrator* create(SimInfo* info) const { + return new ConcreteIntegrator(info); + } +}; +\end{lstlisting} + +\subsection{\label{appendixSection:visitorPattern}Visitor} + +The Visitor pattern is designed to decouple a data structure from the +algorithms that operate on it by collecting related operations from the +element classes into separate visitor classes, which is equivalent to +adding virtual functions to a set of classes without modifying +their interfaces. 
Fig.~\ref{appendixFig:visitorUML} shows the +structure of the Visitor pattern, which is used extensively in {\tt +Dump2XYZ}. In order to convert an OOPSE dump file, a series of +distinct operations are performed on different StuntDoubles (see the +class hierarchy in Fig.~\ref{oopseFig:hierarchy} and the declaration +in Scheme.~\ref{appendixScheme:element}). Since the hierarchy +remains stable, it is easy to define a visit operation (see +Scheme.~\ref{appendixScheme:visitor}) for each class of StuntDouble. +Note that, using the Composite pattern\cite{Gamma1994}, CompositeVisitor +manages a priority visitor list and handles the execution of every +visitor in the priority list on different StuntDoubles. + +\begin{figure} +\centering +\includegraphics[width=\linewidth]{visitor.eps} +\caption[The UML class diagram of Visitor pattern] {The UML class +diagram of Visitor pattern.} \label{appendixFig:visitorUML} +\end{figure} + +\begin{figure} +\centering +\includegraphics[width=\linewidth]{hierarchy.eps} +\caption[Class hierarchy for objects in {\sc OOPSE}]{ A diagram of +the class hierarchy. 
} \label{oopseFig:hierarchy} +\end{figure} + +\begin{lstlisting}[float,caption={[The implementation of Visitor pattern (I)]Source code of the element classes.},label={appendixScheme:element}] + +class StuntDouble { public: + virtual void accept(BaseVisitor* v) = 0; +}; + +class Atom: public StuntDouble { public: + virtual void accept(BaseVisitor* v) { + v->visit(this); + } +}; + +class DirectionalAtom: public Atom { public: + virtual void accept(BaseVisitor* v) { + v->visit(this); + } +}; + +class RigidBody: public StuntDouble { public: + virtual void accept(BaseVisitor* v) { + v->visit(this); + } +}; + +\end{lstlisting} + +\begin{lstlisting}[float,caption={[The implementation of Visitor pattern (II)]Source code of the visitor classes.},label={appendixScheme:visitor}] + +class BaseVisitor{ +public: + virtual void visit(Atom* atom); + virtual void visit(DirectionalAtom* datom); + virtual void visit(RigidBody* rb); +}; + +class BaseAtomVisitor:public BaseVisitor{ public: + virtual void visit(Atom* atom); + virtual void visit(DirectionalAtom* datom); + virtual void visit(RigidBody* rb); +}; + +class CompositeVisitor: public BaseVisitor { +public: + + typedef list<pair<BaseVisitor*, int> > VistorListType; + typedef VistorListType::iterator VisitorListIterator; + virtual void visit(Atom* atom) { + VisitorListIterator i; + for(i = visitorList.begin();i != visitorList.end();++i) { + atom->accept(i->first); + } + } + + virtual void visit(DirectionalAtom* datom) { + VisitorListIterator i; + for(i = visitorList.begin();i != visitorList.end();++i) { + datom->accept(i->first); + } + } + + virtual void visit(RigidBody* rb) { + VisitorListIterator i; + std::vector<Atom*> myAtoms; + std::vector<Atom*>::iterator ai; + myAtoms = rb->getAtoms(); + for(i = visitorList.begin();i != visitorList.end();++i) { + rb->accept(i->first); + for(ai = myAtoms.begin(); ai != myAtoms.end(); ++ai){ + (*ai)->accept(i->first); + } + } + } + + void addVisitor(BaseVisitor* v, int priority); + + 
protected: + VistorListType visitorList; +}; +\end{lstlisting} + \section{\label{appendixSection:concepts}Concepts} OOPSE manipulates both traditional atoms as well as some objects that {\it behave like atoms}. These objects can be rigid collections of atoms or atoms which have orientational degrees of -freedom. Here is a diagram of the class heirarchy: - -%\begin{figure} -%\centering -%\includegraphics[width=3in]{heirarchy.eps} -%\caption[Class heirarchy for StuntDoubles in {\sc oopse}-3.0]{ \\ -%The class heirarchy of StuntDoubles in {\sc oopse}-3.0. The -%selection syntax allows the user to select any of the objects that -%are descended from a StuntDouble.} \label{oopseFig:heirarchy} -%\end{figure} - +freedom. A diagram of the class hierarchy is illustrated in +Fig.~\ref{oopseFig:hierarchy}. Every Molecule, Atom and +DirectionalAtom in {\sc OOPSE} have their own names which are +specified in the {\tt .md} file. In contrast, RigidBodies are +denoted by their membership and index inside a particular molecule: +[MoleculeName]\_RB\_[index] (the contents inside the brackets depend +on the specifics of the simulation). The names of rigid bodies are +generated automatically. For example, the name of the first rigid +body in a DMPC molecule is DMPC\_RB\_0. \begin{itemize} \item A {\bf StuntDouble} is {\it any} object that can be manipulated by the integrators and minimizers. @@ -101,21 +340,15 @@ Every Molecule, Atom and DirectionalAtom in {\sc oopse DirectionalAtom}s which behaves as a single unit. \end{itemize} -Every Molecule, Atom and DirectionalAtom in {\sc oopse} have their -own names which are specified in the {\tt .md} file. In contrast, -RigidBodies are denoted by their membership and index inside a -particular molecule: [MoleculeName]\_RB\_[index] (the contents -inside the brackets depend on the specifics of the simulation). The -names of rigid bodies are generated automatically. For example, the -name of the first rigid body in a DMPC molecule is DMPC\_RB\_0. 
- \section{\label{appendixSection:syntax}Syntax of the Select Command} -The most general form of the select command is: {\tt select {\it -expression}} +{\sc OOPSE} provides a powerful selection utility to select +StuntDoubles. The most general form of the select command is: +{\tt select {\it expression}}. + This expression represents an arbitrary set of StuntDoubles (Atoms -or RigidBodies) in {\sc oopse}. Expressions are composed of either +or RigidBodies) in {\sc OOPSE}. Expressions are composed of either name expressions, index expressions, predefined sets, user-defined expressions, comparison operators, within expressions, or logical combinations of the above expression types. Expressions can be @@ -202,10 +435,9 @@ expression}} Users can define arbitrary terms to represent groups of StuntDoubles, and then use the define terms in select commands. The general form for the define command is: {\bf define {\it term -expression}} +expression}}. Once defined, the user can specify such terms in +boolean expressions -Once defined, the user can specify such terms in boolean expressions - {\tt define SSDWATER SSD or SSD1 or SSDRF} {\tt select SSDWATER} @@ -250,10 +482,27 @@ and other atoms of type $B$, $g_{AB}(r)$. StaticProps some or all of the configurations that are contained within a dump file. The most common example of a static property that can be computed is the pair distribution function between atoms of type $A$ -and other atoms of type $B$, $g_{AB}(r)$. StaticProps can also be -used to compute the density distributions of other molecules in a -reference frame {\it fixed to the body-fixed reference frame} of a -selected atom or rigid body. +and other atoms of type $B$, $g_{AB}(r)$. {\tt StaticProps} can +also be used to compute the density distributions of other molecules +in a reference frame {\it fixed to the body-fixed reference frame} +of a selected atom or rigid body. 
Because the +StuntDoubles chosen by the two selections may overlap, {\tt +StaticProps} performs the calculation in three stages, which are +illustrated in Fig.~\ref{oopseFig:staticPropsProcess}. + +\begin{figure} +\centering +\includegraphics[width=\linewidth]{staticPropsProcess.eps} +\caption[A representation of the three-stage correlations in +\texttt{StaticProps}]{This diagram illustrates the three-stage +processing used by \texttt{StaticProps}. $S_1$ and $S_2$ are the +numbers of selected StuntDoubles from {\tt -{}-sele1} and {\tt +-{}-sele2} respectively, while $C$ is the number of StuntDoubles +appearing in both sets. The first stage ($S_1-C$ and $S_2$) and +second stage ($S_1$ and $S_2-C$) are completely non-overlapping. In +contrast, the third stage ($C$ and $C$) is completely +overlapping.} \label{oopseFig:staticPropsProcess} +\end{figure} There are five separate radial distribution functions available in OOPSE. Since every radial distribution function involves the @@ -298,7 +547,8 @@ distribution functions are most easily seen in the fig \end{description} The vectors (and angles) associated with these angular pair -distribution functions are most easily seen in the figure below: +distribution functions are most easily seen in +Fig.~\ref{oopseFig:gofr}. \begin{figure} \centering @@ -370,6 +620,33 @@ The options available for DynamicProps are as follows: different vectors). The ability to use two selection scripts to select different types of atoms is already present in the code. +For large simulations, the trajectory files can sometimes reach +sizes in excess of several gigabytes. In order to prevent a +situation where the program runs out of memory due to large +trajectories, \texttt{DynamicProps} first estimates the amount of free +memory and then determines the number of frames in each block, +which allows the operating system to load two blocks of data +simultaneously without swapping. 
Upon reading two blocks of the +trajectory, \texttt{DynamicProps} will calculate the time +correlation within the first block and the cross correlations +between the two blocks. The second block is then freed and +advanced to the next block, and the process is repeated until the end of the +trajectory. Once the end is reached, the first block is freed and +advanced, and the procedure continues until all frame pairs have been correlated in time. +This process is illustrated in +Fig.~\ref{oopseFig:dynamicPropsProcess}. + +\begin{figure} +\centering +\includegraphics[width=\linewidth]{dynamicPropsProcess.eps} +\caption[A representation of the block correlations in +\texttt{DynamicProps}]{This diagram illustrates block correlation +processing in \texttt{DynamicProps}. The shaded region represents +the self correlation of the block; the open blocks are read one +at a time and the cross correlations between blocks are calculated.} +\label{oopseFig:dynamicPropsProcess} +\end{figure} + The options available for DynamicProps are as follows: \begin{longtable}[c]{|EFG|} \caption{DynamicProps Command-line Options} @@ -396,9 +673,10 @@ Dump2XYZ can transform an OOPSE dump file into a xyz f \subsection{\label{appendixSection:Dump2XYZ}Dump2XYZ} -Dump2XYZ can transform an OOPSE dump file into a xyz file which can -be opened by other molecular dynamics viewers such as Jmol and VMD. -The options available for Dump2XYZ are as follows: +{\tt Dump2XYZ} can transform an OOPSE dump file into an xyz file +which can be opened by molecular dynamics viewers such as Jmol +and VMD\cite{Humphrey1996}. The options available for Dump2XYZ are +as follows: \begin{longtable}[c]{|EFG|} @@ -428,8 +706,13 @@ converted. \\ & {\tt -{}-refsele} & In order to rotate the system, {\tt -{}-originsele} and {\tt -{}-refsele} must be given to define the new coordinate set. A StuntDouble which contains a dipole (the direction of the dipole is always (0, 0, 1) in body frame) is specified by {\tt -{}-originsele}. 
The new x-z plane is defined by the direction of the dipole and the StuntDouble specified by {\tt -{}-refsele}. \end{longtable} -\subsection{\label{appendixSection:hydrodynamics}Hydrodynamics} +\subsection{\label{appendixSection:hydrodynamics}Hydro} +{\tt Hydro} can calculate resistance and diffusion tensors at the +center of resistance. Both tensors at the center of diffusion can +also be reported by the program, as well as the coordinates of +the beads which are used to approximate arbitrary shapes. The +options available for Hydro are as follows: \begin{longtable}[c]{|EFG|} \caption{Hydrodynamics Command-line Options} \\ \hline @@ -442,5 +725,5 @@ converted. \\ -i & {\tt -{}-input} & input dump file \\ -o & {\tt -{}-output} & output file prefix (default=`hydro') \\ -b & {\tt -{}-beads} & generate the beads only, hydrodynamics calculation will not be performed (default=off)\\ - & {\tt -{}-model} & hydrodynamics model (support ``AnalyticalModel'', ``RoughShell'' and ``BeadModel'') \\ + & {\tt -{}-model} & hydrodynamics model (supports ``AnalyticalModel'', ``RoughShell'' and ``BeadModel'') \\ \end{longtable}
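+
+As a point of reference, the resistance and diffusion tensors reported
+by {\tt Hydro} are related by the generalized Einstein relation. The
+notation below ($\Xi$ for the $6 \times 6$ resistance tensor, $D$ for
+the $6 \times 6$ diffusion tensor, and the superscripts $tt$, $tr$,
+$rt$, $rr$ for the translational, coupling and rotational blocks) is
+assumed here for illustration and is not taken from the program output:
+\begin{equation}
+D = \left( \begin{array}{cc} D^{tt} & D^{tr} \\ D^{rt} & D^{rr} \end{array} \right)
+  = k_B T \, \Xi^{-1}, \qquad
+\Xi = \left( \begin{array}{cc} \Xi^{tt} & \Xi^{tr} \\ \Xi^{rt} & \Xi^{rr} \end{array} \right),
+\end{equation}
+where $k_B$ is the Boltzmann constant and $T$ is the temperature. Both
+tensors must be expressed at the same reference point (for example,
+the center of resistance) for the relation to hold block by block.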