--- trunk/tengDissertation/Appendix.tex 2006/06/23 20:21:54 2881
+++ trunk/tengDissertation/Appendix.tex 2006/06/23 21:33:52 2882
@@ -1,17 +1,17 @@
 \appendix
 \chapter{\label{chapt:oopse}Object-Oriented Parallel Simulation Engine}
-Absence of applying modern software development practices is the
-bottleneck of Scientific Computing community\cite{Wilson2006}. In
-the last 20 years , there are quite a few MD
-packages\cite{Brooks1983, Vincent1995, Kale1999} that were developed
-to solve common MD problems and perform robust simulations .
-Unfortunately, most of them are commercial programs that are either
-poorly written or extremely complicate. Consequently, it prevents
-the researchers to reuse or extend those packages to do cutting-edge
-research effectively. Along the way of studying structural and
-dynamic processes in condensed phase systems like biological
-membranes and nanoparticles, we developed an open source
+The absence of modern software development practices has been a
+bottleneck limiting progress in the Scientific Computing
+community\cite{Wilson2006}. In the last 20 years , a large number of
+few MD packages\cite{Brooks1983, Vincent1995, Kale1999} were
+developed to solve common MD problems and perform robust simulations
+. Most of these are commercial programs that are either poorly
+written or extremely complicated to use correctly. This situation
+prevents researchers from reusing or extending those packages to do
+cutting-edge research effectively. In the process of studying
+structural and dynamic processes in condensed phase systems like
+biological membranes and nanoparticles, we developed an open source
 Object-Oriented Parallel Simulation Engine ({\sc OOPSE}).
 This new molecular dynamics package has some unique features
 \begin{enumerate}
@@ -33,36 +33,36 @@ Mainly written by \texttt{C/C++} and \texttt{Fortran90
 \section{\label{appendixSection:architecture }Architecture}
-Mainly written by \texttt{C/C++} and \texttt{Fortran90}, {\sc OOPSE}
-uses C++ Standard Template Library (STL) and fortran modules as the
-foundation. As an extensive set of the STL and Fortran90 modules,
-{\sc Base Classes} provide generic implementations of mathematical
-objects (e.g., matrices, vectors, polynomials, random number
-generators) and advanced data structures and algorithms(e.g., tuple,
-bitset, generic data, string manipulation). The molecular data
-structures for the representation of atoms, bonds, bends, torsions,
-rigid bodies and molecules \textit{etc} are contained in the {\sc
-Kernel} which is implemented with {\sc Base Classes} and are
-carefully designed to provide maximum extensibility and flexibility.
-The functionality required for applications is provide by the third
-layer which contains Input/Output, Molecular Mechanics and Structure
-modules. Input/Output module not only implements general methods for
-file handling, but also defines a generic force field interface.
-Another important component of Input/Output module is the meta-data
-file parser, which is rewritten using ANother Tool for Language
-Recognition(ANTLR)\cite{Parr1995, Schaps1999} syntax. The Molecular
-Mechanics module consists of energy minimization and a wide
-varieties of integration methods(see Chap.~\ref{chapt:methodology}).
-The structure module contains a flexible and powerful selection
-library which syntax is elaborated in
-Sec.~\ref{appendixSection:syntax}. The top layer is made of the main
-program of the package, \texttt{oopse} and it corresponding parallel
-version \texttt{oopse\_MPI}, as well as other useful utilities, such
-as \texttt{StatProps} (see Sec.~\ref{appendixSection:StaticProps}),
-\texttt{DynamicProps} (see Sec.~\ref{appendixSection:DynamicProps}),
-\texttt{Dump2XYZ} (see Sec.~\ref{appendixSection:Dump2XYZ}),
-\texttt{Hydro} (see Sec.~\ref{appendixSection:hydrodynamics})
-\textit{etc}.
+Mainly written by C++ and Fortran90, {\sc OOPSE} uses C++ Standard
+Template Library (STL) and fortran modules as a foundation. As an
+extensive set of the STL and Fortran90 modules, {\sc Base Classes}
+provide generic implementations of mathematical objects (e.g.,
+matrices, vectors, polynomials, random number generators) and
+advanced data structures and algorithms(e.g., tuple, bitset, generic
+data and string manipulation). The molecular data structures for the
+representation of atoms, bonds, bends, torsions, rigid bodies and
+molecules \textit{etc} are contained in the {\sc Kernel} which is
+implemented with {\sc Base Classes} and are carefully designed to
+provide maximum extensibility and flexibility. The functionality
+required for applications is provided by the third layer which
+contains Input/Output, Molecular Mechanics and Structure modules.
+The Input/Output module not only implements general methods for file
+handling, but also defines a generic force field interface. Another
+important component of Input/Output module is the parser for
+meta-data files, which has been implemented using the ANother Tool
+for Language Recognition(ANTLR)\cite{Parr1995, Schaps1999} syntax.
+The Molecular Mechanics module consists of energy minimization and a
+wide varieties of integration methods(see
+Chap.~\ref{chapt:methodology}). The structure module contains a
+flexible and powerful selection library which syntax is elaborated
+in Sec.~\ref{appendixSection:syntax}. The top layer is made of the
+main program of the package, \texttt{oopse} and it corresponding
+parallel version \texttt{oopse\_MPI}, as well as other useful
+utilities, such as \texttt{StatProps} (see
+Sec.~\ref{appendixSection:StaticProps}), \texttt{DynamicProps} (see
+Sec.~\ref{appendixSection:DynamicProps}), \texttt{Dump2XYZ} (see
+Sec.~\ref{appendixSection:Dump2XYZ}), \texttt{Hydro} (see
+Sec.~\ref{appendixSection:hydrodynamics}) \textit{etc}.
 \begin{figure}
 \centering
@@ -71,7 +71,7 @@ of {\sc OOPSE}} \label{appendixFig:architecture}
 of {\sc OOPSE}} \label{appendixFig:architecture}
 \end{figure}
-\section{\label{appendixSection:desginPattern}Design Pattern}
+\section{\label{appendixSection:desginPattern}Design Patterns}
 Design patterns are optimal solutions to commonly-occurring problems
 in software design. Although originated as an architectural concept
@@ -82,44 +82,68 @@ different problems as necessary. Pattern are expressiv
 the experience, knowledge and insights of developers who have
 successfully used these patterns in their own work. Patterns are
 reusable. They provide a ready-made solution that can be adapted to
-different problems as necessary. Pattern are expressive. they
-provide a common vocabulary of solutions that can express large
-solutions succinctly. As one of the latest advanced techniques
-emerged from object-oriented community, design patterns were applied
-in some of the modern scientific software applications, such as
-JMol, {\sc OOPSE}\cite{Meineke2005} and PROTOMOL\cite{Matthey2004}
-\textit{etc}. The following sections enumerates some of the patterns
-used in {\sc OOPSE}.
+different problems as necessary. As one of the latest advanced
+techniques to emerge from object-oriented community, design patterns
+were applied in some of the modern scientific software applications,
+such as JMol, {\sc OOPSE}\cite{Meineke2005} and
+PROTOMOL\cite{Matthey2004} \textit{etc}. The following sections
+enumerates some of the patterns used in {\sc OOPSE}.
-\subsection{\label{appendixSection:singleton}Singleton}
+\subsection{\label{appendixSection:singleton}Singletons}
 The Singleton pattern not only provides a mechanism to restrict
 instantiation of a class to one object, but also provides a global
-point of access to the object. Currently implemented as a global
-variable, the logging utility which reports error and warning
-messages to the console in {\sc OOPSE} is a good candidate for
-applying the Singleton pattern to avoid the global namespace
-pollution. Although the singleton pattern can be implemented in
-various ways to account for different aspects of the software
-designs, such as lifespan control \textit{etc}, we only use the
-static data approach in {\sc OOPSE}. The declaration and
+point of access to the object. Although the singleton pattern can be
+implemented in various ways to account for different aspects of the
+software designs, such as lifespan control \textit{etc}, we only use
+the static data approach in {\sc OOPSE}. The declaration and
 implementation of IntegratorFactory class are given by declared in
 List.~\ref{appendixScheme:singletonDeclaration} and
 Scheme.~\ref{appendixScheme:singletonImplementation} respectively.
-Since constructor is declared as protected, a client can not
+Since the constructor is declared as protected, a client can not
 instantiate IntegratorFactory directly. Moreover, since the member
 function getInstance serves as the only entry of access to
 IntegratorFactory, this approach fulfills the basic requirement, a
 single instance. Another consequence of this approach is the
 automatic destruction since static data are destroyed upon program
 termination.
+
+\subsection{\label{appendixSection:factoryMethod}Factory Methods}
+
+The Factory Method pattern is a creational pattern and deals with
+the problem of creating objects without specifying the exact class
+of object that will be created. Factory method is typically
+implemented by delegating the creation operation to the subclasses.
+Parameterized Factory pattern where factory method (
+createIntegrator member function) creates products based on the
+identifier (see Scheme.~\ref{appendixScheme:factoryDeclaration}). If
+the identifier has been already registered, the factory method will
+invoke the corresponding creator (see
+Scheme.~\ref{appendixScheme:integratorCreator}) which utilizes the
+modern C++ template technique to avoid excess subclassing.
+
+\subsection{\label{appendixSection:visitorPattern}Visitor}
+
+The visitor pattern is designed to decouple the data structure and
+algorithms used upon them by collecting related operation from
+element classes into other visitor classes, which is equivalent to
+adding virtual functions into a set of classes without modifying
+their interfaces. Fig.~\ref{appendixFig:visitorUML} demonstrates the
+structure of a Visitor pattern which is used extensively in {\tt
+Dump2XYZ}. In order to convert an OOPSE dump file, a series of
+distinct operations are performed on different StuntDoubles (See the
+class hierarchy in Fig.~\ref{oopseFig:hierarchy} and the declaration
+in Scheme.~\ref{appendixScheme:element}). Since the hierarchies
+remain stable, it is easy to define a visit operation (see
+Scheme.~\ref{appendixScheme:visitor}) for each class of StuntDouble.
+Note that using Composite pattern\cite{Gamma1994}, CompositeVisitor
+manages a priority visitor list and handles the execution of every
+visitor in the priority list on different StuntDoubles.
+
 \begin{lstlisting}[float,caption={[A classic Singleton design pattern implementation(I)] The declaration of of simple Singleton pattern.},label={appendixScheme:singletonDeclaration}]
-class IntegratorFactory {
-public:
- static IntegratorFactory*
- getInstance();
-protected:
+class IntegratorFactory { public:
+ static IntegratorFactory* getInstance(); protected:
 IntegratorFactory();
 private:
 static IntegratorFactory* instance_;
@@ -140,25 +164,9 @@ IntegratorFactory* getInstance() {
 \end{lstlisting}
-
-\subsection{\label{appendixSection:factoryMethod}Factory Method}
-
-Categoried as a creational pattern, the Factory Method pattern deals
-with the problem of creating objects without specifying the exact
-class of object that will be created. Factory Method is typically
-implemented by delegating the creation operation to the subclasses.
-Parameterized Factory pattern where factory method (
-createIntegrator member function) creates products based on the
-identifier (see Scheme.~\ref{appendixScheme:factoryDeclaration}). If
-the identifier has been already registered, the factory method will
-invoke the corresponding creator (see
-Scheme.~\ref{appendixScheme:integratorCreator}) which utilizes the
-modern C++ template technique to avoid excess subclassing.
-
 \begin{lstlisting}[float,caption={[The implementation of Parameterized Factory pattern (I)]Source code of IntegratorFactory class.},label={appendixScheme:factoryDeclaration}]
-class IntegratorFactory {
-public:
+class IntegratorFactory { public:
 typedef std::map CreatorMapType;
 bool registerIntegrator(IntegratorCreator* creator) {
@@ -182,7 +190,7 @@ class IntegratorCreator { (public)
 \begin{lstlisting}[float,caption={[The implementation of Parameterized Factory pattern (III)]Source code of creator classes.},label={appendixScheme:integratorCreator}]
 class IntegratorCreator {
-public:
+ public:
 IntegratorCreator(const string& ident) : ident_(ident) {}
 const string& getIdent() const { return ident_; }
@@ -193,71 +201,43 @@ template
 string ident_;
 };
-template
-class IntegratorBuilder : public IntegratorCreator {
-public:
- IntegratorBuilder(const string& ident)
- : IntegratorCreator(ident) {}
- virtual Integrator* create(SimInfo* info) const {
- return new ConcreteIntegrator(info);
- }
+template class IntegratorBuilder : public
+IntegratorCreator {
+ public:
+ IntegratorBuilder(const string& ident)
+ : IntegratorCreator(ident) {}
+ virtual Integrator* create(SimInfo* info) const {
+ return new ConcreteIntegrator(info);
+ }
 };
 \end{lstlisting}
-\subsection{\label{appendixSection:visitorPattern}Visitor}
-
-The visitor pattern is designed to decouple the data structure and
-algorithms used upon them by collecting related operation from
-element classes into other visitor classes, which is equivalent to
-adding virtual functions into a set of classes without modifying
-their interfaces. Fig.~\ref{appendixFig:visitorUML} demonstrates the
-structure of Visitor pattern which is used extensively in {\tt
-Dump2XYZ}. In order to convert an OOPSE dump file, a series of
-distinct operations are performed on different StuntDoubles (See the
-class hierarchy in Fig.~\ref{oopseFig:hierarchy} and the declaration
-in Scheme.~\ref{appendixScheme:element}). Since the hierarchies
-remains stable, it is easy to define a visit operation (see
-Scheme.~\ref{appendixScheme:visitor}) for each class of StuntDouble.
-Note that using Composite pattern\cite{Gamma1994}, CompositVisitor
-manages a priority visitor list and handles the execution of every
-visitor in the priority list on different StuntDoubles.
-
-\begin{figure}
-\centering
-\includegraphics[width=\linewidth]{visitor.eps}
-\caption[The UML class diagram of Visitor patten] {The UML class
-diagram of Visitor patten.} \label{appendixFig:visitorUML}
-\end{figure}
-
-\begin{figure}
-\centering
-\includegraphics[width=\linewidth]{hierarchy.eps}
-\caption[Class hierarchy for ojects in {\sc OOPSE}]{ A diagram of
-the class hierarchy. } \label{oopseFig:hierarchy}
-\end{figure}
-
 \begin{lstlisting}[float,caption={[The implementation of Visitor pattern (II)]Source code of the element classes.},label={appendixScheme:element}]
-class StuntDouble { public:
- virtual void accept(BaseVisitor* v) = 0;
+class StuntDouble {
+ public:
+ virtual void accept(BaseVisitor* v) = 0;
 };
-class Atom: public StuntDouble { public:
- virtual void accept{BaseVisitor* v*} {
- v->visit(this);
- }
+class Atom: public StuntDouble {
+ public:
+ virtual void accept{BaseVisitor* v*} {
+ v->visit(this);
+ }
 };
-class DirectionalAtom: public Atom { public:
- virtual void accept{BaseVisitor* v*} {
- v->visit(this);
- }
+class DirectionalAtom: public Atom {
+ public:
+ virtual void accept{BaseVisitor* v*} {
+ v->visit(this);
+ }
 };
-class RigidBody: public StuntDouble { public:
- virtual void accept{BaseVisitor* v*} {
- v->visit(this);
- }
+class RigidBody: public StuntDouble {
+ public:
+ virtual void accept{BaseVisitor* v*} {
+ v->visit(this);
+ }
 };
 \end{lstlisting}
@@ -265,21 +245,21 @@ class BaseVisitor{ (public)
 \begin{lstlisting}[float,caption={[The implementation of Visitor pattern (I)]Source code of the visitor classes.},label={appendixScheme:visitor}]
 class BaseVisitor{
-public:
- virtual void visit(Atom* atom);
- virtual void visit(DirectionalAtom* datom);
- virtual void visit(RigidBody* rb);
+ public:
+ virtual void visit(Atom* atom);
+ virtual void visit(DirectionalAtom* datom);
+ virtual void visit(RigidBody* rb);
 };
-class BaseAtomVisitor:public BaseVisitor{ public:
- virtual void visit(Atom* atom);
- virtual void visit(DirectionalAtom* datom);
- virtual void visit(RigidBody* rb);
+class BaseAtomVisitor:public BaseVisitor{
+ public:
+ virtual void visit(Atom* atom);
+ virtual void visit(DirectionalAtom* datom);
+ virtual void visit(RigidBody* rb);
 };
 class CompositeVisitor: public BaseVisitor {
-public:
-
+ public:
 typedef list > VistorListType;
 typedef VistorListType::iterator VisitorListIterator;
 virtual void visit(Atom* atom) {
@@ -303,20 +283,35 @@ class CompositeVisitor: public BaseVisitor { (public)
 std::vector myAtoms;
 std::vector::iterator ai;
 myAtoms = rb->getAtoms();
- for(i = visitorScheme.begin();i != visitorScheme.end();++i) {{
+ for(i = visitorScheme.begin();i != visitorScheme.end();++i) {
 rb->accept(*i);
 for(ai = myAtoms.begin(); ai != myAtoms.end(); ++ai){
 (*ai)->accept(*i);
+ }
 }
- }
 void addVisitor(BaseVisitor* v, int priority);
-
 protected:
 VistorListType visitorList;
 };
 \end{lstlisting}
+\begin{figure}
+\centering
+\includegraphics[width=\linewidth]{visitor.eps}
+\caption[The UML class diagram of Visitor patten] {The UML class
+diagram of Visitor patten.} \label{appendixFig:visitorUML}
+\end{figure}
+
+\begin{figure}
+\centering
+\includegraphics[width=\linewidth]{hierarchy.eps}
+\caption[Class hierarchy for ojects in {\sc OOPSE}]{ A diagram of
+the class hierarchy. Objects below others on the diagram inherit
+data structures and functions from their parent classes above them.}
+\label{oopseFig:hierarchy}
+\end{figure}
+
 \section{\label{appendixSection:concepts}Concepts}
 OOPSE manipulates both traditional atoms as well as some objects
@@ -325,7 +320,7 @@ specified in the {\tt .md} file. In contrast, RigidBod
 freedom. A diagram of the class hierarchy is illustrated in
 Fig.~\ref{oopseFig:hierarchy}. Every Molecule, Atom and
 DirectionalAtom in {\sc OOPSE} have their own names which are
-specified in the {\tt .md} file. In contrast, RigidBodies are
+specified in the meta data file. In contrast, RigidBodies are
 denoted by their membership and index inside a particular molecule:
 [MoleculeName]\_RB\_[index] (the contents inside the brackets depend
 on the specifics of the simulation). The names of rigid bodies are
@@ -496,12 +491,21 @@ numbers of selected stuntdobules from {\tt -{}-sele1}
 \caption[A representation of the three-stage correlations in
 \texttt{StaticProps}]{This diagram illustrates three-stage
 processing used by \texttt{StaticProps}. $S_1$ and $S_2$ are the
-numbers of selected stuntdobules from {\tt -{}-sele1} and {\tt
--{}-sele2} respectively, while $C$ is the number of stuntdobules
+numbers of selected StuntDobules from {\tt -{}-sele1} and {\tt
+-{}-sele2} respectively, while $C$ is the number of StuntDobules
 appearing at both sets. The first stage($S_1-C$ and $S_2$) and
 second stages ($S_1$ and $S_2-C$) are completely non-overlapping. On
 the contrary, the third stage($C$ and $C$) are completely
 overlapping} \label{oopseFig:staticPropsProcess}
+\end{figure}
+
+\begin{figure}
+\centering
+\includegraphics[width=3in]{definition.eps}
+\caption[Definitions of the angles between directional objects]{Any
+two directional objects (DirectionalAtoms and RigidBodies) have a
+set of two angles ($\theta$, and $\omega$) between the z-axes of
+their body-fixed frames.} \label{oopseFig:gofr}
 \end{figure}
 There are five seperate radial distribution functions availiable in
@@ -548,17 +552,8 @@ Fig.~\ref{oopseFig:gofr}
 The vectors (and angles) associated with these angular pair
 distribution functions are most easily seen in
-Fig.~\ref{oopseFig:gofr}
+Fig.~\ref{oopseFig:gofr}.
-\begin{figure}
-\centering
-\includegraphics[width=3in]{definition.eps}
-\caption[Definitions of the angles between directional objects]{ \\
-Any two directional objects (DirectionalAtoms and RigidBodies) have
-a set of two angles ($\theta$, and $\omega$) between the z-axes of
-their body-fixed frames.} \label{oopseFig:gofr}
-\end{figure}
-
 The options available for {\tt StaticProps} are as follows:
 \begin{longtable}[c]{|EFG|}
 \caption{StaticProps Command-line Options}
@@ -623,9 +618,9 @@ trajectories, \texttt{dynamicProps} will estimate the
 For large simulations, the trajectory files can sometimes reach
 sizes in excess of several gigabytes. In order to prevent a
 situation where the program runs out of memory due to large
-trajectories, \texttt{dynamicProps} will estimate the size of free
-memory at first, and determine the number of frames in each block,
-which allows the operating system to load two blocks of data
+trajectories, \texttt{dynamicProps} will first estimate the size of
+free memory, and determine the number of frames in each block, which
+will allow the operating system to load two blocks of data
 simultaneously without swapping. Upon reading two blocks of the
 trajectory, \texttt{dynamicProps} will calculate the time
 correlation within the first block and the cross correlations