--- trunk/tengDissertation/Appendix.tex 2006/06/08 07:34:15 2823
+++ trunk/tengDissertation/Appendix.tex 2006/06/09 04:07:13 2842
@@ -1,24 +1,19 @@
 \appendix
 \chapter{\label{chapt:oopse}Object-Oriented Parallel Simulation Engine}

-Designing object-oriented software is hard, and designing reusable
-object-oriented scientific software is even harder. Absence of
-applying modern software development practices is the bottleneck of
-Scientific Computing community\cite{Wilson2006}. For instance, in
-the last 20 years , there are quite a few MD packages that were
-developed to solve common MD problems and perform robust simulations
-. However, many of the codes are legacy programs that are either
-poorly organized or extremely complex. Usually, these packages were
-contributed by scientists without official computer science
-training. The development of most MD applications are lack of strong
-coordination to enforce design and programming guidelines. Moreover,
-most MD programs also suffer from missing design and implement
-documents which is crucial to the maintenance and extensibility.
-Along the way of studying structural and dynamic processes in
-condensed phase systems like biological membranes and nanoparticles,
-we developed and maintained an Object-Oriented Parallel Simulation
-Engine ({\sc OOPSE}). This new molecular dynamics package has some
-unique features
+The failure to apply modern software development practices is the
+bottleneck of the scientific computing community\cite{Wilson2006}. In
+the last 20 years, quite a few MD
+packages\cite{Brooks1983, Vincent1995, Kale1999} have been developed
+to solve common MD problems and perform robust simulations.
+Unfortunately, most of them are commercial programs that are either
+poorly written or extremely complicated. Consequently, researchers
+cannot easily reuse or extend those packages to do cutting-edge
+research effectively.
+While studying structural and
+dynamic processes in condensed phase systems like biological
+membranes and nanoparticles, we developed an open source
+Object-Oriented Parallel Simulation Engine ({\sc OOPSE}). This new
+molecular dynamics package has some unique features
 \begin{enumerate}
 \item {\sc OOPSE} performs Molecular Dynamics (MD) simulations on non-standard
 atom types (transition metals, point dipoles, sticky potentials,
@@ -64,11 +59,9 @@ as \texttt{StatProps} (see Sec.~\ref{appendixSection:S
 program of the package, \texttt{oopse} and its corresponding parallel
 version \texttt{oopse\_MPI}, as well as other useful utilities, such
 as \texttt{StatProps} (see Sec.~\ref{appendixSection:StaticProps}),
-\texttt{DynamicProps} (see
-Sec.~\ref{appendixSection:appendixSection:DynamicProps}),
-\texttt{Dump2XYZ} (see
-Sec.~\ref{appendixSection:appendixSection:Dump2XYZ}), \texttt{Hydro}
-(see Sec.~\ref{appendixSection:appendixSection:hydrodynamics})
+\texttt{DynamicProps} (see Sec.~\ref{appendixSection:DynamicProps}),
+\texttt{Dump2XYZ} (see Sec.~\ref{appendixSection:Dump2XYZ}),
+\texttt{Hydro} (see Sec.~\ref{appendixSection:hydrodynamics})
 \textit{etc}.

 \begin{figure}
@@ -91,58 +84,51 @@ solutions succinctly.
 reusable. They provide a ready-made solution that can be adapted to
 different problems as necessary. Patterns are expressive. They
 provide a common vocabulary of solutions that can express large
-solutions succinctly.
+solutions succinctly. As one of the latest advanced techniques to
+emerge from the object-oriented community, design patterns have been
+applied in some modern scientific software applications, such as
+JMol, {\sc OOPSE}\cite{Meineke2005} and PROTOMOL\cite{Matthey2004}
+\textit{etc}. The following sections enumerate some of the patterns
+used in {\sc OOPSE}.

-Patterns are usually described using a format that includes the
-following information:
-\begin{enumerate}
- \item The \emph{name} that is commonly used for the pattern.
-Good pattern names form a vocabulary for
- discussing conceptual abstractions. a pattern may have more than one commonly used or recognizable name
- in the literature. In this case it is common practice to document these nicknames or synonyms under
- the heading of \emph{Aliases} or \emph{Also Known As}.
- \item The \emph{motivation} or \emph{context} that this pattern applies
- to. Sometimes, it will include some prerequisites that should be satisfied before deciding to use a pattern
- \item The \emph{solution} to the problem that the pattern
- addresses. It describes how to construct the necessary work products. The description may include
- pictures, diagrams and prose which identify the pattern's structure, its participants, and their
- collaborations, to show how the problem is solved.
- \item The \emph{consequences} of using the given solution to solve a
- problem, both positive and negative.
-\end{enumerate}
-
-As one of the latest advanced techniques emerged from
-object-oriented community, design patterns were applied in some of
-the modern scientific software applications, such as JMol, {\sc
-OOPSE}\cite{Meineke05} and PROTOMOL\cite{Matthey05} \textit{etc}.
-The following sections enumerates some of the patterns used in {\sc
-OOPSE}.
-
 \subsection{\label{appendixSection:singleton}Singleton}
+
 The Singleton pattern not only provides a mechanism to restrict
 instantiation of a class to one object, but also provides a global
 point of access to the object. Currently implemented as a global
 variable, the logging utility which reports error and warning
 messages to the console in {\sc OOPSE} is a good candidate for
 applying the Singleton pattern to avoid the global namespace
-pollution.Although the singleton pattern can be implemented in
+pollution. Although the singleton pattern can be implemented in
 various ways to account for different aspects of the software
 designs, such as lifespan control \textit{etc}, we only use the
-static data approach in {\sc OOPSE}.
-{\tt IntegratorFactory} class
-is declared as
-\begin{lstlisting}[float,caption={[A classic Singleton design pattern implementation(I)] Declaration of {\tt IntegratorFactory} class.},label={appendixScheme:singletonDeclaration}]
+static data approach in {\sc OOPSE}. The declaration and
+implementation of the IntegratorFactory class are given in
+List.~\ref{appendixScheme:singletonDeclaration} and
+List.~\ref{appendixScheme:singletonImplementation}, respectively.
+Since the constructor is declared as protected, a client cannot
+instantiate IntegratorFactory directly. Moreover, since the member
+function getInstance serves as the only point of access to
+IntegratorFactory, this approach fulfills the basic requirement of a
+single instance. Another consequence of this approach is
+automatic destruction, since static data are destroyed upon program
+termination.
+\begin{lstlisting}[float,caption={[A classic Singleton design pattern implementation (I)] The declaration of a simple Singleton pattern.},label={appendixScheme:singletonDeclaration}]

-    class IntegratorFactory {
-    public:
-        static IntegratorFactory* getInstance();
-    protected:
-        IntegratorFactory();
-    private:
-        static IntegratorFactory* instance_;
-    };
+class IntegratorFactory {
+public:
+  static IntegratorFactory*
+  getInstance();
+protected:
+  IntegratorFactory();
+private:
+  static IntegratorFactory* instance_;
+};
+
 \end{lstlisting}
-The corresponding implementation is
-\begin{lstlisting}[float,caption={[A classic Singleton design pattern implementation(II)] Implementation of {\tt IntegratorFactory} class.},label={appendixScheme:singletonImplementation}]
+\begin{lstlisting}[float,caption={[A classic implementation of Singleton design pattern (II)] The implementation of a simple Singleton pattern.},label={appendixScheme:singletonImplementation}]
+
 IntegratorFactory::instance_ = NULL;

 IntegratorFactory* getInstance() {
@@ -151,136 +137,207 @@ IntegratorFactory* getInstance() {
   }
   return instance_;
 }
+
 \end{lstlisting}
-Since constructor is declared as {\tt protected}, a client can not
-instantiate {\tt IntegratorFactory} directly. Moreover, since the
-member function {\tt getInstance} serves as the only entry of access
-to {\tt IntegratorFactory}, this approach fulfills the basic
-requirement, a single instance. Another consequence of this approach
-is the automatic destruction since static data are destroyed upon
-program termination.
+
 \subsection{\label{appendixSection:factoryMethod}Factory Method}

 Categorized as a creational pattern, the Factory Method pattern deals
 with the problem of creating objects without specifying the exact
 class of object that will be created. Factory Method is typically
 implemented by delegating the creation operation to the subclasses.
+{\sc OOPSE} uses the Parameterized Factory pattern, in which the
+factory method (the createIntegrator member function) creates
+products based on an identifier (see
+List.~\ref{appendixScheme:factoryDeclaration}). If the identifier
+has already been registered, the factory method will
+invoke the corresponding creator (see
+List.~\ref{appendixScheme:integratorCreator}), which uses
+C++ templates to avoid excess subclassing.

-Registers a creator with a type identifier. Looks up the type
-identifier in the internal map. If it is found, it invokes the
-corresponding creator for the type identifier and returns its
-result.
-\begin{lstlisting}[float,caption={[].},label={appendixScheme:factoryDeclaration}]
-    class IntegratorCreator;
-    class IntegratorFactory {
-    public:
-        typedef std::map<string, IntegratorCreator*> CreatorMapType;
+\begin{lstlisting}[float,caption={[The implementation of Parameterized Factory pattern (I)]Source code of IntegratorFactory class.},label={appendixScheme:factoryDeclaration}]

-        bool registerIntegrator(IntegratorCreator* creator);
+class IntegratorFactory {
+public:
+  typedef std::map<string, IntegratorCreator*> CreatorMapType;

-        Integrator* createIntegrator(const string& id, SimInfo* info);
-
-    private:
-        CreatorMapType creatorMap_;
-    };
-\end{lstlisting}
-
-\begin{lstlisting}[float,caption={[].},label={appendixScheme:factoryDeclarationImplementation}]
-    bool IntegratorFactory::unregisterIntegrator(const string& id) {
-        return creatorMap_.erase(id) == 1;
+  bool registerIntegrator(IntegratorCreator* creator) {
+    return creatorMap_.insert(
+        CreatorMapType::value_type(creator->getIdent(), creator)).second;
   }

-    Integrator*
-    IntegratorFactory::createIntegrator(const string& id, SimInfo* info) {
+  Integrator* createIntegrator(const string& id, SimInfo* info) {
+    Integrator* result = NULL;
     CreatorMapType::iterator i = creatorMap_.find(id);
     if (i != creatorMap_.end()) {
-        //invoke functor to create object
-        return (i->second)->create(info);
-    } else {
-        return NULL;
+      result = (i->second)->create(info);
     }
+    return result;
   }
+
+private:
+  CreatorMapType creatorMap_;
+};
 \end{lstlisting}

-\begin{lstlisting}[float,caption={[].},label={appendixScheme:integratorCreator}]
+\begin{lstlisting}[float,caption={[The implementation of Parameterized Factory pattern (III)]Source code of creator classes.},label={appendixScheme:integratorCreator}]

-    class IntegratorCreator {
-    public:
+class IntegratorCreator {
+public:
   IntegratorCreator(const string& ident) : ident_(ident) {}

   const string& getIdent() const { return ident_; }

   virtual Integrator* create(SimInfo* info) const = 0;

-    private:
+private:
   string ident_;
-    };
+};

-    template<class ConcreteIntegrator>
-    class IntegratorBuilder : public IntegratorCreator {
-    public:
-        IntegratorBuilder(const string& ident) : IntegratorCreator(ident) {}
-        virtual Integrator* create(SimInfo* info) const {
-            return new ConcreteIntegrator(info);
-        }
-    };
+template<class ConcreteIntegrator>
+class IntegratorBuilder : public IntegratorCreator {
+public:
+  IntegratorBuilder(const string& ident)
+    : IntegratorCreator(ident) {}
+  virtual Integrator* create(SimInfo* info) const {
+    return new ConcreteIntegrator(info);
+  }
+};
 \end{lstlisting}

 \subsection{\label{appendixSection:visitorPattern}Visitor}

-The purpose of the Visitor Pattern is to encapsulate an operation
-that you want to perform on the elements of a data structure. In
-this way, you can change the operation being performed on a
-structure without the need of changing the class heirarchy of the
-elements that you are operating on.
+The Visitor pattern is designed to decouple data structures from the
+algorithms that operate on them by collecting related operations from
+the element classes into separate visitor classes, which is equivalent
+to adding virtual functions to a set of classes without modifying
+their interfaces. Fig.~\ref{appendixFig:visitorUML} shows the
+structure of the Visitor pattern, which is used extensively in {\tt
+Dump2XYZ}. In order to convert an OOPSE dump file, a series of
+distinct operations are performed on different StuntDoubles (see the
+class hierarchy in Fig.~\ref{oopseFig:hierarchy} and the declaration
+in List.~\ref{appendixScheme:element}). Since the hierarchy
+remains stable, it is easy to define a visit operation (see
+List.~\ref{appendixScheme:visitor}) for each class of StuntDouble.
+Note that, using the Composite pattern\cite{Gamma1994}, CompositeVisitor
+manages a prioritized list of visitors and handles the execution of
+every visitor in that list on different StuntDoubles.
-\begin{lstlisting}[float,caption={[].},label={appendixScheme:visitor}]
-    class BaseVisitor{
-    public:
-        virtual void visit(Atom* atom);
-        virtual void visit(DirectionalAtom* datom);
-        virtual void visit(RigidBody* rb);
-    };
+\begin{figure}
+\centering
+\includegraphics[width=\linewidth]{visitor.eps}
+\caption[The UML class diagram of Visitor pattern] {The UML class
+diagram of Visitor pattern.} \label{appendixFig:visitorUML}
+\end{figure}
+
+\begin{figure}
+\centering
+\includegraphics[width=\linewidth]{hierarchy.eps}
+\caption[Class hierarchy for objects in {\sc OOPSE}]{ A diagram of
+the class hierarchy. } \label{oopseFig:hierarchy}
+\end{figure}
+
+\begin{lstlisting}[float,caption={[The implementation of Visitor pattern (II)]Source code of the element classes.},label={appendixScheme:element}]
+
+class StuntDouble { public:
+  virtual void accept(BaseVisitor* v) = 0;
+};
+
+class Atom: public StuntDouble { public:
+  virtual void accept(BaseVisitor* v) {
+    v->visit(this);
+  }
+};
+
+class DirectionalAtom: public Atom { public:
+  virtual void accept(BaseVisitor* v) {
+    v->visit(this);
+  }
+};
+
+class RigidBody: public StuntDouble { public:
+  virtual void accept(BaseVisitor* v) {
+    v->visit(this);
+  }
+};
+
 \end{lstlisting}

-\begin{lstlisting}[float,caption={[].},label={appendixScheme:element}]
-    class StuntDouble {
-    public:
-        virtual void accept(BaseVisitor* v) = 0;
-    };

-    class Atom: public StuntDouble {
-    public:
-        virtual void accept{BaseVisitor* v*} {v->visit(this);}
-    };
+\begin{lstlisting}[float,caption={[The implementation of Visitor pattern (I)]Source code of the visitor classes.},label={appendixScheme:visitor}]

-    class DirectionalAtom: public Atom {
-    public:
-        virtual void accept{BaseVisitor* v*} {v->visit(this);}
-    };
+class BaseVisitor{
+public:
+  virtual void visit(Atom* atom);
+  virtual void visit(DirectionalAtom* datom);
+  virtual void visit(RigidBody* rb);
+};

-    class RigidBody: public StuntDouble {
-    public:
-        virtual void accept{BaseVisitor* v*} {v->visit(this);}
-    };
+class BaseAtomVisitor:public BaseVisitor{ public:
+  virtual void visit(Atom* atom);
+  virtual void visit(DirectionalAtom* datom);
+  virtual void visit(RigidBody* rb);
+};

+class SSDAtomVisitor:public BaseAtomVisitor{ public:
+  virtual void visit(Atom* atom);
+  virtual void visit(DirectionalAtom* datom);
+  virtual void visit(RigidBody* rb);
+};
+
+class CompositeVisitor: public BaseVisitor {
+public:
+
+  typedef list<pair<BaseVisitor*, int> > VistorListType;
+  typedef VistorListType::iterator VisitorListIterator;
+  virtual void visit(Atom* atom) {
+    VisitorListIterator i;
+    for(i = visitorList.begin();i != visitorList.end();++i) {
+      atom->accept(i->first);
+    }
+  }
+
+  virtual void visit(DirectionalAtom* datom) {
+    VisitorListIterator i;
+    for(i = visitorList.begin();i != visitorList.end();++i) {
+      datom->accept(i->first);
+    }
+  }
+
+  virtual void visit(RigidBody* rb) {
+    VisitorListIterator i;
+    std::vector<Atom*> myAtoms;
+    std::vector<Atom*>::iterator ai;
+    myAtoms = rb->getAtoms();
+    for(i = visitorList.begin();i != visitorList.end();++i) {
+      rb->accept(i->first);
+      for(ai = myAtoms.begin(); ai != myAtoms.end(); ++ai){
+        (*ai)->accept(i->first);
+      }
+    }
+  }
+
+  void addVisitor(BaseVisitor* v, int priority);
+
+protected:
+  VistorListType visitorList;
+};
+
 \end{lstlisting}
+
 \section{\label{appendixSection:concepts}Concepts}

 OOPSE manipulates both traditional atoms as well as some objects
 that {\it behave like atoms}. These objects can be rigid
 collections of atoms or atoms which have orientational degrees of
-freedom. Here is a diagram of the class heirarchy:
-
-%\begin{figure}
-%\centering
-%\includegraphics[width=3in]{heirarchy.eps}
-%\caption[Class heirarchy for StuntDoubles in {\sc oopse}-3.0]{ \\
-%The class heirarchy of StuntDoubles in {\sc oopse}-3.0. The
-%selection syntax allows the user to select any of the objects that
-%are descended from a StuntDouble.} \label{oopseFig:heirarchy}
-%\end{figure}
-
+freedom.
+A diagram of the class hierarchy is shown in
+Fig.~\ref{oopseFig:hierarchy}. Every Molecule, Atom and
+DirectionalAtom in {\sc OOPSE} has its own name, which is
+specified in the {\tt .md} file. In contrast, RigidBodies are
+denoted by their membership and index inside a particular molecule:
+[MoleculeName]\_RB\_[index] (the contents inside the brackets depend
+on the specifics of the simulation). The names of rigid bodies are
+generated automatically. For example, the name of the first rigid
+body in a DMPC molecule is DMPC\_RB\_0.
 \begin{itemize}
 \item A {\bf StuntDouble} is {\it any} object that can be manipulated by the
 integrators and minimizers.
@@ -290,25 +347,20 @@ Every Molecule, Atom and DirectionalAtom in {\sc OOPSE
 DirectionalAtom}s which behaves as a single unit.
 \end{itemize}

-Every Molecule, Atom and DirectionalAtom in {\sc OOPSE} have their
-own names which are specified in the {\tt .md} file. In contrast,
-RigidBodies are denoted by their membership and index inside a
-particular molecule: [MoleculeName]\_RB\_[index] (the contents
-inside the brackets depend on the specifics of the simulation). The
-names of rigid bodies are generated automatically. For example, the
-name of the first rigid body in a DMPC molecule is DMPC\_RB\_0.
-
 \section{\label{appendixSection:syntax}Syntax of the Select Command}

-The most general form of the select command is: {\tt select {\it
-expression}}. This expression represents an arbitrary set of
-StuntDoubles (Atoms or RigidBodies) in {\sc OOPSE}. Expressions are
-composed of either name expressions, index expressions, predefined
-sets, user-defined expressions, comparison operators, within
-expressions, or logical combinations of the above expression types.
-Expressions can be combined using parentheses and the Boolean
-operators.
+{\sc OOPSE} provides a powerful selection utility to select
+StuntDoubles. The most general form of the select command is:

+{\tt select {\it expression}}.
+
+This expression represents an arbitrary set of StuntDoubles (Atoms
+or RigidBodies) in {\sc OOPSE}. Expressions are composed of either
+name expressions, index expressions, predefined sets, user-defined
+expressions, comparison operators, within expressions, or logical
+combinations of the above expression types. Expressions can be
+combined using parentheses and the Boolean operators.
+
 \subsection{\label{appendixSection:logical}Logical expressions}

 The logical operators allow complex queries to be constructed out of
@@ -440,8 +492,25 @@ of a selected atom or rigid body.
 and other atoms of type $B$, $g_{AB}(r)$. {\tt StaticProps} can
 also be used to compute the density distributions of other molecules
 in a reference frame {\it fixed to the body-fixed reference frame}
-of a selected atom or rigid body.
+of a selected atom or rigid body. Because the StuntDoubles
+selected by the two selections may overlap, {\tt
+StaticProps} performs the calculation in three stages, which are
+illustrated in Fig.~\ref{oopseFig:staticPropsProcess}.

+\begin{figure}
+\centering
+\includegraphics[width=\linewidth]{staticPropsProcess.eps}
+\caption[A representation of the three-stage correlations in
+\texttt{StaticProps}]{This diagram illustrates the three-stage
+processing used by \texttt{StaticProps}. $S_1$ and $S_2$ are the
+numbers of StuntDoubles selected by {\tt -{}-sele1} and {\tt
+-{}-sele2} respectively, while $C$ is the number of StuntDoubles
+appearing in both sets. The first stage ($S_1-C$ and $S_2$) and
+second stage ($S_1$ and $S_2-C$) are completely non-overlapping. In
+contrast, the third stage ($C$ and $C$) is completely
+overlapping.} \label{oopseFig:staticPropsProcess}
+\end{figure}
+
 There are five separate radial distribution functions available in
 OOPSE.
 Since every radial distribution function involves the
 calculation between pairs of bodies, {\tt -{}-sele1} and {\tt
@@ -485,7 +554,8 @@ distribution functions are most easily seen in the fig
 \end{description}

 The vectors (and angles) associated with these angular pair
-distribution functions are most easily seen in the figure below:
+distribution functions are most easily seen in
+Fig.~\ref{oopseFig:gofr}.

 \begin{figure}
 \centering
@@ -496,25 +566,6 @@ Due to the fact that the selected StuntDoubles from tw
 their body-fixed frames.} \label{oopseFig:gofr}
 \end{figure}

-Due to the fact that the selected StuntDoubles from two selections
-may be overlapped, {\tt StaticProps} performs the calculation in
-three stages which are illustrated in
-Fig.~\ref{oopseFig:staticPropsProcess}.
-
-\begin{figure}
-\centering
-\includegraphics[width=\linewidth]{staticPropsProcess.eps}
-\caption[A representation of the three-stage correlations in
-\texttt{StaticProps}]{This diagram illustrates three-stage
-processing used by \texttt{StaticProps}. $S_1$ and $S_2$ are the
-numbers of selected stuntdobules from {\tt -{}-sele1} and {\tt
--{}-sele2} respectively, while $C$ is the number of stuntdobules
-appearing at both sets. The first stage($S_1-C$ and $S_2$) and
-second stages ($S_1$ and $S_2-C$) are completely non-overlapping. On
-the contrary, the third stage($C$ and $C$) are completely
-overlapping} \label{oopseFig:staticPropsProcess}
-\end{figure}
-
 The options available for {\tt StaticProps} are as follows:
 \begin{longtable}[c]{|EFG|}
 \caption{StaticProps Command-line Options}
@@ -577,12 +628,11 @@ sizes in excess of several gigabytes. In order to effe
 select different types of atoms is already present in the code.

 For large simulations, the trajectory files can sometimes reach
-sizes in excess of several gigabytes. In order to effectively
-analyze that amount of data.
-In order to prevent a situation where
-the program runs out of memory due to large trajectories,
-\texttt{dynamicProps} will estimate the size of free memory at
-first, and determine the number of frames in each block, which
-allows the operating system to load two blocks of data
+sizes in excess of several gigabytes. To prevent a
+situation where the program runs out of memory due to large
+trajectories, \texttt{dynamicProps} first estimates the amount of
+free memory and then determines the number of frames in each block,
+which allows the operating system to load two blocks of data
 simultaneously without swapping. Upon reading two blocks of the
 trajectory, \texttt{dynamicProps} will calculate the time
 correlation within the first block and the cross correlations