\chapter{\label{chapt:intro}Introduction and Theoretical Background}


The techniques used in the course of this research fall under the two
main classes of molecular simulation: Molecular Dynamics and Monte
Carlo. Molecular Dynamics simulations integrate the equations of motion
for a given system of particles, allowing the researcher to gain
insight into the time-dependent evolution of a system. Diffusion
phenomena are readily studied with this simulation technique, making
Molecular Dynamics the main simulation technique used in this
research. Other aspects of the research fall under the Monte Carlo
class of simulations. In Monte Carlo, the configuration space
available to the collection of particles is sampled stochastically,
or randomly. Each configuration is chosen with a probability
based on the Maxwell-Boltzmann distribution. These types of simulations
are best used to probe properties of a system that depend only
on the state of the system. Structural information about a system
is most readily obtained through these types of methods.

Although the two techniques employed seem dissimilar, they are both
linked by the overarching principles of Statistical
Thermodynamics. Statistical Thermodynamics governs the behavior of
both classes of simulations and dictates what each method can and
cannot do. When investigating a system, one must first analyze what
thermodynamic properties of the system are being probed, then choose
which method best suits that objective.

\section{\label{introSec:statThermo}Statistical Mechanics}

The following section serves as a brief introduction to some of the
Statistical Mechanics concepts present in this dissertation. What
follows is a brief derivation of Boltzmann weighted statistics, and an
explanation of how one can use the information to calculate an
observable for a system. This section then concludes with a brief
discussion of the ergodic hypothesis and its relevance to this
research.

\subsection{\label{introSec:boltzman}Boltzmann weighted statistics}

Consider a system, $\gamma$, with some total energy, $E_{\gamma}$.
Let $\Omega(E_{\gamma})$ represent the number of degenerate ways
$\boldsymbol{\Gamma}$, the collection of positions and conjugate
momenta of system $\gamma$, can be configured to give
$E_{\gamma}$. Further, if $\gamma$ is in contact with a bath system
where energy is exchanged between the two systems, $\Omega(E)$, where
$E$ is the total energy of both systems, can be represented as
\begin{equation}
\Omega(E) = \Omega(E_{\gamma}) \times \Omega(E - E_{\gamma})
\label{introEq:SM1}
\end{equation}
Or additively as
\begin{equation}
\ln \Omega(E) = \ln \Omega(E_{\gamma}) + \ln \Omega(E - E_{\gamma})
\label{introEq:SM2}
\end{equation}

The solution to Eq.~\ref{introEq:SM2} maximizes the number of
degenerate configurations in $E$.\cite{Frenkel1996}
This gives
\begin{equation}
\frac{\partial \ln \Omega(E)}{\partial E_{\gamma}} = 0 =
\frac{\partial \ln \Omega(E_{\gamma})}{\partial E_{\gamma}}
+
\frac{\partial \ln \Omega(E_{\text{bath}})}{\partial E_{\text{bath}}}
\frac{\partial E_{\text{bath}}}{\partial E_{\gamma}}
\label{introEq:SM3}
\end{equation}
Where $E_{\text{bath}}$ is $E-E_{\gamma}$, and
$\frac{\partial E_{\text{bath}}}{\partial E_{\gamma}}$ is
$-1$. Eq.~\ref{introEq:SM3} becomes
\begin{equation}
\frac{\partial \ln \Omega(E_{\gamma})}{\partial E_{\gamma}} =
\frac{\partial \ln \Omega(E_{\text{bath}})}{\partial E_{\text{bath}}}
\label{introEq:SM4}
\end{equation}

At this point, one can draw a relationship between the maximization of
degeneracy in Eq.~\ref{introEq:SM3} and the second law of
thermodynamics. Namely, for a closed system, entropy will
increase for an irreversible process.\cite{chandler:1987} Here the
process is the partitioning of energy between the two systems. This
allows the following definition of entropy:
\begin{equation}
S = k_B \ln \Omega(E)
\label{introEq:SM5}
\end{equation}
Where $k_B$ is the Boltzmann constant. Having defined entropy, one can
also define the temperature of the system using the relation
\begin{equation}
\frac{1}{T} = \biggl ( \frac{\partial S}{\partial E} \biggr )_{N,V}
\label{introEq:SM6}
\end{equation}
The temperature of the system $\gamma$ is then
\begin{equation}
\beta( E_{\gamma} ) = \frac{1}{k_B T} =
\frac{\partial \ln \Omega(E_{\gamma})}{\partial E_{\gamma}}
\label{introEq:SM7}
\end{equation}
Applying this to Eq.~\ref{introEq:SM4} gives the following
\begin{equation}
\beta( E_{\gamma} ) = \beta( E_{\text{bath}} )
\label{introEq:SM8}
\end{equation}
This shows that the partitioning of energy between the two systems is
actually a process of thermal equilibration.\cite{Frenkel1996}

An application of these results is to derive the form of the
expectation value of an observable, $A$, in the canonical ensemble. In
the canonical ensemble, the number of particles, $N$, the volume, $V$,
and the temperature, $T$, are all held constant while the energy, $E$,
is allowed to fluctuate. Returning to the previous example, the bath
system is now an infinitely large thermal bath, whose exchange of
energy with the system $\gamma$ holds the temperature constant. The
partitioning of energy in the bath system is then related to the total
energy of both systems and the fluctuations in $E_{\gamma}$:
\begin{equation}
\Omega( E_{\text{bath}} ) = \Omega( E - E_{\gamma} )
\label{introEq:SM9}
\end{equation}
The expectation value can then be defined as
\begin{equation}
\langle A \rangle =
\int\limits_{\boldsymbol{\Gamma}} d\boldsymbol{\Gamma}
P_{\gamma} A(\boldsymbol{\Gamma})
\label{introEq:SM10}
\end{equation}
Where $\int\limits_{\boldsymbol{\Gamma}} d\boldsymbol{\Gamma}$ denotes
an integration over all accessible phase space, $P_{\gamma}$ is the
probability of being in a given phase state, and
$A(\boldsymbol{\Gamma})$ is some observable that is a function of the
phase state.

Because entropy seeks to maximize the number of degenerate states at a
given energy, the probability of being in a particular state in
$\gamma$ will be directly proportional to the number of allowable
states the coupled system is able to assume. Namely,
\begin{equation}
P_{\gamma} \propto \Omega( E_{\text{bath}} ) =
e^{\ln \Omega( E - E_{\gamma})}
\label{introEq:SM11}
\end{equation}
With $E_{\gamma} \ll E$, $\ln \Omega$ can be expanded in a Taylor series:
\begin{equation}
\ln \Omega ( E - E_{\gamma}) =
\ln \Omega (E) -
E_{\gamma}  \frac{\partial \ln \Omega }{\partial E}
+ \ldots
\label{introEq:SM12}
\end{equation}
Higher order terms are omitted as $E$ is an infinite thermal
bath. Further, using Eq.~\ref{introEq:SM7}, Eq.~\ref{introEq:SM11} can
be rewritten:
\begin{equation}
P_{\gamma} \propto e^{-\beta E_{\gamma}}
\label{introEq:SM13}
\end{equation}
Where $\ln \Omega(E)$ has been factored out of the proportionality as a
constant. Normalizing the probability ($\int\limits_{\boldsymbol{\Gamma}}
d\boldsymbol{\Gamma} P_{\gamma} = 1$) gives
\begin{equation}
P_{\gamma} = \frac{e^{-\beta E_{\gamma}}}
{\int\limits_{\boldsymbol{\Gamma}} d\boldsymbol{\Gamma} e^{-\beta E_{\gamma}}}
\label{introEq:SM14}
\end{equation}
This result is the standard Boltzmann statistical distribution.
Applying it to Eq.~\ref{introEq:SM10} one can obtain the following relation for ensemble averages:
\begin{equation}
\langle A \rangle =
\frac{\int\limits_{\boldsymbol{\Gamma}} d\boldsymbol{\Gamma}
A( \boldsymbol{\Gamma} ) e^{-\beta E_{\gamma}}}
{\int\limits_{\boldsymbol{\Gamma}} d\boldsymbol{\Gamma} e^{-\beta E_{\gamma}}}
\label{introEq:SM15}
\end{equation}

\subsection{\label{introSec:ergodic}The Ergodic Hypothesis}

One last important consideration is that of ergodicity. Ergodicity is
the assumption that given an infinite amount of time, a system will
visit every available point in phase space.\cite{Frenkel1996} For most
systems, this is a valid assumption, except in cases where the system
may be trapped in a local region of phase space
(\emph{e.g.}~glasses). When valid, ergodicity allows the unification
of a time averaged observation and an ensemble averaged one. If an
observation is averaged over a sufficiently long time, the system is
assumed to visit all appropriately available points in phase space,
giving a properly weighted statistical average. This allows the
researcher freedom of choice when deciding how best to measure a given
observable. When an ensemble averaged approach seems most logical, the
Monte Carlo techniques described in Sec.~\ref{introSec:monteCarlo} can
be utilized. Conversely, if a problem lends itself to a time averaging
approach, the Molecular Dynamics techniques in Sec.~\ref{introSec:MD}
can be employed.

\section{\label{introSec:monteCarlo}Monte Carlo Simulations}

The Monte Carlo method was developed by Metropolis and Ulam for their
work on fissionable material.\cite{metropolis:1949} The method is so
named because of its heavy use of random numbers in its
solution.\cite{allen87:csl} The Monte Carlo method allows for the
solution of integrals through the stochastic sampling of the
integrand. In the simplest case, the evaluation of an
integral would follow a brute force method of
sampling.\cite{Frenkel1996} Consider the following one-dimensional
integral:
\begin{equation}
I = \int^b_a f(x)dx
\label{eq:MCex1}
\end{equation}
The equation can be recast as:
\begin{equation}
I = (b-a)\langle f(x) \rangle
\label{eq:MCex2}
\end{equation}
Where $\langle f(x) \rangle$ is the unweighted average of $f(x)$ over
the interval $[a,b]$. The integral can then be estimated by randomly
choosing points along the interval $[a,b]$ and calculating the value
of $f(x)$ at each point. The accumulated average, scaled by $(b-a)$,
approaches $I$ in the limit where the number of trials is infinitely
large.
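
As a concrete illustration, the following is a minimal sketch of this
brute force approach (in Python, for illustration only; the integrand
and trial count are arbitrary choices, not taken from this work):
\begin{verbatim}
import random

def mc_integrate(f, a, b, n_trials):
    """Estimate I = integral of f from a to b by unweighted sampling."""
    total = 0.0
    for _ in range(n_trials):
        x = random.uniform(a, b)   # random point on [a, b]
        total += f(x)
    return (b - a) * total / n_trials   # (b - a) * <f(x)>

# Example: integral of x^2 on [0, 1] is 1/3
print(mc_integrate(lambda x: x * x, 0.0, 1.0, 100000))
\end{verbatim}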

However, in Statistical Mechanics, one is typically interested in
integrals of the form:
\begin{equation}
\langle A \rangle = \frac{\int d^N \mathbf{r}~A(\mathbf{r}^N)%
e^{-\beta V(\mathbf{r}^N)}}%
{\int d^N \mathbf{r}~e^{-\beta V(\mathbf{r}^N)}}
\label{eq:mcEnsAvg}
\end{equation}
Where $\mathbf{r}^N$ stands for the coordinates of all $N$ particles
and $A$ is some observable that depends only on position. This is
the ensemble average of $A$ as presented in
Sec.~\ref{introSec:statThermo}, except here $A$ is independent of
momentum. Therefore the momentum contribution of the integral can be
factored out, leaving the configurational integral. Application of the
brute force method to this system would be highly inefficient. Due to
the Boltzmann weighting of this integral, most random
configurations will make a near-zero contribution to the ensemble
average. This is where importance sampling comes into
play.\cite{allen87:csl}

Importance sampling is a method in which one selects the distribution
from which the random configurations are chosen in order to more
efficiently calculate the integral.\cite{Frenkel1996} Consider again
Eq.~\ref{eq:MCex1} rewritten to be:
\begin{equation}
I = \int^b_a \frac{f(x)}{\rho(x)} \rho(x) dx
\label{introEq:Importance1}
\end{equation}
Where $\rho(x)$ is an arbitrary probability distribution in $x$. If
one conducts $\tau$ trials selecting a random number, $\zeta_\tau$,
from the distribution $\rho(x)$ on the interval $[a,b]$, then
Eq.~\ref{introEq:Importance1} becomes
\begin{equation}
I= \biggl \langle \frac{f(x)}{\rho(x)} \biggr \rangle_{\text{trials}}
\label{introEq:Importance2}
\end{equation}
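
As a sketch of this idea (in Python, with a hypothetical integrand and
a $\rho(x)$ chosen for convenience, not taken from this work), the
trial average of $f(x)/\rho(x)$ estimates the integral:
\begin{verbatim}
import random, math

def importance_sample(f, rho_pdf, rho_sampler, n_trials):
    """Estimate the integral of f as the trial average of f(x)/rho(x),
    with each x drawn from the distribution rho."""
    total = 0.0
    for _ in range(n_trials):
        x = rho_sampler()            # zeta_tau drawn from rho(x)
        total += f(x) / rho_pdf(x)
    return total / n_trials

# Example: integral of e^{-x} on [0, inf) is 1; sampling from
# rho(x) = e^{-x} makes f/rho constant, so the estimate is exact.
f = lambda x: math.exp(-x)
rho = lambda x: math.exp(-x)
print(importance_sample(f, rho, lambda: random.expovariate(1.0), 10000))
\end{verbatim}
The example also shows why a well chosen $\rho(x)$ helps: the closer
$\rho(x)$ is in shape to $f(x)$, the smaller the variance of the trial
average.
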
Looking at Eq.~\ref{eq:mcEnsAvg}, one recognizes the Boltzmann
distribution
\begin{equation}
\rho_{kT}(\mathbf{r}^N) =
\frac{e^{-\beta V(\mathbf{r}^N)}}
{\int d^N \mathbf{r}~e^{-\beta V(\mathbf{r}^N)}}
\label{introEq:MCboltzman}
\end{equation}
so that the ensemble average can be rewritten as
\begin{equation}
\langle A \rangle = \int d^N \mathbf{r}~A(\mathbf{r}^N)
\rho_{kT}(\mathbf{r}^N)
\label{introEq:Importance3}
\end{equation}
Applying Eqs.~\ref{introEq:Importance1} and \ref{introEq:Importance2}
one obtains
\begin{equation}
\langle A \rangle = \biggl \langle
\frac{ A \rho_{kT}(\mathbf{r}^N) }
{\rho(\mathbf{r}^N)} \biggr \rangle_{\text{trials}}
\label{introEq:Importance4}
\end{equation}
By selecting $\rho(\mathbf{r}^N)$ to be $\rho_{kT}(\mathbf{r}^N)$,
Eq.~\ref{introEq:Importance4} becomes
\begin{equation}
\langle A \rangle = \langle A(\mathbf{r}^N) \rangle_{\text{trials}}
\label{introEq:Importance5}
\end{equation}
The difficulty is selecting points $\mathbf{r}^N$ such that they are
sampled from the distribution $\rho_{kT}(\mathbf{r}^N)$. A solution
was proposed by Metropolis \emph{et al.}\cite{metropolis:1953} which
involved the use of a Markov chain whose limiting distribution was
$\rho_{kT}(\mathbf{r}^N)$.

\subsection{\label{introSec:markovChains}Markov Chains}

A Markov chain is a chain of states satisfying the following
conditions:\cite{leach01:mm}
\begin{enumerate}
\item The outcome of each trial depends only on the outcome of the previous trial.
\item Each trial belongs to a finite set of outcomes called the state space.
\end{enumerate}
Given two configurations, $\mathbf{r}^N_m$ and $\mathbf{r}^N_n$, let
$\rho_m$ and $\rho_n$ be the probabilities of being in states
$\mathbf{r}^N_m$ and $\mathbf{r}^N_n$ respectively. Further, the two
states are linked by a transition probability, $\pi_{mn}$, which is the
probability of going from state $m$ to state $n$.

\newcommand{\accMe}{\operatorname{acc}}

The transition probability is given by the following:
\begin{equation}
\pi_{mn} = \alpha_{mn} \times \accMe(m \rightarrow n)
\label{introEq:MCpi}
\end{equation}
Where $\alpha_{mn}$ is the probability of attempting the move $m
\rightarrow n$, and $\accMe$ is the probability of accepting the move
$m \rightarrow n$. Defining a probability vector,
$\boldsymbol{\rho}$, such that
\begin{equation}
\boldsymbol{\rho} = \{\rho_1, \rho_2, \ldots \rho_m, \rho_n,
\ldots \rho_N \}
\label{introEq:MCrhoVector}
\end{equation}
a transition matrix $\boldsymbol{\Pi}$ can be defined,
whose elements are $\pi_{mn}$, for each given transition. The
limiting distribution of the Markov chain can then be found by
applying the transition matrix an infinite number of times to the
distribution vector.
\begin{equation}
\boldsymbol{\rho}_{\text{limit}} =
\lim_{N \rightarrow \infty} \boldsymbol{\rho}_{\text{initial}}
\boldsymbol{\Pi}^N
\label{introEq:MCmarkovLimit}
\end{equation}
The limiting distribution of the chain is independent of the starting
distribution, and successive applications of the transition matrix
will only yield the limiting distribution again.
\begin{equation}
\boldsymbol{\rho}_{\text{limit}} = \boldsymbol{\rho}_{\text{limit}}
\boldsymbol{\Pi}
\label{introEq:MCmarkovEquil}
\end{equation}
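
This limiting behavior is easy to demonstrate numerically. The
following sketch (in Python with NumPy; the $3 \times 3$ transition
matrix is hypothetical) applies a transition matrix repeatedly to an
arbitrary starting distribution:
\begin{verbatim}
import numpy as np

# A hypothetical 3-state transition matrix; each row sums to one.
Pi = np.array([[0.5, 0.4, 0.1],
               [0.2, 0.6, 0.2],
               [0.1, 0.4, 0.5]])

rho = np.array([1.0, 0.0, 0.0])   # arbitrary starting distribution

# Repeated application of Pi drives rho toward the limiting
# distribution, independent of the starting vector.
for _ in range(1000):
    rho = rho @ Pi

print(rho)             # the limiting distribution
print(rho @ Pi - rho)  # ~0: further applications change nothing
\end{verbatim}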

\subsection{\label{introSec:metropolisMethod}The Metropolis Method}

In the Metropolis method\cite{metropolis:1953}
Eq.~\ref{introEq:MCmarkovEquil} is solved such that
$\boldsymbol{\rho}_{\text{limit}}$ matches the Boltzmann distribution
of states. The method accomplishes this by imposing the strong
condition of microscopic reversibility on the equilibrium
distribution. That is, at equilibrium the probability of going
from $m$ to $n$ is the same as that of going from $n$ to $m$.
\begin{equation}
\rho_m\pi_{mn} = \rho_n\pi_{nm}
\label{introEq:MCmicroReverse}
\end{equation}
Further, $\boldsymbol{\alpha}$ is chosen to be a symmetric matrix in
the Metropolis method. Using Eq.~\ref{introEq:MCpi},
Eq.~\ref{introEq:MCmicroReverse} becomes
\begin{equation}
\frac{\accMe(m \rightarrow n)}{\accMe(n \rightarrow m)} =
\frac{\rho_n}{\rho_m}
\label{introEq:MCmicro2}
\end{equation}
For a Boltzmann limiting distribution,
\begin{equation}
\frac{\rho_n}{\rho_m} = e^{-\beta[\mathcal{U}(n) - \mathcal{U}(m)]}
= e^{-\beta \Delta \mathcal{U}}
\label{introEq:MCmicro3}
\end{equation}
This allows the following set of acceptance rules to be defined:
\begin{equation}
\accMe( m \rightarrow n ) =
\begin{cases}
1& \text{if $\Delta \mathcal{U} \leq 0$,} \\
e^{-\beta \Delta \mathcal{U}}& \text{if $\Delta \mathcal{U} > 0$.}
\end{cases}
\label{introEq:accRules}
\end{equation}

Using the acceptance criteria from Eq.~\ref{introEq:accRules} the
Metropolis method proceeds as follows:
\begin{enumerate}
\item Generate an initial configuration $\mathbf{r}^N$ which has some finite probability in $\rho_{kT}$.
\item Modify $\mathbf{r}^N$ to generate configuration $\mathbf{r^{\prime}}^N$.
\item If the new configuration lowers the energy of the system, accept the move with unit probability ($\mathbf{r}^N$ becomes $\mathbf{r^{\prime}}^N$). Otherwise accept with probability $e^{-\beta \Delta \mathcal{U}}$.
\item Accumulate the average for the configurational observable of interest.
\item Repeat from step 2 until the average converges.
\end{enumerate}
One important note is that the average is accumulated whether the move
is accepted or not; this ensures proper weighting of the average.
Using Eq.~\ref{introEq:Importance4} it becomes clear that the
accumulated averages are the ensemble averages, as this method ensures
that the limiting distribution is the Boltzmann distribution.
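
A minimal sketch of these steps (in Python, for a single particle in a
hypothetical one-dimensional harmonic well with $\beta = 1$, not a
system studied in this work) might look like:
\begin{verbatim}
import random, math

def potential(x):                 # hypothetical energy U(x)
    return 0.5 * x * x

beta, n_steps, max_disp = 1.0, 100000, 0.5
x, acc_sum = 0.0, 0.0             # step 1: initial configuration

for step in range(n_steps):
    x_new = x + random.uniform(-max_disp, max_disp)  # step 2: trial move
    dU = potential(x_new) - potential(x)
    # step 3: accept with unit probability if dU <= 0, otherwise
    # with probability exp(-beta * dU)
    if dU <= 0.0 or random.random() < math.exp(-beta * dU):
        x = x_new
    acc_sum += x * x              # step 4: accumulate every step

print(acc_sum / n_steps)          # <x^2> should approach 1/beta
\end{verbatim}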

\section{\label{introSec:MD}Molecular Dynamics Simulations}

The main simulation tool used in this research is Molecular Dynamics.
In Molecular Dynamics, the equations of motion for a system are
integrated in order to obtain information about both the positions and
momenta of a system, allowing the calculation of not only
configurational observables, but momentum-dependent ones as well:
diffusion constants, velocity autocorrelations, folding/unfolding
events, etc. Due to the principle of ergodicity,
Sec.~\ref{introSec:ergodic}, the average of these observables over the
time period of the simulation is taken to be the ensemble average
for the system.

The choice between molecular dynamics and Monte Carlo
techniques is normally decided by the observables in which the
researcher is interested. If the observables depend on momenta in
any fashion, then the only choice is molecular dynamics in some form.
However, when the observable is dependent only on the configuration,
then most of the time Monte Carlo techniques will be more efficient.

The research in the second half of this dissertation focuses on the
dynamic properties of phospholipid bilayers,
making molecular dynamics key in the simulation of those properties.

\subsection{\label{introSec:mdAlgorithm}The Molecular Dynamics Algorithm}

To illustrate how the molecular dynamics technique is applied, the
following sections will describe the sequence involved in a
simulation. Sec.~\ref{introSec:mdInit} deals with the initialization
of a simulation. Sec.~\ref{introSec:mdForce} discusses issues
involved with the calculation of the forces.
Sec.~\ref{introSec:mdIntegrate} concludes the algorithm discussion
with the integration of the equations of motion.\cite{Frenkel1996}

\subsection{\label{introSec:mdInit}Initialization}

When selecting the initial configuration for the simulation it is
important to consider what dynamics one is hoping to observe.
Ch.~\ref{chapt:lipid} deals with the formation and equilibrium dynamics of
phospholipid membranes. Therefore in these simulations initial
positions were selected that in some cases dispersed the lipids in
water, and in other cases structured the lipids into preformed
bilayers. Important considerations at this stage of the simulation are:
\begin{itemize}
\item There are no major overlaps of molecules or atoms.
\item Velocities are chosen so as not to give the system a non-zero total linear or angular momentum.
\item It is also sometimes desirable to select the velocities to correctly sample the target temperature.
\end{itemize}

The first point is important due to the amount of potential energy
generated by having two particles too close together. If overlap
occurs, the first evaluation of forces will return numbers so large as
to render the numerical integration of the motion meaningless. The
second consideration keeps the system from drifting or rotating as a
whole. This arises from the fact that most simulations are of systems
in equilibrium in the absence of outside forces. Therefore any net
movement would be unphysical and an artifact of the simulation method
used. The final point addresses the selection of the magnitude of the
initial velocities. For many simulations it is convenient to use
this opportunity to scale the amount of kinetic energy to reflect the
desired thermal distribution of the system. However, it must be noted
that most systems will require further velocity rescaling after the
first few initial simulation steps due to either loss or gain of
kinetic energy from energy stored in potential degrees of freedom.
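
A minimal sketch of such an initialization (in Python with NumPy,
assuming identical particle masses and reduced units with $k_B = 1$;
removal of the total angular momentum is omitted for brevity) might
be:
\begin{verbatim}
import numpy as np

def init_velocities(n, mass, target_T, kB=1.0):
    """Draw velocities from a Maxwell-Boltzmann distribution, remove
    the net momentum, then rescale to the target temperature."""
    v = np.random.normal(0.0, np.sqrt(kB * target_T / mass), (n, 3))
    v -= v.mean(axis=0)                   # zero total linear momentum
    T_now = mass * (v ** 2).sum() / (3.0 * n * kB)
    return v * np.sqrt(target_T / T_now)  # rescale kinetic energy

v = init_velocities(n=1000, mass=1.0, target_T=1.0)
print(v.sum(axis=0))                      # ~0: no net drift
\end{verbatim}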

\subsection{\label{introSec:mdForce}Force Evaluation}

The evaluation of forces is the most computationally expensive portion
of a given molecular dynamics simulation. This is due entirely to the
evaluation of long range forces in a simulation, typically pair-wise.
These forces are most commonly the van der Waals force, and sometimes
Coulombic forces as well. For a pair-wise force, there are $N(N-1)/2$
pairs to be evaluated, where $N$ is the number of particles in the
system. This leads to the calculations scaling as $N^2$, making large
simulations prohibitive in the absence of any computation-saving
techniques.
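
The $N(N-1)/2$ scaling is apparent in a direct implementation. The
following sketch (in Python with NumPy, using a Lennard-Jones pair
potential in reduced units; the configuration is random, not from an
actual simulation) loops over every pair once:
\begin{verbatim}
import numpy as np

def pair_energy(pos, epsilon=1.0, sigma=1.0):
    """Naive O(N^2) loop over all N(N-1)/2 Lennard-Jones pairs."""
    n, energy = len(pos), 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):        # each pair counted once
            r = np.linalg.norm(pos[i] - pos[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy

pos = np.random.rand(100, 3) * 10.0      # 100 random particles
print(pair_energy(pos))
\end{verbatim}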

Another consideration one must resolve is that in a given simulation
a disproportionate number of the particles will feel the effects of
the surface.\cite{allen87:csl} For a cubic system of 1000 particles
arranged in a $10 \times 10 \times 10$ cube, 488 particles will be
exposed to the surface. Unless one is simulating an isolated particle
group in a vacuum, the behavior of the system will be far from the
desired bulk characteristics. To offset this, simulations employ
periodic boundary conditions.\cite{born:1912}

The technique involves the use of an algorithm that replicates the
simulation box on an infinite lattice in Cartesian space. Any given
particle leaving the simulation box on one side will have an image of
itself enter on the opposite side (see Fig.~\ref{introFig:pbc}). In
addition, this ensures that any particle has an image, real or
periodic, within $\text{box}/2$ of any other particle. A discussion of
the method used to calculate the periodic image can be found in
Sec.~\ref{oopseSec:pbc}.
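
A minimal sketch of the wrapping and minimum-image arithmetic for a
cubic box (in Python with NumPy; an illustration only, not the
implementation discussed in Sec.~\ref{oopseSec:pbc}) is:
\begin{verbatim}
import numpy as np

def wrap(pos, box):
    """Wrap coordinates back into the central simulation box."""
    return pos - box * np.floor(pos / box)

def minimum_image(r_ij, box):
    """Return the separation to the nearest image, real or periodic,
    which lies within box/2 along each axis."""
    return r_ij - box * np.round(r_ij / box)

box = 10.0
r_ij = np.array([9.5, -0.2, 4.9])
print(minimum_image(r_ij, box))   # [-0.5, -0.2, 4.9]
\end{verbatim}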

\begin{figure}
\centering
\includegraphics[width=\linewidth]{pbcFig.eps}
\caption[An illustration of periodic boundary conditions]{A 2-D illustration of periodic boundary conditions. As one particle leaves the right of the simulation box, an image of it enters the left.}
\label{introFig:pbc}
\end{figure}

Returning to the topic of the computational scale of the force
evaluation, the use of periodic boundary conditions requires that a
cutoff radius be employed. Using a cutoff radius improves the
efficiency of the force evaluation, as particles farther than a
predetermined distance, $r_{\text{cut}}$, are not included in the
calculation.\cite{Frenkel1996} In a simulation with periodic images,
$r_{\text{cut}}$ has a maximum value of $\text{box}/2$.
Fig.~\ref{introFig:rMax} illustrates how, when using an
$r_{\text{cut}}$ larger than this value, or in the extreme limit of no
$r_{\text{cut}}$ at all, the corners of the simulation box are
unequally weighted due to the lack of particle images in the $x$, $y$,
or $z$ directions past a distance of $\text{box}/2$.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{rCutMaxFig.eps}
\caption[An explanation of $r_{\text{cut}}$]{The yellow atom has all other images wrapped to itself as the center. If $r_{\text{cut}}=\text{box}/2$, then the distribution is uniform (blue atoms). However, when $r_{\text{cut}}>\text{box}/2$ the corners are disproportionately weighted (green atoms) vs the axial directions (shaded regions).}
\label{introFig:rMax}
\end{figure}

With the use of an $r_{\text{cut}}$, however, comes a discontinuity in
the potential energy curve (Fig.~\ref{introFig:shiftPot}). To fix this
discontinuity, one calculates the potential energy at
$r_{\text{cut}}$ and subtracts that value from the potential, causing
the function to go smoothly to zero at the cutoff radius. This
shifted potential ensures conservation of energy when integrating the
Newtonian equations of motion.
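
A sketch of this shift for the Lennard-Jones potential (in Python,
reduced units; illustrative only) makes the construction explicit:
\begin{verbatim}
def lj(r, epsilon=1.0, sigma=1.0):
    """Standard Lennard-Jones potential."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_shifted(r, r_cut, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential shifted so V(r_cut) = 0; pairs
    beyond r_cut contribute nothing."""
    if r >= r_cut:
        return 0.0
    return lj(r, epsilon, sigma) - lj(r_cut, epsilon, sigma)

print(lj_shifted(2.5, r_cut=2.5))   # 0.0: continuous at the cutoff
print(lj_shifted(1.0, r_cut=2.5))
\end{verbatim}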

\begin{figure}
\centering
\includegraphics[width=\linewidth]{shiftedPot.eps}
\caption[Shifting the Lennard-Jones Potential]{The Lennard-Jones potential (blue line) is shifted (red line) to remove the discontinuity at $r_{\text{cut}}$.}
\label{introFig:shiftPot}
\end{figure}

The second main simplification used in this research is the Verlet
neighbor list.\cite{allen87:csl} In the Verlet method, one generates
a list of all neighbor atoms, $j$, surrounding atom $i$ within some
cutoff $r_{\text{list}}$, where $r_{\text{list}}>r_{\text{cut}}$.
This list is created the first time forces are evaluated; on
subsequent force evaluations, pair interactions are only evaluated
for pairs appearing in the neighbor lists. The lists are updated if
any given particle in the system moves farther than
$r_{\text{list}}-r_{\text{cut}}$, giving rise to the possibility that
a particle has left or joined a neighbor list.
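
A sketch of the list construction (in Python with NumPy, reusing the
minimum-image convention above; the list build itself still visits
every pair, so the savings come from reusing the list on subsequent
force evaluations) might be:
\begin{verbatim}
import numpy as np

def build_neighbor_list(pos, box, r_list):
    """For each atom i, list all atoms j whose minimum-image distance
    is within r_list (chosen with r_list > r_cut)."""
    n = len(pos)
    neighbors = [[] for _ in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            r_ij = pos[i] - pos[j]
            r_ij -= box * np.round(r_ij / box)   # minimum image
            if np.linalg.norm(r_ij) < r_list:
                neighbors[i].append(j)
    return neighbors

box = 10.0
pos = np.random.rand(200, 3) * box
nlist = build_neighbor_list(pos, box, r_list=3.0)
\end{verbatim}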

\subsection{\label{introSec:mdIntegrate} Integration of the equations of motion}

A starting point for the discussion of molecular dynamics integrators
is the Verlet algorithm.\cite{Frenkel1996} It begins with a Taylor
expansion of position in time:
\begin{equation}
q(t+\Delta t)= q(t) + v(t)\Delta t + \frac{F(t)}{2m}\Delta t^2 +
\frac{\Delta t^3}{3!}\frac{\partial^3 q(t)}{\partial t^3} +
\mathcal{O}(\Delta t^4)
\label{introEq:verletForward}
\end{equation}
As well as,
\begin{equation}
q(t-\Delta t)= q(t) - v(t)\Delta t + \frac{F(t)}{2m}\Delta t^2 -
\frac{\Delta t^3}{3!}\frac{\partial^3 q(t)}{\partial t^3} +
\mathcal{O}(\Delta t^4)
\label{introEq:verletBack}
\end{equation}
Where $m$ is the mass of the particle, $q(t)$ is the position at time
$t$, $v(t)$ the velocity, and $F(t)$ the force acting on the
particle. Adding together Eq.~\ref{introEq:verletForward} and
Eq.~\ref{introEq:verletBack} results in,
\begin{equation}
q(t+\Delta t)+q(t-\Delta t) =
2q(t) + \frac{F(t)}{m}\Delta t^2 + \mathcal{O}(\Delta t^4)
\label{introEq:verletSum}
\end{equation}
Or equivalently,
\begin{equation}
q(t+\Delta t) \approx
2q(t) - q(t-\Delta t) + \frac{F(t)}{m}\Delta t^2
\label{introEq:verletFinal}
\end{equation}
This estimate of the new positions contains an error on the order of
$\Delta t^4$.

In practice, however, the simulations in this research were integrated
with a velocity reformulation of the Verlet method.\cite{allen87:csl}
\begin{align}
q(t+\Delta t) &= q(t) + v(t)\Delta t + \frac{F(t)}{2m}\Delta t^2 %
\label{introEq:MDvelVerletPos} \\%
%
v(t+\Delta t) &= v(t) + \frac{\Delta t}{2m}[F(t) + F(t+\Delta t)] %
\label{introEq:MDvelVerletVel}
\end{align}
The original Verlet algorithm can be regained by substituting the
velocity back into Eq.~\ref{introEq:MDvelVerletPos}. The Verlet
formulations were chosen in this research because the algorithms have
very little long-term drift in energy conservation. Energy
conservation in a molecular dynamics simulation is of extreme
importance, as it is a measure of how closely one is following the
``true'' trajectory with the finite integration scheme. An exact
solution to the integration will conserve area in phase space, as well
as be reversible in time; that is, the trajectory integrated forward
or backwards will exactly match itself. Having a finite algorithm
that both conserves area in phase space and is time reversible
therefore increases, but does not guarantee, the ``correctness'' of the
integrated trajectory.
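
A minimal sketch of a single velocity Verlet step
(Eqs.~\ref{introEq:MDvelVerletPos} and \ref{introEq:MDvelVerletVel};
in Python with NumPy, using a hypothetical harmonic force, not the
implementation used in this research) might be:
\begin{verbatim}
import numpy as np

def velocity_verlet_step(q, v, F, mass, dt, force):
    """Advance positions with the current force, then finish the
    velocity update with the force at the new positions."""
    q_new = q + v * dt + F * dt * dt / (2.0 * mass)
    F_new = force(q_new)
    v_new = v + (F + F_new) * dt / (2.0 * mass)
    return q_new, v_new, F_new

# Example: harmonic oscillator, force(q) = -q, exact period 2*pi
force = lambda q: -q
q, v = np.array([1.0]), np.array([0.0])
F, dt = force(q), 0.01
for _ in range(1000):
    q, v, F = velocity_verlet_step(q, v, F, 1.0, dt, force)
print(q, v)   # stays close to the exact cos/sin trajectory
\end{verbatim}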

It can be shown\cite{Frenkel1996} that although the Verlet algorithm
does not rigorously preserve the actual Hamiltonian, it does preserve
a pseudo-Hamiltonian which shadows the real one in phase space. This
pseudo-Hamiltonian is provably area-conserving as well as time
reversible. The fact that it shadows the true Hamiltonian in phase
space is acceptable in actual simulations as one is interested in the
ensemble average of the observable being measured. From the ergodic
hypothesis (Sec.~\ref{introSec:ergodic}), it is known that the time
average will match the ensemble average, therefore two similar
trajectories in phase space should give matching statistical averages.

\subsection{\label{introSec:MDfurther}Further Considerations}

In the simulations presented in this research, a few additional
parameters are needed to describe the motions. The simulations
involving water and phospholipids in Ch.~\ref{chapt:lipid} are
required to integrate the equations of motion for dipoles on atoms.
This requires an additional three parameters to be specified for each
dipole atom: $\phi$, $\theta$, and $\psi$. These three angles are
taken to be the Euler angles, where $\phi$ is a rotation about the
$z$-axis, $\theta$ is a rotation about the new $x$-axis, and
$\psi$ is a final rotation about the new $z$-axis (see
Fig.~\ref{introFig:eulerAngles}). This sequence of rotations can be
accumulated into a single $3 \times 3$ matrix, $\mathbf{A}$,
defined as follows:
\begin{equation}
\mathbf{A} =
\begin{bmatrix}
\cos\phi\cos\psi-\sin\phi\cos\theta\sin\psi &%
\sin\phi\cos\psi+\cos\phi\cos\theta\sin\psi &%
\sin\theta\sin\psi \\%
%
-\cos\phi\sin\psi-\sin\phi\cos\theta\cos\psi &%
-\sin\phi\sin\psi+\cos\phi\cos\theta\cos\psi &%
\sin\theta\cos\psi \\%
%
\sin\phi\sin\theta &%
-\cos\phi\sin\theta &%
\cos\theta
\end{bmatrix}
\label{introEq:EulerRotMat}
\end{equation}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{eulerRotFig.eps}
\caption[Euler rotation of Cartesian coordinates]{The rotation scheme for Euler angles. First is a rotation of $\phi$ about the $z$ axis (blue rotation). Next is a rotation of $\theta$ about the new $x^\prime$ axis (green rotation). Lastly is a final rotation of $\psi$ about the new $z^\prime$ axis (red rotation).}
\label{introFig:eulerAngles}
\end{figure}

The equations of motion for the Euler angles can be written down
as\cite{allen87:csl}
\begin{align}
\dot{\phi} &= -\omega^s_x \frac{\sin\phi\cos\theta}{\sin\theta} +
\omega^s_y \frac{\cos\phi\cos\theta}{\sin\theta} +
\omega^s_z
\label{introEq:MDeulerPhi} \\%
%
\dot{\theta} &= \omega^s_x \cos\phi + \omega^s_y \sin\phi
\label{introEq:MDeulerTheta} \\%
%
\dot{\psi} &= \omega^s_x \frac{\sin\phi}{\sin\theta} -
\omega^s_y \frac{\cos\phi}{\sin\theta}
\label{introEq:MDeulerPsi}
\end{align}
Where $\omega^s_i$ is the angular velocity in the lab space frame
along Cartesian coordinate $i$. However, a difficulty arises when
attempting to integrate Eq.~\ref{introEq:MDeulerPhi} and
Eq.~\ref{introEq:MDeulerPsi}. The $\frac{1}{\sin \theta}$ terms present
in both equations mean there is a non-physical instability present when
$\theta$ is 0 or $\pi$. To correct for this, the simulations integrate
the rotation matrix, $\mathbf{A}$, directly, thus avoiding the
instability. This method was proposed by Dullweber
\emph{et al.},\cite{Dullweber1997} and is presented in
Sec.~\ref{introSec:MDsymplecticRot}.

\subsection{\label{introSec:MDliouville}Liouville Propagator}

Before discussing the integration of the rotation matrix, it is
necessary to understand the construction of a ``good'' integration
scheme. It has been previously
discussed (Sec.~\ref{introSec:mdIntegrate}) how it is desirable for an
integrator to be symplectic and time reversible. The following is an
outline of the Trotter factorization of the Liouville propagator as a
scheme for generating symplectic integrators.\cite{Tuckerman92}

For a system with $f$ degrees of freedom the Liouville operator can be
defined as,
\begin{equation}
iL=\sum^f_{j=1} \biggl [\dot{q}_j\frac{\partial}{\partial q_j} +
F_j\frac{\partial}{\partial p_j} \biggr ]
\label{introEq:LiouvilleOperator}
\end{equation}
Here, $q_j$ and $p_j$ are the position and conjugate momenta of a
degree of freedom, and $F_j$ is the force on that degree of freedom.
$\Gamma$ is defined as the set of all positions and conjugate momenta,
$\{q_j,p_j\}$, and the propagator, $U(t)$, is defined as
\begin{equation}
U(t) = e^{iLt}
\label{introEq:Lpropagator}
\end{equation}
This allows the specification of $\Gamma$ at any time $t$ as
\begin{equation}
\Gamma(t) = U(t)\Gamma(0)
\label{introEq:Lp2}
\end{equation}
It is important to note that $U(t)$ is a unitary operator, meaning
\begin{equation}
U(-t)=U^{-1}(t)
\label{introEq:Lp3}
\end{equation}

Decomposing $L$ into two parts, $iL_1$ and $iL_2$, one can use the
Trotter theorem to yield
\begin{align}
e^{iLt} &= e^{i(L_1 + L_2)t} \notag \\%
%
&= \biggl [ e^{i(L_1 +L_2)\frac{t}{P}} \biggr]^P \notag \\%
%
&= \biggl [ e^{iL_1\frac{\Delta t}{2}}\, e^{iL_2\Delta t}\,
e^{iL_1\frac{\Delta t}{2}} \biggr ]^P +
\mathcal{O}\biggl (\frac{t^3}{P^2} \biggr ) \label{introEq:Lp4}
\end{align}
Where $\Delta t = t/P$.
With this, a discrete time operator $G(\Delta t)$ can be defined:
\begin{align}
G(\Delta t) &= e^{iL_1\frac{\Delta t}{2}}\, e^{iL_2\Delta t}\,
e^{iL_1\frac{\Delta t}{2}} \notag \\%
%
&= U_1 \biggl ( \frac{\Delta t}{2} \biggr )\, U_2 ( \Delta t )\,
U_1 \biggl ( \frac{\Delta t}{2} \biggr )
\label{introEq:Lp5}
\end{align}
Because $U_1(t)$ and $U_2(t)$ are unitary, $G(\Delta t)$ is also
unitary, meaning an integrator based on this factorization will be
reversible in time.

As an example, consider the following decomposition of $L$:
\begin{align}
iL_1 &= \dot{q}\frac{\partial}{\partial q}%
\label{introEq:Lp6a} \\%
%
iL_2 &= F(q)\frac{\partial}{\partial p}%
\label{introEq:Lp6b}
\end{align}
This leads to the propagator $G( \Delta t )$,
\begin{equation}
G(\Delta t) = e^{\frac{\Delta t}{2} F(q)\frac{\partial}{\partial p}} \,
e^{\Delta t\,\dot{q}\frac{\partial}{\partial q}} \,
e^{\frac{\Delta t}{2} F(q)\frac{\partial}{\partial p}}
\label{introEq:Lp7}
\end{equation}
Operating with $G(\Delta t)$ on $\Gamma(0)$, and utilizing the operator
property
\begin{equation}
e^{c\frac{\partial}{\partial x}}\, f(x) = f(x+c)
\label{introEq:Lp8}
\end{equation}
where $c$ is independent of $x$, one obtains the following:
\begin{align}
\dot{q}\biggl (\frac{\Delta t}{2}\biggr ) &=
\dot{q}(0) + \frac{\Delta t}{2m}\, F[q(0)] \label{introEq:Lp9a}\\%
%
q(\Delta t) &= q(0) + \Delta t\, \dot{q}\biggl (\frac{\Delta t}{2}\biggr )%
\label{introEq:Lp9b}\\%
%
\dot{q}(\Delta t) &= \dot{q}\biggl (\frac{\Delta t}{2}\biggr ) +
\frac{\Delta t}{2m}\, F[q(\Delta t)] \label{introEq:Lp9c}
\end{align}
Or written another way,
\begin{align}
q(\Delta t) &= q(0) + \dot{q}(0)\Delta t +
\frac{F[q(0)]}{m}\frac{\Delta t^2}{2} %
\label{introEq:Lp10a} \\%
%
\dot{q}(\Delta t) &= \dot{q}(0) + \frac{\Delta t}{2m}
\biggl [F[q(0)] + F[q(\Delta t)] \biggr] %
\label{introEq:Lp10b}
\end{align}
This is the velocity Verlet formulation presented in
Sec.~\ref{introSec:mdIntegrate}. Because this integration scheme is
composed of unitary propagators, it is symplectic, and therefore area
preserving in phase space. From the preceding factorization, one can
see that the integration of the equations of motion would follow:
\begin{enumerate}
\item Calculate the velocities at the half step, $\frac{\Delta t}{2}$, from the forces calculated at the initial position.

\item Use the half step velocities to move positions one whole step, $\Delta t$.

\item Evaluate the forces at the new positions, $\mathbf{r}(\Delta t)$, and use the new forces to complete the velocity move.

\item Repeat from step 1 with the new positions, velocities, and forces assuming the roles of the initial values.
\end{enumerate}

\subsection{\label{introSec:MDsymplecticRot} Symplectic Propagation of the Rotation Matrix}

Based on the factorization from the previous section,
Dullweber \emph{et al.}\cite{Dullweber1997} proposed a scheme for the
symplectic propagation of the rotation matrix, $\mathbf{A}$, as an
alternative method for the integration of orientational degrees of
freedom. The method starts with a straightforward splitting of the
Liouville operator:
\begin{align}
iL_{\text{pos}} &= \dot{q}\frac{\partial}{\partial q} +
\mathbf{\dot{A}}\frac{\partial}{\partial \mathbf{A}}
\label{introEq:SR1a} \\%
%
iL_F &= F(q)\frac{\partial}{\partial p}
\label{introEq:SR1b} \\%
%
iL_{\tau} &= \tau(\mathbf{A})\frac{\partial}{\partial \pi}
\label{introEq:SR1c}
\end{align}
Where $\tau(\mathbf{A})$ is the torque of the system
due to the configuration, and $\pi$ is the conjugate
angular momentum of the system. The propagator, $G(\Delta t)$, becomes
\begin{equation}
G(\Delta t) = e^{\frac{\Delta t}{2} F(q)\frac{\partial}{\partial p}} \,
e^{\frac{\Delta t}{2} \tau(\mathbf{A})\frac{\partial}{\partial \pi}} \,
e^{\Delta t\,iL_{\text{pos}}} \,
e^{\frac{\Delta t}{2} \tau(\mathbf{A})\frac{\partial}{\partial \pi}} \,
e^{\frac{\Delta t}{2} F(q)\frac{\partial}{\partial p}}
\label{introEq:SR2}
\end{equation}
Propagation of the linear and angular momenta follows as in the Verlet
scheme. The propagation of positions also follows the Verlet scheme
with the addition of a further symplectic splitting of the rotation
matrix propagation, $\mathcal{U}_{\text{rot}}(\Delta t)$, within
$U_{\text{pos}}(\Delta t)$.
\begin{equation}
\mathcal{U}_{\text{rot}}(\Delta t) =
\mathcal{U}_x \biggl(\frac{\Delta t}{2}\biggr)\,
\mathcal{U}_y \biggl(\frac{\Delta t}{2}\biggr)\,
\mathcal{U}_z (\Delta t)\,
\mathcal{U}_y \biggl(\frac{\Delta t}{2}\biggr)\,
\mathcal{U}_x \biggl(\frac{\Delta t}{2}\biggr)
\label{introEq:SR3}
\end{equation}
Where $\mathcal{U}_j$ is a unitary rotation of $\mathbf{A}$ and
$\pi$ about each axis $j$. As all propagations are now
unitary and symplectic, the entire integration scheme is also
symplectic and time reversible.

\section{\label{introSec:layout}Dissertation Layout}

This dissertation is divided as follows: Ch.~\ref{chapt:RSA}
presents the random sequential adsorption simulations of related
phthalocyanines on a gold (111) surface. Ch.~\ref{chapt:OOPSE}
is about the writing of the molecular dynamics simulation package
{\sc oopse}. Ch.~\ref{chapt:lipid} describes the simulations of
phospholipid bilayers using a mesoscale model. Lastly,
Ch.~\ref{chapt:conclusion} concludes this dissertation with a
summary of all results. The chapters are arranged in chronological
order, and reflect the progression of techniques I employed during my
research.

The chapter concerning random sequential adsorption simulations is a
study in applying Statistical Mechanics simulation techniques in order
to obtain a simple model capable of explaining the results. My
advisor, Dr. Gezelter, and I were approached by a colleague,
Dr. Lieberman, about possible explanations for the partial coverage of a
gold surface by a particular compound of hers. We suggested it might
be due to the statistical packing fraction of disks on a plane, and
set about simulating this system. As the events in our model were
not dynamic in nature, a Monte Carlo method was employed. Here, if a
molecule landed on the surface without overlapping another, then its
landing was accepted. However, if there was overlap, the landing was
rejected and a new random landing location was chosen. This defined
our acceptance rules and allowed us to construct a Markov chain whose
limiting distribution was the surface coverage in which we were
interested.

The following chapter, about the simulation package {\sc oopse},
describes in detail the large body of scientific code that had to be
written in order to study phospholipid bilayers. Although there are
pre-existing molecular dynamics simulation packages available, none
were capable of implementing the models we were developing. {\sc oopse}
is a unique package capable of not only integrating the equations of
motion in Cartesian space, but also of integrating the
rotational motion of rigid bodies and dipoles. Add to this the
ability to perform calculations across parallel processors and a
flexible script syntax for creating systems, and {\sc oopse} becomes a
very powerful scientific instrument for the exploration of our model.

This brings us to Ch.~\ref{chapt:lipid}. Using {\sc oopse}, I have been
able to parameterize a mesoscale model for phospholipid simulations.
This model retains information about solvent ordering around the
bilayer, as well as information regarding the interaction of the
phospholipid head groups' dipoles with each other and the surrounding
solvent. These simulations give us insight into the dynamic events
that lead to the formation of phospholipid bilayers, as well as
provide the foundation for future exploration of bilayer phase
behavior with this model.

This leads into the last chapter, where I discuss future directions
for both {\sc oopse} and this mesoscale model. Additionally, I will
give a summary of results for this dissertation.
