\chapter{\label{chapt:intro}Introduction and Theoretical Background}



\section{\label{introSec:theory}Theoretical Background}

The techniques used in the course of this research fall under the two
main classes of molecular simulation: Molecular Dynamics and Monte
Carlo. Molecular Dynamics simulations integrate the equations of motion
for a given system of particles, allowing the researcher to gain
insight into the time dependent evolution of a system. Diffusion
phenomena are readily studied with this simulation technique, making
Molecular Dynamics the main simulation technique used in this
research. Other aspects of the research fall under the Monte Carlo
class of simulations. In Monte Carlo, the configuration space
available to the collection of particles is sampled stochastically,
or randomly. Each configuration is chosen with a given probability
based on the Maxwell-Boltzmann distribution. These types of simulations
are best used to probe properties of a system that are dependent
only on the state of the system. Structural information about a system
is most readily obtained through these types of methods.

Although the two techniques employed seem dissimilar, they are both
linked by the overarching principles of Statistical
Thermodynamics. Statistical Thermodynamics governs the behavior of
both classes of simulations and dictates what each method can and
cannot do. When investigating a system, one must first analyze what
thermodynamic properties of the system are being probed, then choose
which method best suits that objective.

\subsection{\label{introSec:statThermo}Statistical Thermodynamics}

ergodic hypothesis

ensemble averages

\subsection{\label{introSec:monteCarlo}Monte Carlo Simulations}

The Monte Carlo method was developed by Metropolis and Ulam for their
work on fissionable material.\cite{metropolis:1949} The method is so
named because of its heavy use of random numbers in its
solution.\cite{allen87:csl} The Monte Carlo method allows for the
solution of integrals through the stochastic sampling of the values
within the integral. In the simplest case, the evaluation of an
integral would follow a brute force method of
sampling.\cite{Frenkel1996} Consider the following single dimensional
integral:
\begin{equation}
I = \int^b_a f(x)\,dx
\label{eq:MCex1}
\end{equation}
The equation can be recast as:
\begin{equation}
I = (b-a)\langle f(x) \rangle
\label{eq:MCex2}
\end{equation}
Where $\langle f(x) \rangle$ is the unweighted average of $f(x)$ over
the interval $[a,b]$. The integral could then be evaluated by
randomly choosing points along the interval $[a,b]$ and calculating
the value of $f(x)$ at each point. The accumulated average would then
approach $I$ in the limit where the number of trials is infinitely
large.
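
As a concrete illustration, the following is a minimal sketch of this
brute force sampling in Python; the integrand and the interval are
arbitrary choices:
\begin{verbatim}
import math
import random

def mc_integrate(f, a, b, n_trials=100000):
    # Accumulate the unweighted average of f over random points
    # in [a,b], then scale by the interval length.
    total = 0.0
    for _ in range(n_trials):
        total += f(random.uniform(a, b))
    return (b - a) * total / n_trials

# Example: the integral of sin(x) over [0, pi] is exactly 2.
print(mc_integrate(math.sin, 0.0, math.pi))
\end{verbatim}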

However, in Statistical Mechanics, one is typically interested in
integrals of the form:
\begin{equation}
\langle A \rangle = \frac{\int d^N \mathbf{r}~A(\mathbf{r}^N)%
	e^{-\beta V(\mathbf{r}^N)}}%
	{\int d^N \mathbf{r}~e^{-\beta V(\mathbf{r}^N)}}
\label{eq:mcEnsAvg}
\end{equation}
Where $\mathbf{r}^N$ stands for the coordinates of all $N$ particles
and $A$ is some observable that is dependent only on
position. $\langle A \rangle$ is the ensemble average of $A$ as
presented in Sec.~\ref{introSec:statThermo}. Because $A$ is
independent of momentum, the momenta contribution of the integral can
be factored out, leaving the configurational integral. Application of
the brute force method to this system would be highly inefficient.
Due to the Boltzmann weighting of this integral, most random
configurations will have a near-zero contribution to the ensemble
average. This is where importance sampling comes into
play.\cite{allen87:csl}

Importance sampling is a method where one selects a distribution from
which the random configurations are chosen in order to more
efficiently calculate the integral.\cite{Frenkel1996} Consider again
Eq.~\ref{eq:MCex1} rewritten to be:
\begin{equation}
I = \int^b_a \frac{f(x)}{\rho(x)} \rho(x) dx
\label{introEq:Importance1}
\end{equation}
Where $\rho(x)$ is an arbitrary probability distribution in $x$. If
one conducts $\tau$ trials selecting a random number, $\zeta_\tau$,
from the distribution $\rho(x)$ on the interval $[a,b]$, then
Eq.~\ref{introEq:Importance1} becomes
\begin{equation}
I = \biggl \langle \frac{f(x)}{\rho(x)} \biggr \rangle_{\text{trials}}
\label{introEq:Importance2}
\end{equation}
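
As a brief numeric sketch of this estimator, consider the illustrative
choices $f(x) = x^2$ and $\rho(x) = 2x$ on $[0,1]$, where $\rho$ is
sampled by inverse transform:
\begin{verbatim}
import random

def f(x):
    return x * x       # integrand; exact integral on [0,1] is 1/3

def rho(x):
    return 2.0 * x     # normalized sampling distribution on [0,1]

n_trials = 100000
total = 0.0
for _ in range(n_trials):
    x = random.random() ** 0.5   # inverse transform: density 2x
    total += f(x) / rho(x)       # accumulate f/rho over the trials
print(total / n_trials)          # converges to 1/3
\end{verbatim}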
Looking at Eq.~\ref{eq:mcEnsAvg}, and realizing
\begin{equation}
\rho_{kT}(\mathbf{r}^N) =
	\frac{e^{-\beta V(\mathbf{r}^N)}}
	{\int d^N \mathbf{r}~e^{-\beta V(\mathbf{r}^N)}}
\label{introEq:MCboltzman}
\end{equation}
where $\rho_{kT}$ is the Boltzmann distribution, the ensemble average
can be rewritten as
\begin{equation}
\langle A \rangle = \int d^N \mathbf{r}~A(\mathbf{r}^N)
	\rho_{kT}(\mathbf{r}^N)
\label{introEq:Importance3}
\end{equation}
Applying Eq.~\ref{introEq:Importance1} one obtains
\begin{equation}
\langle A \rangle = \biggl \langle
	\frac{ A \rho_{kT}(\mathbf{r}^N) }
	{\rho(\mathbf{r}^N)} \biggr \rangle_{\text{trials}}
\label{introEq:Importance4}
\end{equation}
By selecting $\rho(\mathbf{r}^N)$ to be $\rho_{kT}(\mathbf{r}^N)$,
Eq.~\ref{introEq:Importance4} becomes
\begin{equation}
\langle A \rangle = \langle A(\mathbf{r}^N) \rangle_{\text{trials}}
\label{introEq:Importance5}
\end{equation}
The difficulty is selecting points $\mathbf{r}^N$ such that they are
sampled from the distribution $\rho_{kT}(\mathbf{r}^N)$. A solution
was proposed by Metropolis et al.\cite{metropolis:1953} which involved
the use of a Markov chain whose limiting distribution was
$\rho_{kT}(\mathbf{r}^N)$.

\subsubsection{\label{introSec:markovChains}Markov Chains}

A Markov chain is a chain of states satisfying the following
conditions:\cite{leach01:mm}
\begin{enumerate}
\item The outcome of each trial depends only on the outcome of the previous trial.
\item Each trial belongs to a finite set of outcomes called the state space.
\end{enumerate}
Given two configurations, $\mathbf{r}^N_m$ and $\mathbf{r}^N_n$,
$\rho_m$ and $\rho_n$ are the probabilities of being in state
$\mathbf{r}^N_m$ and $\mathbf{r}^N_n$ respectively. Further, the two
states are linked by a transition probability, $\pi_{mn}$, which is the
probability of going from state $m$ to state $n$.

\newcommand{\accMe}{\operatorname{acc}}

The transition probability is given by the following:
\begin{equation}
\pi_{mn} = \alpha_{mn} \times \accMe(m \rightarrow n)
\label{introEq:MCpi}
\end{equation}
Where $\alpha_{mn}$ is the probability of attempting the move $m
\rightarrow n$, and $\accMe$ is the probability of accepting the move
$m \rightarrow n$. Defining a probability vector,
$\boldsymbol{\rho}$, such that
\begin{equation}
\boldsymbol{\rho} = \{\rho_1, \rho_2, \ldots \rho_m, \rho_n,
	\ldots \rho_N \}
\label{introEq:MCrhoVector}
\end{equation}
a transition matrix $\boldsymbol{\Pi}$ can be defined,
whose elements are $\pi_{mn}$, for each given transition. The
limiting distribution of the Markov chain can then be found by
applying the transition matrix an infinite number of times to the
distribution vector.
\begin{equation}
\boldsymbol{\rho}_{\text{limit}} =
	\lim_{N \rightarrow \infty} \boldsymbol{\rho}_{\text{initial}}
	\boldsymbol{\Pi}^N
\label{introEq:MCmarkovLimit}
\end{equation}
The limiting distribution of the chain is independent of the starting
distribution, and successive applications of the transition matrix
will only yield the limiting distribution again.
\begin{equation}
\boldsymbol{\rho}_{\text{limit}} = \boldsymbol{\rho}_{\text{limit}}
	\boldsymbol{\Pi}
\label{introEq:MCmarkovEquil}
\end{equation}
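
This limiting behavior is easily demonstrated numerically; the sketch
below applies a hypothetical two-state transition matrix repeatedly to
an arbitrary starting distribution:
\begin{verbatim}
import numpy as np

pi = np.array([[0.9, 0.1],     # pi[m][n], the m -> n probability
               [0.3, 0.7]])
rho = np.array([1.0, 0.0])     # arbitrary initial distribution

for _ in range(100):
    rho = rho @ pi             # successive applications of Pi
print(rho)                     # limiting distribution: [0.75, 0.25]
\end{verbatim}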

\subsubsection{\label{introSec:metropolisMethod}The Metropolis Method}

In the Metropolis method\cite{metropolis:1953}
Eq.~\ref{introEq:MCmarkovEquil} is solved such that
$\boldsymbol{\rho}_{\text{limit}}$ matches the Boltzmann distribution
of states. The method accomplishes this by imposing the strong
condition of microscopic reversibility on the equilibrium
distribution. This means that, at equilibrium, the probability of going
from $m$ to $n$ is the same as going from $n$ to $m$.
\begin{equation}
\rho_m\pi_{mn} = \rho_n\pi_{nm}
\label{introEq:MCmicroReverse}
\end{equation}
Further, $\boldsymbol{\alpha}$ is chosen to be a symmetric matrix in
the Metropolis method. Using Eq.~\ref{introEq:MCpi},
Eq.~\ref{introEq:MCmicroReverse} becomes
\begin{equation}
\frac{\accMe(m \rightarrow n)}{\accMe(n \rightarrow m)} =
	\frac{\rho_n}{\rho_m}
\label{introEq:MCmicro2}
\end{equation}
For a Boltzmann limiting distribution,
\begin{equation}
\frac{\rho_n}{\rho_m} = e^{-\beta[\mathcal{U}(n) - \mathcal{U}(m)]}
	= e^{-\beta \Delta \mathcal{U}}
\label{introEq:MCmicro3}
\end{equation}
This allows the following set of acceptance rules to be defined:
\begin{equation}
\accMe(m \rightarrow n) =
	\begin{cases}
	1 & \Delta \mathcal{U} \leq 0 \\
	e^{-\beta \Delta \mathcal{U}} & \Delta \mathcal{U} > 0
	\end{cases}
\label{introEq:mcAccept}
\end{equation}

Using the acceptance criteria from Eq.~\ref{introEq:mcAccept}, the
Metropolis method proceeds as follows:
\begin{itemize}
\item Generate an initial configuration $\mathbf{r}^N$ which has some finite probability in $\rho_{kT}(\mathbf{r}^N)$.
\item Modify $\mathbf{r}^N$ to generate configuration $\mathbf{r}^{N\prime}$.
\item If the new configuration lowers the energy of the system, accept the move with unit probability ($\mathbf{r}^{N\prime}$ becomes $\mathbf{r}^N$). Otherwise accept with probability $e^{-\beta \Delta \mathcal{U}}$.
\item Accumulate the average for the configurational observable of interest.
\item Repeat from step 2 until the average converges.
\end{itemize}
One important note is that the average is accumulated whether the move
is accepted or not; this ensures proper weighting of the average.
From Eq.~\ref{introEq:Importance5} it becomes clear that the accumulated
averages are the ensemble averages, as this method ensures that the
limiting distribution is the Boltzmann distribution.
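
A minimal sketch of this procedure for a single degree of freedom in a
harmonic potential, $\mathcal{U}(x) = x^2/2$ (an illustrative choice,
with $\beta$ and the maximum displacement as arbitrary parameters),
follows:
\begin{verbatim}
import math
import random

beta, max_disp = 1.0, 0.5
x, total, n_trials = 0.0, 0.0, 100000

def U(x):
    return 0.5 * x * x

for _ in range(n_trials):
    x_new = x + random.uniform(-max_disp, max_disp)  # symmetric alpha
    dU = U(x_new) - U(x)
    if dU <= 0.0 or random.random() < math.exp(-beta * dU):
        x = x_new              # accept the move
    total += x * x             # accumulate whether accepted or not
print(total / n_trials)        # <x^2> approaches kT = 1/beta
\end{verbatim}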

\subsection{\label{introSec:MD}Molecular Dynamics Simulations}

The main simulation tool used in this research is Molecular Dynamics.
In Molecular Dynamics, the equations of motion for a system are
integrated in order to obtain information about both the positions and
momenta of a system, allowing the calculation of not only
configurational observables, but momenta-dependent ones as well:
diffusion constants, velocity autocorrelations, folding/unfolding
events, etc. Due to the principle of ergodicity
(Sec.~\ref{introSec:statThermo}), the averages of these observables
over the time period of the simulation are taken to be the ensemble
averages for the system.

The choice of when to use Molecular Dynamics over Monte Carlo
techniques is normally decided by the observables in which the
researcher is interested. If the observables depend on momenta in
any fashion, then the only choice is Molecular Dynamics in some form.
However, when the observable is dependent only on the configuration,
then Monte Carlo techniques will most often be more efficient.

The focus of research in the second half of this dissertation is
centered around the dynamic properties of phospholipid bilayers,
making Molecular Dynamics key in the simulation of those properties.

\subsubsection{Molecular Dynamics Algorithm}

To illustrate how the molecular dynamics technique is applied, the
following sections will describe the sequence involved in a
simulation. Sec.~\ref{fix} deals with the initialization of a
simulation. Sec.~\ref{fix} discusses issues involved with the
calculation of the forces. Sec.~\ref{fix} concludes the algorithm
discussion with the integration of the equations of motion. \cite{fix}

\subsubsection{Initialization}

When selecting the initial configuration for the simulation it is
important to consider what dynamics one is hoping to observe.
Ch.~\ref{fix} deals with the formation and equilibrium dynamics of
phospholipid membranes. Therefore in these simulations initial
positions were selected that in some cases dispersed the lipids in
water, and in other cases structured the lipids into preformed
bilayers. Important considerations at this stage of the simulation are:
\begin{itemize}
\item There are no major overlaps of molecular or atomic orbitals.
\item Velocities are chosen in such a way as to not give the system a non-zero total momentum or angular momentum.
\item It is also sometimes desirable to select the velocities to correctly sample the target temperature.
\end{itemize}

The first point is important due to the amount of potential energy
generated by having two particles too close together. If overlap
occurs, the first evaluation of forces will return numbers so large as
to render the numerical integration of the motion meaningless. The
second consideration keeps the system from drifting or rotating as a
whole. This arises from the fact that most simulations are of systems
in equilibrium in the absence of outside forces. Therefore any net
movement would be unphysical and an artifact of the simulation method
used. The final point addresses the selection of the magnitude of the
initial velocities. For many simulations it is convenient to use
this opportunity to scale the amount of kinetic energy to reflect the
desired thermal distribution of the system. However, it must be noted
that most systems will require further velocity rescaling after the
first few initial simulation steps due to either loss or gain of
kinetic energy from energy stored in potential degrees of freedom.
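
A sketch of an initialization routine reflecting the second and third
considerations is given below (reduced units with $k_B = 1$ and a
single particle mass are assumptions; removal of the net angular
momentum is omitted for brevity):
\begin{verbatim}
import numpy as np

def init_velocities(n_atoms, mass, target_T, kB=1.0):
    # Draw Maxwell-Boltzmann velocities at the target temperature.
    sigma = np.sqrt(kB * target_T / mass)
    v = np.random.normal(0.0, sigma, (n_atoms, 3))
    v -= v.mean(axis=0)                  # zero the total momentum
    ke = 0.5 * mass * (v ** 2).sum()     # kinetic energy after shift
    ke_target = 1.5 * n_atoms * kB * target_T
    return v * np.sqrt(ke_target / ke)   # rescale to the target T
\end{verbatim}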

\subsubsection{Force Evaluation}

The evaluation of forces is the most computationally expensive portion
of a given molecular dynamics simulation. This is due entirely to the
evaluation of long range forces in a simulation, typically pair-wise.
These forces are most commonly the van der Waals force, and sometimes
Coulombic forces as well. For a pair-wise force, there are
$\frac{n(n-1)}{2}$ pairs to be evaluated, where $n$ is the number of
particles in the system. This leads to the calculations scaling as
$\mathcal{O}(n^2)$, making large simulations prohibitive in the
absence of any computation saving techniques.
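
The quadratic scaling comes directly from the double loop over unique
pairs, as in the sketch below (force_between is a hypothetical pair
force routine returning the force on particle $i$ due to $j$):
\begin{verbatim}
import numpy as np

def pair_forces(pos, force_between):
    # pos is an (n, 3) array; the loop visits all n(n-1)/2 pairs.
    n = len(pos)
    forces = np.zeros_like(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            fij = force_between(pos[i], pos[j])
            forces[i] += fij   # Newton's third law gives the
            forces[j] -= fij   # reaction force for free
    return forces
\end{verbatim}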

Another consideration one must resolve is that in a given simulation,
a disproportionate number of the particles will feel the effects of
the surface. \cite{fix} For a cubic system of 1000 particles arranged
in a $10 \times 10 \times 10$ cube, 488 particles will be exposed to
the surface. Unless one is simulating an isolated particle group in a
vacuum, the behavior of the system will be far from the desired bulk
characteristics. To offset this, simulations employ the use of
periodic boundary images. \cite{fix}

The technique involves the use of an algorithm that replicates the
simulation box on an infinite lattice in Cartesian space. Any given
particle leaving the simulation box on one side will have an image of
itself enter on the opposite side (see Fig.~\ref{fix}). For a cubic
box of side $L$, the images of a particle at position $\mathbf{r}$ are
located at
\begin{equation}
\mathbf{r}_{\text{image}} = \mathbf{r} + L(n_x, n_y, n_z)
\label{introEq:periodicImage}
\end{equation}
where $n_x$, $n_y$, and $n_z$ are integers. In addition, this ensures
that any given particle pair has an image, real or periodic, whose
separation along each axis is within $\frac{L}{2}$. A discussion of
the method used to calculate the periodic image can be found in
Sec.\ref{fix}.
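
A sketch of the wrapping and minimum image operations for a cubic box
of side $L$ follows (the actual method used in this research is the
one discussed in the referenced section):
\begin{verbatim}
import numpy as np

def wrap(r, L):
    # Translate a position back into the primary box [0, L).
    return r - L * np.floor(r / L)

def minimum_image(dr, L):
    # Replace a pair separation vector with that of the nearest
    # image, so each component lies within [-L/2, L/2].
    return dr - L * np.round(dr / L)
\end{verbatim}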

Returning to the topic of the computational scale of the force
evaluation, the use of periodic boundary conditions requires that a
cutoff radius be employed. Using a cutoff radius improves the
efficiency of the force evaluation, as particles farther than a
predetermined distance, $r_{\text{cut}}$, are not included in the
calculation. \cite{fix} In a simulation with periodic images,
$r_{\text{cut}}$ has a maximum value of $\frac{L}{2}$. Fig.~\ref{fix}
illustrates how using an $r_{\text{cut}}$ larger than this value, or
in the extreme limit of no cutoff at all, causes the corners of the
simulation box to be unequally weighted due to the lack of particle
images in the $x$, $y$, or $z$ directions past a distance of
$\frac{L}{2}$.

With the use of an $r_{\text{cut}}$, however, comes a discontinuity in
the potential energy curve (Fig.~\ref{fix}). To remove this
discontinuity, one calculates the potential energy at $r_{\text{cut}}$
and subtracts that value from the potential everywhere it is
evaluated. This causes the function to go smoothly to zero at the
cutoff radius, and helps ensure conservation of energy when
integrating the Newtonian equations of motion.
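
As a sketch, the shifted form of a Lennard-Jones pair potential (an
illustrative choice, in reduced units with $\sigma = \epsilon = 1$)
would read:
\begin{verbatim}
def lj(r):
    # Lennard-Jones potential in reduced units.
    return 4.0 * (r ** -12 - r ** -6)

def lj_shifted(r, r_cut):
    if r >= r_cut:
        return 0.0
    return lj(r) - lj(r_cut)   # shifted to zero at r_cut
\end{verbatim}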

The second main simplification used in this research is the Verlet
neighbor list. \cite{allen87:csl} In the Verlet method, one generates
a list of all neighbor atoms, $j$, surrounding atom $i$ within some
cutoff $r_{\text{list}}$, where $r_{\text{list}}>r_{\text{cut}}$.
This list is created the first time forces are evaluated; on
subsequent force evaluations, pair interactions are only calculated
from the neighbor lists. The lists are updated if any given particle
in the system moves farther than $r_{\text{list}}-r_{\text{cut}}$,
giving rise to the possibility that a particle has left or joined a
neighbor list.
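
A sketch of the list construction, together with a conservative
variant of the update test (each particle allowed to move half the
skin depth $r_{\text{list}}-r_{\text{cut}}$ before a rebuild), is
given below; periodic imaging is omitted for brevity:
\begin{verbatim}
import numpy as np

def build_neighbor_list(pos, r_list):
    # Record all unique pairs within r_list of each other.
    n = len(pos)
    return [(i, j) for i in range(n - 1) for j in range(i + 1, n)
            if np.linalg.norm(pos[i] - pos[j]) < r_list]

def needs_rebuild(pos, pos_at_build, r_list, r_cut):
    # Rebuild once any particle has moved half the skin depth.
    moved = np.linalg.norm(pos - pos_at_build, axis=1).max()
    return moved > 0.5 * (r_list - r_cut)
\end{verbatim}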

\subsection{\label{introSec:MDintegrate} Integration of the Equations of Motion}

A starting point for the discussion of molecular dynamics integrators
is the Verlet algorithm. \cite{Frenkel1996} It begins with a Taylor
expansion of position in time:
\begin{equation}
\mathbf{r}(t+\Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\Delta t +
	\frac{\mathbf{f}(t)}{2m}\Delta t^2 +
	\frac{\Delta t^3}{3!}\dddot{\mathbf{r}}(t) +
	\mathcal{O}(\Delta t^4)
\label{introEq:verletForward}
\end{equation}
As well as,
\begin{equation}
\mathbf{r}(t-\Delta t) = \mathbf{r}(t) - \mathbf{v}(t)\Delta t +
	\frac{\mathbf{f}(t)}{2m}\Delta t^2 -
	\frac{\Delta t^3}{3!}\dddot{\mathbf{r}}(t) +
	\mathcal{O}(\Delta t^4)
\label{introEq:verletBack}
\end{equation}
Adding together Eq.~\ref{introEq:verletForward} and
Eq.~\ref{introEq:verletBack} results in,
\begin{equation}
\mathbf{r}(t+\Delta t) + \mathbf{r}(t-\Delta t) =
	2\mathbf{r}(t) + \frac{\mathbf{f}(t)}{m}\Delta t^2 +
	\mathcal{O}(\Delta t^4)
\label{introEq:verletSum}
\end{equation}
Or equivalently,
\begin{equation}
\mathbf{r}(t+\Delta t) \approx
	2\mathbf{r}(t) - \mathbf{r}(t-\Delta t) +
	\frac{\mathbf{f}(t)}{m}\Delta t^2
\label{introEq:verletFinal}
\end{equation}
Which contains an error in the estimate of the new positions on the
order of $\Delta t^4$.

In practice, however, the simulations in this research were integrated
with a velocity reformulation of the Verlet method. \cite{allen87:csl}
\begin{equation}
\mathbf{r}(t+\Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\Delta t +
	\frac{\mathbf{f}(t)}{2m}\Delta t^2
\label{introEq:MDvelVerletPos}
\end{equation}
\begin{equation}
\mathbf{v}(t+\Delta t) = \mathbf{v}(t) +
	\frac{\mathbf{f}(t) + \mathbf{f}(t+\Delta t)}{2m}\Delta t
\label{introEq:MDvelVerletVel}
\end{equation}
The original Verlet algorithm can be regained by substituting the
velocity back into Eq.~\ref{introEq:MDvelVerletPos}. The Verlet
formulations are chosen in this research because the algorithms have
very little long-term drift in energy conservation. Energy
conservation in a molecular dynamics simulation is of extreme
importance, as it is a measure of how closely one is following the
``true'' trajectory with the finite integration scheme. An exact
solution to the integration will conserve area in phase space, as well
as be reversible in time; that is, the trajectory integrated forward
or backwards will exactly match itself. Having a finite algorithm
that both conserves area in phase space and is time reversible
therefore increases, but does not guarantee, the ``correctness'' of
the integrated trajectory.
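
One velocity Verlet step, as sketched below, makes the structure of
Eq.~\ref{introEq:MDvelVerletPos} and Eq.~\ref{introEq:MDvelVerletVel}
explicit (forces is a hypothetical routine returning the force array
for a set of positions, and the arrays are numpy arrays):
\begin{verbatim}
def velocity_verlet_step(pos, vel, f, m, dt, forces):
    # Advance positions using current velocities and forces.
    pos = pos + vel * dt + f / (2.0 * m) * dt * dt
    f_new = forces(pos)        # forces at the new positions
    # Advance velocities using the average of old and new forces.
    vel = vel + (f + f_new) / (2.0 * m) * dt
    return pos, vel, f_new
\end{verbatim}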

It can be shown\cite{Frenkel1996} that although the Verlet algorithm
does not rigorously preserve the actual Hamiltonian, it does preserve
a pseudo-Hamiltonian which shadows the real one in phase space. This
pseudo-Hamiltonian is provably area-conserving as well as time
reversible. The fact that it shadows the true Hamiltonian in phase
space is acceptable in actual simulations, as one is interested in the
ensemble average of the observable being measured. From the ergodic
hypothesis (Sec.~\ref{introSec:statThermo}), it is known that the time
average will match the ensemble average, therefore two similar
trajectories in phase space should give matching statistical averages.

\subsection{\label{introSec:MDfurther}Further Considerations}
In the simulations presented in this research, a few additional
parameters are needed to describe the motions. The simulations
involving water and phospholipids in Chapt.~\ref{chaptLipids} are
required to integrate the equations of motion for dipoles on atoms.
This requires that an additional three parameters be specified for
each dipole atom: $\phi$, $\theta$, and $\psi$. These three angles are
taken to be the Euler angles, where $\phi$ is a rotation about the
$z$-axis, $\theta$ is a rotation about the new $x$-axis, and
$\psi$ is a final rotation about the new $z$-axis (see
Fig.~\ref{introFig:eulerAngles}). This sequence of rotations can be
accumulated into a single $3 \times 3$ matrix, $\mathbf{A}$,
defined as follows:
\begin{equation}
\mathbf{A} = \left( \begin{array}{ccc}
\cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi &
\sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi &
\sin\theta\sin\psi \\
-\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi &
-\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi &
\sin\theta\cos\psi \\
\sin\phi\sin\theta &
-\cos\phi\sin\theta &
\cos\theta
\end{array} \right)
\label{introEq:EulerRotMat}
\end{equation}
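
Equivalently, $\mathbf{A}$ is the product of the three elementary
rotations, as in this sketch:
\begin{verbatim}
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def euler_rotation(phi, theta, psi):
    # Rotate about z, then the new x, then the new z axis.
    return rot_z(psi) @ rot_x(theta) @ rot_z(phi)
\end{verbatim}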

The equations of motion for the Euler angles can be written down
as\cite{allen87:csl}
\begin{equation}
\dot{\phi} = -\omega^s_x \frac{\sin\phi\cos\theta}{\sin\theta} +
	\omega^s_y \frac{\cos\phi\cos\theta}{\sin\theta} + \omega^s_z
\label{introEq:MDeulerPhi}
\end{equation}
\begin{equation}
\dot{\theta} = \omega^s_x \cos\phi + \omega^s_y \sin\phi
\label{introEq:MDeulerTheta}
\end{equation}
\begin{equation}
\dot{\psi} = \omega^s_x \frac{\sin\phi}{\sin\theta} -
	\omega^s_y \frac{\cos\phi}{\sin\theta}
\label{introEq:MDeulerPsi}
\end{equation}
Where $\omega^s_i$ is the angular velocity in the lab space frame
along Cartesian coordinate $i$. However, a difficulty arises when
attempting to integrate Eq.~\ref{introEq:MDeulerPhi} and
Eq.~\ref{introEq:MDeulerPsi}. The $\frac{1}{\sin \theta}$ present in
both equations means there is a non-physical instability present when
$\theta$ is 0 or $\pi$.

To correct for this, the simulations integrate the rotation matrix,
$\mathbf{A}$, directly, thus avoiding the instability.
This method was proposed by Dullweber
\emph{et al.}\cite{Dullwebber:1997}, and is presented in
Sec.~\ref{introSec:MDsymplecticRot}.

\subsubsection{\label{introSec:MDliouville}Liouville Propagator}


\section{\label{introSec:chapterLayout}Chapter Layout}

\subsection{\label{introSec:RSA}Random Sequential Adsorption}

\subsection{\label{introSec:OOPSE}The OOPSE Simulation Package}

\subsection{\label{introSec:bilayers}A Mesoscale Model for
Phospholipid Bilayers}