Added worker-manager structure and MPI details

author: Connor Moore <connor@hhmoore.ca> 2026-04-26 01:00:18 -0400
committer: Connor Moore <connor@hhmoore.ca> 2026-04-26 01:00:18 -0400
commit: ee5d0cc547564fde631e02af36a247fde45b6606 (patch)
tree: 622e453ae071ec1af9f793fc453021b4897fd31c
parent: 17368fda50e043eaee453d3ecbe781de2f4faae6 (diff)
2 files changed, 134 insertions, 7 deletions
diff --git a/report/refs.bib b/report/refs.bib
index 196a83e..5fe7b86 100644
--- a/report/refs.bib
+++ b/report/refs.bib
@@ -4,7 +4,7 @@
     year = {2022},
     title = {Numerical Methods for Scientific Computing},
     month = {March},
-    note = {Chapter 4},
+    note = {Chapter 4. \href{https://www.equalsharepress.com/media/NMFSC.pdf}{Available online}.},
     isbn = {9798985421804},
     author = {Kyle A. Novak},
 }
@@ -84,7 +84,7 @@
 @book{siegrist2021,
     year = {2021},
     publisher = {Open Education Resource LibreTexts},
-    note = {Chapter 4. \href{https://stats.libretexts.org/Bookshelves/Probability_Theory/Probability_Mathematical_Statistics_and_Stochastic_Processes_(Siegrist)}{Available online.}},
+    note = {Chapter 4. \href{https://stats.libretexts.org/Bookshelves/Probability_Theory/Probability_Mathematical_Statistics_and_Stochastic_Processes_(Siegrist)}{Available online}.},
     author = {Kyle Siegrist},
     title = {{Probability Theory Probability, Mathematical Statistics, and Stochastic Processes (2021 Edition)}},
 }
 \ No newline at end of file
diff --git a/report/report.tex b/report/report.tex
index 2d73bb4..97dd324 100644
--- a/report/report.tex
+++ b/report/report.tex
@@ -23,6 +23,7 @@
             pdfpagemode=FullScreen}
 
 \usepackage{listings}
+\usepackage{algorithm}
 \usepackage{algpseudocode}
 
 \title{Reconstructing the Tracy-Widom Distribution\vspace{-0.75em}}
@@ -37,7 +38,7 @@ Random numbers form the basis of various studies in mathematics and the sciences
 
 Random matrix theory borrows some distributions from statistics to construct matrices. This includes the Gaussian (normal) distribution, which is used to make various matrices depending on which number system is used. Wigner's work included Orthogonal (real symmetric), Unitary (Complex Hermitian), and Symplectic (Quaternion Hermitian) matrices and their eigenvalues. 
 
-This work considers Gaussian orthogonal ensembles (GOEs) specifically and attempts to reconstruct the Tracy-Widom distribution directly by computing eigenvalues. An overview of relevant theory, the necessary methods, and implementation details is provided before results are presented. The distribution is examined as a function of matrix dimension and sample size, and statistical testing is performed to support the notion of a Tracy-Widom being obtained.
+This work considers Gaussian orthogonal ensembles (GOEs) specifically and attempts to reconstruct the Tracy-Widom distribution directly by computing eigenvalues. An overview of relevant theory, the necessary methods, and implementation details is provided before results are presented. The distribution is examined as a function of matrix dimension and sample size, and statistical testing is performed to support a Tracy-Widom being observed as opposed to a normal distribution.
 
 \section{Relevant Theory}
 
@@ -94,7 +95,7 @@ Lastly, the 4th moment is the kurtosis of the distribution. Kurtosis comes from
 \begin{equation}
     \mathrm{Excess\ Kurtosis} = E\left[ \frac{(x - \mu)^4}{\sigma} \right] - 3
 \end{equation}
-These tests, while perhaps simple, will do a fine job in the strict context of differentiating a Tracy-Widom distribution from a normal distribution. In the case of the normal distribution, the skewness and excess kurtosis are both zero. For comparison, the approximate Tracy-Widom has a skewness of $\approx 0.2935$ and an excess kurtosis of $\approx 0.1292$.This was calculated using an Octave script that sampled 50,000 bins in the range $[-9,9]$ for both distributions. This also verified the calculations as it properly reconstructed the exact mean and standard deviation for the normal distribution sampled from its the closed-form equation.
+These tests, while perhaps simple, will do a fine job in the strict context of differentiating a Tracy-Widom distribution from a normal distribution. In the case of the normal distribution, the skewness and excess kurtosis are both zero. For comparison, the approximate Tracy-Widom has a skewness of $\approx 0.2935$ and an excess kurtosis of $\approx 0.1292$.This was calculated using an Octave script\footnote{\href{https://git.hhmoore.ca/mcsc-6030g/p3-tracy-widom/tree/report/figures/tracy-widom-approx/plot_tracy_widom.m}{Available online} as part of the Git repository for the project.} that sampled 50,000 bins in the range $[-9,9]$ for both distributions. This also verified the calculations as it properly reconstructed the exact mean and standard deviation for the normal distribution sampled from its the closed-form equation.
 
 \section{Numerical Methods}
 The natural extension of defining the prerequisite statistical tests to confirm a Tracy-Widom is to actually sample a Tracy-Widom. The issue with this, however, is that it can be quite computationally demanding. The Tracy-Widom is quite similar to the normal except for near the very tails, meaning that most samples (occurring near the peaks) will show agreement between the two. Additionally, it is theorized that the dimension of the matrix $N$ has to be on the order of thousands to get a very Tracy-Widom distribution. The combination of these two points motivates finding a convenient way to calculate the maximum eigenvalue of a large matrix repeated such that thousands of calculations can be performed in a meaningfully short amount of time.
@@ -118,19 +119,145 @@ One method that guarantees a solution $\forall \mathbf{A}$ is Schur Decompositio
     \mathbf{A} = \mathbf{U} \mathbf{T} \mathbf{U}^\mathrm{T}
 \end{equation}
 In this case, the diagonal entries in $\mathbf{T}$ are then the eigenvalues of the original matrix $\mathbf{A}$. The proof in Schur's theorem is non-constructive and does not provide a solution without knowing the eigenvectors of $\mathbf{A}$, but it can be still be used in combination with the so-called QR algorithm developed independently by John Francis (1934-present) and Vera Kublanovskaya (1920-2012) starting from 1959 \cite{golub2009The_QR}. Essentially, the QR algorithm is the computational scheme that provides the matrices $\mathbf{U}$ and $\mathbf{T}$ from the input $\mathbf{A}$. It does so with a complexity of $\mathcal{O}(n^3)$, which for larger values of $N$ deviates significantly from the $\mathcal{O}(n^2)$ of the power method. For this reason it is used only when the power method fails.
-
 \section{Implementation and MPI Parallelization}
+The last remaining hurdle to computing the eigenvalues is the magnitude of memory required to store such large matrices. Because the requirements will be in the gigabytes range for most calculations, shared-memory parallelism can have issues supporting large-scale calculations needed to accurately define the Tracy-Widom. Because of this, the solution is designed around MPI parallelism with distributed and independent memory between ranks. There are fewer opportunities for race conditions and threads `overstepping' each other in MPI, however care must be taken that the now independent workers can communicate with each other properly.
 
 \subsection{Worker-Manager Structure}
+The program used in this project employed a worker-manager structure to drive the eigenvalue calculations. The first worker is assigned to be a manager and handle the seeding and results collection for the other workers. Everyone else is focused solely on calculating eigenvalues and reporting their results to the manager when finished. This setup allows for the independent calculation of as many eigenvalues as MPI ranks are supported at once, which can approach the order of hundreds or even thousands if compute resources are available on a respectable cluster.
 
 \subsection{The Manager Process}
+The manager process is relatively straightforward and focuses more on the `book keeping' side of the eigenvalue calculations. To demonstrate this, a pseudocode explaining the behaviour of the manager MPI rank is presented below in Algorithm \ref{alg:manager}:
+
+\begin{algorithm}[H]
+\caption{MPI Manager Process}
+\begin{algorithmic}[1] % [1] enables line numbers
+    \State N,L $\gets$ Matrix size and number of data points
+    \State Open logfile and write $N$, $L$ for reference
+    \While{Workers not finished}
+        \For{$i=1:N_{workers}$}
+            \If{Worker $i$ is ready and needed}
+                \State seed $\gets$ \texttt{/dev/urandom}
+                \State \texttt{MPI Broadcast}: send the worker seed
+            \EndIf
+        \EndFor
+        \If{All workers are finished}
+            \State Break loop and finish managing
+        \EndIf
+        \State \texttt{MPI Receive}: wait for a worker to finish calculating...
+        \State Identify which worker is finished calculating and their result
+        \If{Eigenvalue was calculated successfully}
+            \State Tally to the number of successful eigenvalue calculations
+            \State Write the worker number and eigenvalue to logfile
+        \Else
+            \State Tally to the number of failed eigenvalue calculations
+            \State Print a warning message
+        \EndIf
+        \If{Worker is still needed}
+            \State Set worker flag to `Ready'
+        \Else
+            \State Tell worker to conclude
+        \EndIf
+    \EndWhile
+\end{algorithmic}
+\label{alg:manager}
+\end{algorithm}
+
+Note that the actions labelled with involving MPI communication that is blocking, i.e., the worker/manager will wait to receive a broadcast before continuing at all.
 
 \subsection{The Calculation (Worker) Process}
+The worker process involves calculating the eigenvalues and is more complicated than the manager process. Various sub-processes are included in different algorithms, namely for building the matrix, computing the eigenvalue, and applying the power method. The overview of the whole process is outlined in Algorithm \ref{alg:worker}:
 
+\begin{algorithm}[H]
+\caption{MPI Worker Process}
+\begin{algorithmic}[1] % [1] enables line numbers
+    \State size $\gets$ \texttt{MPI Receive}: matrix size
+    \State Initialize matrix space in memory
+    \While{Worker not Finished}
+        \State seed $\gets$ \texttt{MPI Receieve}: new matrix seed
+        \State Call \texttt{open\_prng\_seed}(seed)
+        \State matrix $\gets$ \texttt{build\_random\_matrix}(size)
+        \State eig $\gets$ \texttt{calculate\_eigenvalue}(matrix)
+        \State \texttt{MPI Send}: eig and relevant information
+    \EndWhile
+\end{algorithmic}
+\label{alg:worker}
+\end{algorithm}
+The algorithm for assembling the random matrix given a size is provided below in Algorithm \ref{alg:random-matrix}:
+\begin{algorithm}[H]
+\caption{MPI Worker: Random Matrix Building}
+\begin{algorithmic}[1] % [1] enables line numbers
+\Procedure{build\_random\_matrix}{size}
+\For{$i = 1:$ size}
+    \State \textbf{A}(i,i) = N(0,2)
+    \For{$j = 1:i-1$}
+        \State \textbf{A}(i,j) = N(0,1)
+        \State \textbf{A}(j,i) = \textbf{A}(i,j)
+    \EndFor
+\EndFor
+\State \Return \textbf{A}
+\EndProcedure
+\end{algorithmic}
+\label{alg:random-matrix}
+\end{algorithm}
 
-\section{Results}
+And likewise for the eigenvalue calculation in Algorithm \ref{alg:eigenvalue-calculation}:
+
+\begin{algorithm}[H]
+\caption{MPI Worker: Eigenvalue Calculation}
+\begin{algorithmic}[1] % [1] enables line numbers
+\Procedure{calculate\_eigenvalue}{matrix}
+    \State Initialize convergence tolerance $\varepsilon$ and maximum iterations for power iteration
+    \State Call \texttt{power\_iteration}(matrix, shift=0)
+    \If{power iteration converged}
+        \If{$\lambda < 0$}
+            \State Call \texttt{power\_iteration}(matrix, shift=$\lambda$)
+        \EndIf
+    \Else
+        \State call \texttt{MKL\_schur\_decomposition}(matrix)
+        \If{decomposition successful}
+            \State eig $\gets$ Largest eigenvalue from decomposition
+            \State Set `converged' flag to `true'
+        \Else
+            \State Print warning message \Comment{Not touching the converged flag leaves it `false'}
+        \EndIf
+    \EndIf
+\EndProcedure
+\end{algorithmic}
+\label{alg:eigenvalue-calculation}
+\end{algorithm}
+
+So far the structure has involved a minimal amount of math. This is only because it has been hidden behind abstraction and by using the MKL library to call pre-made functions for computing eigenvalues. The power iteration method was implemented by hand and involves more math as shown in Algorithm \ref{alg:power-iteration}:
 
-%\section{Conclusions}
+\begin{algorithm}[H]
+\caption{MPI Worker: Power Iteration}
+\begin{algorithmic}[1] % [1] enables line numbers
+\Procedure{power\_iteration}{matrix, shift=0}
+\State Initialize a random vector $\mathbf{x}_o$ = N(0,1)
+\State $\mathbf{x}_0 \gets \frac{\mathbf{x}_0}{\left|\mathbf{x}_0\right|}$ \Comment{Normalize}
+\State mem $\gets$ $-100$
+
+\For{$i = 1:$ max\_iterations}
+    \State $\mathbf{y} \gets \mathbf{A}\mathbf{x}_{i-1}$
+    \State $\mathbf{y} \gets \mathbf{y} + $shift$\times\mathbf{x}_{i-1}$
+    \State eig $\gets$ \texttt{dot\_product}($\mathbf{x}_{i}$,$\mathbf{x}_{i-1}$) - shift
+    \State error $\gets\ \frac{\left| \mathrm{mem} - \mathrm{eig}\right|}{\left| \mathrm{mem} \right|}$
+    \If{err $< \varepsilon$}
+        \State Set `converged' flag to `true'
+        \State \Return{eig}
+    \Else
+        \State $\mathbf{x}_i \gets \frac{\mathbf{x}_i}{\left| \mathbf{x}_i \right|}$
+        \State mem $\gets$ eig
+    \EndIf
+\EndFor
+
+\EndProcedure
+\end{algorithmic}
+\label{alg:power-iteration}
+\end{algorithm}
+
+These algorithms compose the entirety of the program for reconstructing the Tracy-Widom distributions discussed earlier.
+
+\section{Results}
 
 \bibliographystyle{ieeetr}
 \bibliography{refs.bib}
author	Connor Moore <connor@hhmoore.ca>	2026-04-26 01:00:18 -0400
committer	Connor Moore <connor@hhmoore.ca>	2026-04-26 01:00:18 -0400
commit	ee5d0cc547564fde631e02af36a247fde45b6606 (patch)
tree	622e453ae071ec1af9f793fc453021b4897fd31c
parent	17368fda50e043eaee453d3ecbe781de2f4faae6 (diff)