Diffstat (limited to 'report/report.tex')
-rw-r--r-- report/report.tex 61
1 files changed, 58 insertions, 3 deletions
diff --git a/report/report.tex b/report/report.tex
index 97dd324..00a8730 100644
--- a/report/report.tex
+++ b/report/report.tex
@@ -6,7 +6,7 @@
\usepackage{xcolor}
\usepackage{graphicx}
-\graphicspath{{./figures/} {./figures/tracy-widom-approx/}}
+\graphicspath{{./figures/} {./figures/tracy-widom-approx/} {./figures/results/}}
\usepackage{amsmath}
\usepackage{booktabs}
\usepackage{longtable}
@@ -71,7 +71,7 @@ Compared with a normal distribution, it is also difficult to distinguish between
\caption{The TW\textsubscript{1} distribution compared with a normal distribution of arbitrary mean and standard deviation. Note that while the peaks are centered, the Tracy-Widom has a `fatter' tail on the $+x$ side and a slimmer one on the $-x$ side. Both distributions are quite similar visually, so this asymmetry will play an important role in quantifying their differences.}
\label{fig:tracy-widom-vs-normal}
\end{figure}
-Because of their similarities, a smart approach to validating the generation of a Tracy-Widom distribution should include some type of quantifiable, statistical assurance that a normal distribution was not generated.
+Note that the Tracy-Widom distributions plotted above have no closed-form expression and must be approximated, typically by using a scaled gamma distribution \cite{chiani2012Distri}, or more cleverly by downloading someone else's finished approximation from the \textsc{Matlab} file exchange \cite{marco2013}. Because the plotted distributions serve visual purposes only, a smart approach to validating the generation of a Tracy-Widom should include some type of quantifiable, statistical assurance that is not based on the simple value of the distribution itself.
\subsection{Statistical Differentiation}
There are various ways to go about discerning a Tracy-Widom distribution from a normal one. Perhaps the most elegant is to exploit some basic differences in properties between the two. As mentioned earlier, the Tracy-Widom has a fatter side and a slimmer side, which makes it asymmetric. This is not the case with the normal distribution which is identical on either side of the mean value $\mu$. It turns out that the symmetry of a distribution is readily quantifiable using existing statistical tools, and as such that will be the backbone of the statistical analysis. First consider the \emph{expected value} of a probability density function (PDF) $p(x)$, defined as \cite{siegrist2021}:
@@ -95,7 +95,7 @@ Lastly, the 4th moment is the kurtosis of the distribution. Kurtosis comes from
\begin{equation}
\mathrm{Excess\ Kurtosis} = E\left[ \frac{(x - \mu)^4}{\sigma} \right] - 3
\end{equation}
-These tests, while perhaps simple, will do a fine job in the strict context of differentiating a Tracy-Widom distribution from a normal distribution. In the case of the normal distribution, the skewness and excess kurtosis are both zero. For comparison, the approximate Tracy-Widom has a skewness of $\approx 0.2935$ and an excess kurtosis of $\approx 0.1292$.This was calculated using an Octave script\footnote{\href{https://git.hhmoore.ca/mcsc-6030g/p3-tracy-widom/tree/report/figures/tracy-widom-approx/plot_tracy_widom.m}{Available online} as part of the Git repository for the project.} that sampled 50,000 bins in the range $[-9,9]$ for both distributions. This also verified the calculations as it properly reconstructed the exact mean and standard deviation for the normal distribution sampled from its the closed-form equation.
+These tests, while perhaps simple, will do a fine job in the strict context of differentiating a Tracy-Widom distribution from a normal distribution. In the case of the normal distribution, the skewness and excess kurtosis are both zero. For comparison, the approximate Tracy-Widom has a skewness of $\approx 0.2935$ and an excess kurtosis of $\approx 0.1292$. This was calculated using an Octave script\footnote{\href{https://git.hhmoore.ca/mcsc-6030g/p3-tracy-widom/tree/report/figures/tracy-widom-approx/plot_tracy_widom.m}{Available online} as part of the Git repository for the project.} that sampled 50,000 bins in the range $[-9,9]$ for both distributions. Performing the calculations with Equations (1), (3), (4) and (5) reconstructed the mean and standard deviation of the normal distribution, sampled from its closed-form equation, to within an error of $\sim10^{-5}$.
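+As a sketch of how such a calculation can be discretized (assuming a plain Riemann sum over $M = 50{,}000$ equal-width bins, which is not necessarily what the script does), the $k$-th central moment can be approximated as
+\begin{equation*}
+ E\left[(x-\mu)^k\right] \approx \sum_{i=1}^{M} (x_i - \mu)^k \, p(x_i) \, \Delta x, \qquad \Delta x = \frac{9 - (-9)}{50{,}000} = 3.6\times10^{-4}
+\end{equation*}
+where $x_i$ denotes the centre of the $i$-th bin. The skewness and excess kurtosis then follow from the $k=3$ and $k=4$ sums through the definitions given above.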
\section{Numerical Methods}
The natural extension of defining the prerequisite statistical tests to confirm a Tracy-Widom is to actually sample a Tracy-Widom. The issue with this, however, is that it can be quite computationally demanding. The Tracy-Widom is quite similar to the normal except for near the very tails, meaning that most samples (occurring near the peaks) will show agreement between the two. Additionally, it is theorized that the dimension of the matrix $N$ has to be on the order of thousands to get a very Tracy-Widom distribution. The combination of these two points motivates finding a convenient way to calculate the maximum eigenvalue of a large matrix repeated such that thousands of calculations can be performed in a meaningfully short amount of time.
@@ -258,6 +258,61 @@ So far the structure has involved a minimal amount of math. This is only because
These algorithms compose the entirety of the program for reconstructing the Tracy-Widom distributions discussed earlier.
\section{Results}
+Before the eigenvalues can be directly compared to a Tracy-Widom they must be scaled to match the existing distribution. It is possible to perform the scaling empirically, but a closed-form solution is available that makes things easier \cite{konig2005}:
+\begin{equation}
+ \lambda^\prime = \left(\lambda - 2\sqrt{N} \right)\times N^{1/6}
+\end{equation}
+This specific transform is derived from the Wigner semicircular distribution mentioned earlier in Section 2. Although the semicircle describes the global distribution of eigenvalues, taking its extrema yields a bound for the maximum eigenvalue.
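+As a rough numerical illustration of the scaling (the values $N=1000$ and $\lambda=64$ are hypothetical, chosen only for the arithmetic, and are not taken from the runs):
+\begin{equation*}
+ \lambda^\prime = \left(64 - 2\sqrt{1000}\right)\times 1000^{1/6} \approx \left(64 - 63.2455\right)\times 3.1623 \approx 2.386
+\end{equation*}
+In other words, an offset of less than one unit from the semicircle edge at $2\sqrt{N}$ maps to an order-one value on the scaled axis.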
+
+\subsection{Tracy-Widom Convergence}
+Following some software issues related to the OneAPI suite, the author was unable to produce enough samples in time to complete the report. The author would therefore like to thank all of the teammates for being kind enough to share their data for analysis in this report. The following numbers are from Joe's runs, but all of the analysis was done individually. Samples here are presented with respect to their matrix dimension $N$, number of samples $L$, skewness $S$ and excess kurtosis $K$. Taking, for example, different sample sizes for $N=500$ and $N=1750$ yields a half-dozen histograms in which a clear trend emerges towards a normal-\emph{ish} shape, shown in Figure \ref{fig:randoms}:
+
+\begin{figure}[H]
+ \centering
+ \input{./figures/results/size_pts.tex}
+ \caption{Progression of the Tracy-Widom convergence as a function of matrix size ($N$) and sample size ($L$). Also included are the skewness ($S$) and excess kurtosis ($K$) for each sample. An increase in both sample size and matrix size yields a distribution comparable with but not necessarily closer to the Tracy-Widom reference. Asymmetry is noted especially for the cases with $N=1750$, but with an incorrect sign that implies a fatter tail in the $-x$ direction.}
+ \label{fig:randoms}
+\end{figure}
+
+Despite the visuals, however, the metrics tell a bleaker story. Recall that the Tracy-Widom has a fatter tail on the right and a thinner tail on the left, which is the opposite of what is presented above. This is also reflected in the skewness, which takes a negative value for the curves associated with $N=1750$, implying a heavier tail to the left of the mean value. As the sample size increases, the shape of the distribution becomes more refined and tends to that of the expected profile, which seems intuitive. The maximum number of samples available for each matrix size $N$ can be used to check whether more samples yield better results, shown in Figure \ref{fig:big-ones}:
+
+
+\begin{figure}[H]
+ \centering
+ \input{./figures/results/maxcomp.tex}
+ \caption{Reconstructed distributions from the maximum number of samples $L_{\max}$ for each size $N$. Table \ref{tbl:results} provides more information on the behaviour of skewness and excess kurtosis for the high-sample runs. Qualitatively, underfilling seems prevalent at larger $N$ and a strong $-x$ tail is visible.}
+ \label{fig:big-ones}
+\end{figure}
+
+\begin{table}[H]
+ \centering
+ \caption{Parameters of the Larger-$L$ Distributions for Various $N$}
+ \begin{tabular}{cccc}
+ \toprule
+ $N$ & $L$ & Skewness & Excess Kurtosis \\
+ \midrule
+ 500 & 50,104 & $-8.993\times10^{-3}$ & $8.588\times10^{-1}$ \\
+ 750 & 55,397 & $-5.745\times10^{-2}$ & $7.666\times10^{-1}$ \\
+ 1,000 & 55,536 & $-9.220\times10^{-2}$ & $6.924\times10^{-1}$ \\
+ 1,250 & 55,726 & $-1.185\times10^{-1}$ & $6.616\times10^{-1}$ \\
+ 1,500 & 55,948 & $-1.188\times10^{-1}$ & $4.567\times10^{-1}$ \\
+ 1,750 & 56,097 & $-1.471\times10^{-1}$ & $4.227\times10^{-1}$ \\
+ \bottomrule
+ \end{tabular}
+ \label{tbl:results}
+\end{table}
+
+
+\begin{figure}[H]
+ \centering
+ \input{./figures/results/parameters.tex}
+ \caption{Reconstructed distributions from the maximum number of samples $L_{\max}$ for each size $N$. Table \ref{tbl:results} provides more information on the behaviour of skewness and excess kurtosis for the high-sample runs. Qualitatively, underfilling seems prevalent at larger $N$ and a strong $-x$ tail is visible.}
+ \label{fig:errors}
+\end{figure}
+
+\subsection{Wall Times}
+
+
\bibliographystyle{ieeetr}
\bibliography{refs.bib}