| -rw-r--r-- | report/feedback_1_1.txt | 40 |
1 files changed, 40 insertions, 0 deletions
diff --git a/report/feedback_1_1.txt b/report/feedback_1_1.txt
new file mode 100644
index 0000000..6f70af6
--- /dev/null
+++ b/report/feedback_1_1.txt
@@ -0,0 +1,40 @@
+1) The wording of the introduction is clunky. I suppose that you will agree if you read it out loud. Try to think top-down: what do I want to convey? What elements do I want to highlight? What is the best order of presentation? Do my sentences make sense? Feel free to throw your draft at ChatGPT or a similar tool to suggest different wording.
+2) "where the subscript i and j" -> "where the subscripts i, j and k"
+3) I wouldn't call this a "technique". It is simply the definition of the product.
+4) You accidentally initialized b instead of c to zero.
+5) "comes built-in with an intrinsic function": that is a tautology.
+6) Are the "march" and "Xhost" flags not set automatically when -Ok is specified for k>1? Or perhaps k>2?
+7) You mention using "system_clock" rather than "cpu_time" because of their different behaviour under multithreading. If you test multithreading, you should do two things: write a few lines about running the algorithm in 2.1 in parallel, and include a graph of wall time vs. number of threads for the explicit triple loop implementation so that you can compare the fastest run to a serial but highly optimized run (e.g. ifx with dgemm and -O4).
+8) "non-linear increase in wall time" Be as specific as you can. Non-linear is anything not O(n); you know it is O(n^3).
+9) In the graph(s) of wall time vs. matrix size, include a line y=x^3 to check the slope.
+10) In Fig. 2 (and the text below) I suppose that GCC should be gfortran.
+11) Explain the difference between "row loop" and "column loop" in section 2. Explain the difference in the order of computations but also why you thought it may make a difference in wall time.
+12) Try to make figure captions (almost) self-contained. I should be able to read only those, look at the figures and get a pretty good idea what you are reporting. Like it or not, that is how most scientific papers are read. If you grab the reader's attention with your figures and captions, they may read the rest. Example: "Fig 4. Wall time versus matrix size for row-major and column-major triple loop implementations and the gnu and Intel compilers with 8 threads. Both show $n^3$ scaling for wall times greater than 0.01 s, indicating that this is the overhead of thread forking and joining." You get the idea.
+13) '"written" solution' is not clear. I suppose you mean to say that, if the code calls a library like BLAS, the actual computations are done in pre-compiled and optimized code, so adding further compiler flags does not do much.
+14) Try to highlight something outstanding in each figure. For instance, in Fig 7, the only significant difference is going from O1 to O2,3,fast. You could try to figure out (by browsing the man pages) what the difference is here (memory management? order of instructions? ...) and comment on that.
+15) "relatively coupled" Be careful with words, especially in scientific writing. "Coupled" has a specific meaning and "relative" too. You simply mean to say "similar in magnitude" or so.
+16) Refs: when citing a book, try to be more specific. Imagine I want to look up the source of something you wrote and I have to read an entire book. You can add the chapter or section to the citation either in the text ("This was hypothesized by Jones [1990, Chapter 4].") or in the citations as "[6] Jones, A. B., chapter 6 in {\sl Six hypotheses in geometric number theory}, Wiley, 1990." Also, clarify what [4] and [5] are: books, web pages?
+17) "A" -> "Appendix A"
+18) "gfotran" -> "gfortran" Also, do not specify too many digits here. Three should be enough, no? Unless you ran many tests for each setting and averaged and you really wanted to know the wall times with < 1% error. Omit the explicit "+" or "-" in the exponent of 10.
+19) Please include the figures in the repo. I will reproduce some, but not all, of the results and figures.
+
+_________
+20) When running "make" I get:
+
+# This target simply compiles the binaries
+Compiling serial and parallel binaries with O3
+gfortran matrixproduct.f90 -o bin/gfortran.serial.out -O3 -fexternal-blas -lopenblas -march=native
+matrixproduct.f90:47:132:
+
+   47 | write(error_unit,'(A)') ESC_CHAR // "[31mERROR: Unsupported input (" // trim(temp_in) // ") for loop specification [yes/no]" // ESC_CHAR // "[0m"
+      | 1
+Error: Line truncated at (1) [-Werror=line-truncation]
+matrixproduct.f90:47:103:
+
+   47 | write(error_unit,'(A)') ESC_CHAR // "[31mERROR: Unsupported input (" // trim(temp_in) // ") for loop specification [yes/no]" // ESC_CHAR // "[0m"
+      | 1
+Error: Unterminated character constant beginning at (1)
+f951: some warnings being treated as errors
+make: *** [Makefile:30: all] Error 1
+
+Perhaps a subtle difference in gfortran versions? It seems to be a special character in the string that breaks things. I am too tired now to look into it, perhaps you can fix this.
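
A note on the build error in item 20: the first message, "Line truncated at (1)" reported at column 132, suggests that line 47 is longer than gfortran's default free-form limit of 132 columns. The "Unterminated character constant" error would then follow from the string literal being cut off, rather than from a special character. The following is a minimal sketch of a check for over-long source lines, assuming a POSIX shell with awk. The function name `long_lines` is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Report source lines longer than a column limit.
# The default of 132 matches gfortran's default free-form line length.
# Output format: file:line_number:line_length (one line per offender).
# NOTE: long_lines is a hypothetical helper, not part of the repository.
long_lines() {
    file=$1
    limit=${2:-132}
    awk -v limit="$limit" -v name="$file" \
        'length($0) > limit { printf "%s:%d:%d\n", name, NR, length($0) }' "$file"
}

# Example: check the source file named in the make output, if it is present.
if [ -f matrixproduct.f90 ]; then
    long_lines matrixproduct.f90
fi
```

If this turns out to be the cause, splitting the statement over several lines with `&` continuation, or adding `-ffree-line-length-none` to the gfortran flags, should let the file compile.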
