This is the continuation of my series “Bibles” in Applied Math. I'm giving this topic its own page because it is a bit removed from what most people call “Applied Math”. Further, this topic deserves a blog entry, or even a forum, of its own, because being able to make your code run 10-15% faster (or even 90% faster if you haven’t yet mastered the craft of programming) is a big deal nowadays. It can even land you a lucrative job (here’s an example).
Suppose that you’re a master of Numerical Algorithms. You’ve spent 2 years working on a complex engineering problem and have recently developed a new algorithm that solves the problem in linear time, beating the state of the art. Your new solver (and the 2 years) would be wasted if it is poorly written and/or poorly compiled. Here are a couple of the thousands of common poor programming practices.
1. Computing the same thing over and over again
    for (int k = 0; k < 100; k++) {
        out[k] = 0;
        for (int i = 0; i < M; i++)
            for (int j = 0; j < N; j++)
                out[k] += kernel(i, j) * in[k];
    }
In this example, every kernel(i,j) is computed 100 times, which is 99 times more than necessary. Because the kernel matrix is independent of the outer loop, it can, and most of the time should, be pre-computed. If evaluating kernel is computationally intensive, which is common, pre-computing the matrix can make your program run close to 100 times faster.
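Here is a minimal sketch of that fix in C++. The kernel function below is a made-up stand-in (the real one is whatever your problem requires), and the names apply_naive and apply_precomputed are mine, chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an expensive kernel; not from the original code.
double kernel(int i, int j) {
    return std::exp(-0.01 * (i + j)) * std::cos(0.1 * (i - j));
}

// Naive version: kernel(i, j) is evaluated M * N times for every k.
std::vector<double> apply_naive(const std::vector<double>& in, int M, int N) {
    assert(M > 0 && N > 0);
    std::vector<double> out(in.size(), 0.0);
    for (std::size_t k = 0; k < in.size(); ++k)
        for (int i = 0; i < M; ++i)
            for (int j = 0; j < N; ++j)
                out[k] += kernel(i, j) * in[k];
    return out;
}

// Pre-computed version: each kernel(i, j) is evaluated exactly once and stored.
std::vector<double> apply_precomputed(const std::vector<double>& in, int M, int N) {
    assert(M > 0 && N > 0);
    std::vector<double> table(static_cast<std::size_t>(M) * N);
    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j)
            table[static_cast<std::size_t>(i) * N + j] = kernel(i, j);

    std::vector<double> out(in.size(), 0.0);
    for (std::size_t k = 0; k < in.size(); ++k)
        for (double w : table)  // same (i, j) order as the naive loops
            out[k] += w * in[k];
    return out;
}
```

Both functions should return the same values up to floating-point rounding; only the number of kernel evaluations differs. In this particular loop in[k] does not depend on i or j, so the double sum could even be collapsed into a single scalar, but the table version is the one that carries over to the general case.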
2. Not using the right flags when compiling
[zer0ne@ion]$ g++ myProgram.cpp -o myProgram
[zer0ne@ion]$ ./myProgram
Total time: 68.74 seconds
[zer0ne@ion]$ g++ -O3 myProgram.cpp -o myProgram
[zer0ne@ion]$ ./myProgram
Total time: 13.9 seconds
[zer0ne@ion]$ g++ -O3 -DNDEBUG myProgram.cpp -o myProgram
[zer0ne@ion]$ ./myProgram
Total time: 11.54 seconds
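Why would -DNDEBUG help on top of -O3? Defining NDEBUG turns the standard assert macro into a no-op, so sanity checks inside hot loops stop costing anything. Here is a small sketch of the kind of code that benefits; checked_at, sum_all and asserts_enabled are hypothetical names of mine, and I have no idea whether myProgram above looks anything like this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bounds-checked accessor. The assert is compiled out
// entirely when the file is built with -DNDEBUG.
inline double checked_at(const std::vector<double>& v, std::size_t i) {
    assert(i < v.size() && "index out of range");
    return v[i];
}

// Hot loop: with assertions enabled, every iteration pays for the check.
double sum_all(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += checked_at(v, i);
    return total;
}

// Reports whether this translation unit was built without -DNDEBUG.
bool asserts_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}
```

How much time this saves depends entirely on how many assertions sit on the hot path, so treat the timings above as one data point rather than a rule. The usual approach is to keep assertions on while developing and define NDEBUG only for release builds.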
So, if you’re already good at designing algorithms, it’s worth learning a few performance optimization tricks. I am no expert in HPC, let alone related fields such as Computer Architecture or Compilers, so my best bet is to point you to some great books out there. For the same reason, I’d appreciate it if you shared your own tips.
- Performance Optimization of Numerically Intensive Codes by Goedecker & Hoisie. $73 seems high for a 173-page book, but this thin bible is worth every penny. It covers all the basics, such as CPU architecture, compiler optimization, memory locality, profiling, etc.
- High Performance Computing by Kevin Dowd & Charles Severance. Similar content, but this book explains things in greater depth.