RANLUX denotes a class of high-quality pseudo-random number generators (PRNGs) with long periods, solid theoretical foundations, and uses in computational physics. These generators are built on top of add-with-carry and subtract-with-borrow PRNGs which allow customization through three user-provided parameters yet there are only two variants in use today working internally with 24-bit and 48-bit arithmetic; this comes at a cost because modern computers do not possess instructions for 24-bit or 48-bit arithmetic. In this blog post I will present the theory behind RANLUX PRNGs, a list of possible parameters for 16-, 32-, and 64-bit RANLUX generators, criteria for selecting generators as well as benchmarks comparing the most competitive RANLUX variants with the RANLUX generators in the C++11 standard library, the Mersenne Twister, and xoshiro128+.
Some of the new RANLUX flavors are easily implemented in C++11 (sometimes in only two lines of code), they are faster than the existing RANLUX flavors, they retain all desirable properties of a RANLUX generator, and they pass all DieHarder and TestU01 BigCrush PRNG tests.
The programs used to compute the RANLUX parameters are called ranlux-tools. They are free software under the terms of the Mozilla Public License 2.0 and can be found online:
Continue reading Faster RANLUX Pseudo-Random Number Generators
A cover tree is a tree data structure used for the partitiong of metric spaces to speed up operations like nearest neighbor, -nearest neighbor, or range searches. In this blog post, I introduce cover trees, their uses, their properties, and I measure the effect of the dimension of the metric space on the run-time in an experiment with synthetic data.
Continue reading Cover Trees
The eigenproblem solver in my master's thesis used SuperLU, a direct solver for the solution of systems of linear equations (SLE) . For the largest test problems, the eigensolver ran out of memory when decomposing the matrix which is why I replaced SuperLU with direct substructuring in an attempt to reduce memory consumption. For this blog post, I measured set-up time, solve time, and memory consumption of SuperLU and direct substructuring with real symmetric positive definite real-world matrices for SLEs with a variable number of right-hand sides, I will highlight that SuperLU was deployed with a suboptimal parameter choice, and why the memory consumption of the decomposition of is the wrong objective function when you want to avoid running out of memory.
Continue reading SuperLU vs Direct Substructuring
xPTEQR is a LAPACK function for the computation of the eigendecomposition of a symmetric positive definite tridiagonal matrix. I compare the performance and accuracy of xPTEQR to the other symmetric eigensolvers in LAPACK 3.5.0.
In layman’s terms, there are several algorithms available on a computer that calculate the eigenvectors and eigenvalues of matrices with special properties (symmetric, positive definite, tridiagonal). xPTEQR is the name of the computer implementation of one of those algorithms and I am not aware of any measurements with it. In this blog post I compare xPTEQR in terms of speed and accuracy to other good algorithms. In the figures below, xPTEQR can be found under the label "SVD" (xPTEQR is the name of the implementation, SVD is the algorithm). Moreover, in every figure it holds that lower is better.
Continue reading Performance and Accuracy of xPTEQR