Integer factorization is a problem that has its roots in the very far history, it is a fascinating problem (to me especially) that has gained a lot of attention in the recent years for cracking the RSA cryptosystem for example.
While having no time to blog about the internals of one of the very efficient methods to factorize relatively large numbers, I present here a simple implementation I have developed of the Quadratic Sieve to factorize relatively big RSA numbers. For a reference about the mathematical premises, the 6th chapter entitled Subexponential Factoring Algorithms of the Prime Numbers, A Computational Perspective book is pretty great.
The program available here is composed of 4 different versions, 3 centralized (sequential) and 1 distributed using MPI. The distributed version needs to be run on at least 2 nodes. The centralized versions consist of different implementation experiments :
- The first one keeps the exponent vector of each smooth number (impractical due to the huge memory needed to keep the vectors).
- The second one sieves at an interval centralized at sqrt(N); according to my testing, this has no visible improvements over the standard one.
- The last version which is the same as the first one but which uses only a binary exponent vector (saving one bit for each prime in the base) for the smooth numbers.
One of the caveats of this implementation is the linear algebra step that performs the Gaussian elimination. The method I am using is the most naive one (though goes pretty fast using XOR operations on GMP integers) which keeps an identity matrix besides the effective matrix while performing the Gaussian elimination, and hence the program needs twice the memory needed for the matrix.
The distributed version spawns as many nodes as needed and makes them all perform the sieving step, after some very hundreds of thousands of steps, the slaves communicate their found smooth numbers to the master which decides if they must stop or should continue sieving for more numbers. Sending GMP integers mpz_t over MPI was very challenging due to the way MPI handles the string representation of these numbers (sometimes '\0' are included at the end and the length returned does not represent exactly the number of characters in the string). You can take a look at the functions doing the MPI send/receive of mpz_t here.
You will need the mpfr library too (for logarithm calculations) in case you want to test the programs, for the distributed version, compile with mpicc and run with mpirun as usual.
Centralized version:Distributed version (60 intel i7 nodes):