Integer factorization is a problem whose roots reach far back into history; it is a fascinating problem (to me especially) that has gained a lot of attention in recent years, for example for cracking the *RSA* cryptosystem.
While I have no time to blog about the internals of one of the very efficient methods for factoring relatively large numbers, I present here a simple implementation I have developed of the Quadratic Sieve to factor relatively big RSA numbers. For a reference on the mathematical premises, the 6th chapter, entitled *Subexponential Factoring Algorithms*, of the book **Prime Numbers: A Computational Perspective** is pretty great.
The program available here is composed of 4 different versions: 3 centralized (sequential) and 1 distributed using *MPI*. The distributed version needs to be run on at least 2 nodes. The centralized versions consist of different implementation experiments:

- The first one keeps the full exponent vector of each smooth number (impractical due to the huge memory needed to keep the vectors).
- The second one sieves over an interval centered at **sqrt(N)**; according to my testing, this has no visible improvement over the standard one.
- The last version is the same as the first one but uses only a binary exponent vector (a single bit for each prime in the base) for the smooth numbers.

One of the caveats of this implementation is the linear algebra step, which performs the Gaussian elimination. The method I am using is the most naive one (though it goes pretty fast using *XOR* operations on *GMP* integers): it keeps an *identity matrix* beside the effective matrix while performing the Gaussian elimination, and hence the program needs twice the memory required for the matrix.
The distributed version spawns as many nodes as needed and makes them all perform the sieving step; every few hundred thousand steps, the slaves communicate their found smooth numbers to the master, which decides whether they must stop or should continue sieving for more numbers. Sending *GMP* **mpz_t** integers over *MPI* was very challenging due to the way *MPI* handles the string representation of these numbers (sometimes *'\0'* characters are included at the end, and the length returned does not exactly match the number of characters in the string). You can take a look at the functions doing the *MPI* send/receive of **mpz_t** here.
You will need the mpfr library too (for logarithm calculations) in case you want to test the programs. For the distributed version, compile with *mpicc* and run with *mpirun* as usual.

*Results:*

*Centralized version:*

*Distributed version (60 Intel i7 nodes):*