A Sieve Method for Factoring Numbers of the Form $n^2 + 1$
By Daniel Shanks
1. Introduction. Factorizations of numbers of the form $n^2 + 1$, [1], [2], are of
mathematical interest for at least four reasons: 1) a possible insight into the unsettled
question as to whether there are infinitely many primes of this form; 2) a
possible insight into the unsettled question as to whether the reducible numbers
(explained below) have a definite density; 3) the relation of these factorizations
to the p-adic square roots of $-1$; and 4) the relation of these factorizations to the
Gaussian primes. The purpose of this paper is to describe a sieve method for factoring
these numbers and to present and discuss some empirical results bearing on
1), 2), 3), and 4) which were obtained by its use. The method can be considered
to be based, in part, on the p-adic square roots of $-1$, but it is also possible to
avoid the use of this language.
A program based on this sieve method was written for an IBM 704 with a
32,768-word high-speed memory, and with this program all $n^2 + 1$ from $n = 1$
to 180,000 were completely factored in about 10 minutes. Since these factorizations
of $n^2 + 1$ exceed those in existing published tables (82 percent of these
numbers are greater than a billion), a short summarizing statistical table should
be of interest. In the table below, $P(N)$ is the number of primes of the form $n^2 + 1$
for $1 \le n \le N$, and for comparison, $\pi_-(N)$ is the number of primes of the form
$4m - 1$ for $1 < 4m - 1 \le N$. Further, $R(N)$ is the number of reducible numbers
$\le N$, $r(N) = R(N) - R(N - 10,000)$, and $\delta_R(N) = R(N)/N$, the mean density
of the reducibles.
2. The Sieve. The sieve method (substantially more complicated than that of
Eratosthenes) is based on the facts listed in the following theorem. Since many
of these are well known and the rest can be easily verified, no proof need be given.
Theorem. For every prime p of the form $4m + 1$ and every positive integer k
there are two and only two positive solutions of:
(1)  $n^2 + 1 \equiv 0 \pmod{p^k}$,  $n < p^k$.
If we call these $A_k$ and $B_k$, then
(2)  $A_k + B_k = p^k$,
and if
(3)  $h \equiv \tfrac{1}{2}(p + 1)A_1 \pmod{p}$,
and
(4)  $C_k \equiv h(A_k^2 + 1)/p^k \pmod{p}$,
the $A_k$ for $k = 2, 3, \ldots$ may be computed recursively from $A_1$ by
(5)  $A_{k+1} = A_k + C_k p^k$.
Further, for all positive n,
$n^2 + 1 \equiv 0 \pmod{p^k}$,
if and only if n is given by one of the linear forms:
(6)  $n = A_k + mp^k$ or $n = B_k + mp^k$  $(m = 0, 1, 2, \ldots)$,
and aside from these factors of $n^2 + 1$ (obtained as p runs through all primes of
the form $4m + 1$) the only other prime-power factors are the obvious
(7)  $n^2 + 1 \equiv 0 \pmod{2}$ for $n \equiv 1 \pmod{2}$.
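The recursion (3) through (5) can be sketched in a few lines of Python. This is only an illustration of the Theorem, not the 704 program; the function name `lift_roots` and its argument names are hypothetical, and the root $A_1$ is assumed to be supplied by the caller.

```python
def lift_roots(p, a1, kmax):
    """Return [(A_k, B_k)] for k = 1..kmax, given a root A_1 of n^2 + 1 = 0 (mod p)."""
    if (a1 * a1 + 1) % p != 0:
        raise ValueError("a1 is not a root of n^2 + 1 modulo p")
    h = ((p + 1) // 2) * a1 % p              # relation (3)
    roots = []
    a, pk = a1, p                            # a = A_k, pk = p^k
    for _ in range(kmax):
        roots.append((a, pk - a))            # relation (2): B_k = p^k - A_k
        c = h * ((a * a + 1) // pk) % p      # relation (4): C_k
        a += c * pk                          # relation (5): A_{k+1}
        pk *= p
    return roots


# p = 5 with A_1 = 2, and p = 13 with A_1 = 5
print(lift_roots(5, 2, 6))
print(lift_roots(13, 5, 4))
```

For $p = 5$ the first six pairs are (2, 3), (7, 18), (57, 68), (182, 443), (2057, 1068), (14557, 1068). A value $C_k = 0$ simply leaves $A_{k+1} = A_k$, which happens for $p = 13$ at $k = 3$.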
We will adopt the convention that $A_1$ is the smaller of the two roots of (1)
for $k = 1$, and note that this implies:
(8)  $A_1 < \tfrac{1}{2}p < B_1$,
but it does not imply that $A_k < B_k$ if $k > 1$. For example, let $p = 5$. Then $A_1 = 2$.
Thus $h = 1$, and for $k = 1, 2, 3, \ldots$ we compute:
$A_k = 2, 7, 57, 182, 2057, 14557, \ldots$
$B_k = 3, 18, 68, 443, 1068, 1068, \ldots$
and note, $A_6 > B_6$.
We also have in this case the interesting degeneracy $B_5 = B_6$. While similarly,
for $p = 13$, we have the degeneracy $A_3 = A_4$, it can be seen from (1), and it is
important for the validity of the sieve method, that for every p:
(9)
$(A_1^2 + 1)/p \le \tfrac{1}{2}A_1$,
and therefore cannot contain a second prime which (again by (8)) must be $> 2A_1$.
This completes the proof, and shows we are obtaining complete factorizations
without the use of any trial divisions and without the need to know in advance
the primes of the form 4m + 1. They are, in fact, being generated (though not
in numerical order) by the process itself.
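A minimal sketch of such a sieve, reconstructed from the Theorem rather than from the 704 program, is given below in Python. It assumes the cells are visited in increasing n and that, by the argument above, whatever remains in cell n on arrival is either 1 or a new prime p with $A_1 = n$. The names `sieve_n2_plus_1` and `divide_out` are hypothetical.

```python
def sieve_n2_plus_1(N):
    """Factor n^2 + 1 for 1 <= n <= N; factors[n] lists the prime factors with multiplicity."""
    cell = [0] + [n * n + 1 for n in range(1, N + 1)]
    factors = [[] for _ in range(N + 1)]

    def divide_out(p, start, step):
        # Divide p once out of every cell in the linear form start + m*step.
        for i in range(start, N + 1, step):
            cell[i] //= p
            factors[i].append(p)

    divide_out(2, 1, 2)                      # relation (7): 2 divides n^2 + 1 for odd n
    limit = N * N + 1                        # no prime power beyond this can divide a cell
    for n in range(1, N + 1):
        p = cell[n]
        if p == 1:
            continue                         # cell already completely factored
        a, pk = n, p                         # remaining value is a new prime with A_1 = n
        h = ((p + 1) // 2) * a % p           # relation (3)
        while pk <= limit:
            divide_out(p, a, pk)             # linear form A_k + m p^k
            divide_out(p, pk - a, pk)        # linear form B_k + m p^k
            c = h * ((a * a + 1) // pk) % p  # relation (4)
            a += c * pk                      # relation (5)
            pk *= p
    return factors


factors = sieve_n2_plus_1(300)
print(factors[239])                          # 239^2 + 1 = 2 * 13^4
```

No trial division appears: every division is along one of the linear forms (6), and the primes of the form $4m + 1$ are discovered as leftover cell contents. The sketch stores positive values and recognizes a prime $n^2 + 1$ by `factors[n] == [n * n + 1]`, which is a simplification chosen here for clarity.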
3. The Program. In the 704 program mentioned above the following changes
were desirable or expedient.
a.) The sieve was started with $-(n^2 + 1)$ in the n'th cell and the absolute
value of each quotient was stored back in. If no division occurred in a particular
cell the contents of this cell remained negative and this indicated that $n^2 + 1$ was
a prime.
4. The Primes. Now consider $P(N)$ in the statistical table. We see a steady
growth which is nearly proportional to $\pi_-(N)$. Since
$\pi_-(N) \sim \tfrac{1}{2}\int_2^N \frac{dx}{\ln x}$
by the prime number theorem, the numbers $P(N)$ are in excellent agreement with
the conjectured formula [3] of Hardy and Littlewood:
(10)  $P(N) \sim 0.68641 \int_2^N \frac{dx}{\ln x}$.
The constant in (10) is equal to the infinite product, taken over all odd primes,
$\tfrac{1}{2}\prod_{p}\left[1 - \left(\frac{-1}{p}\right)\frac{1}{p - 1}\right]$,
where $(-1/p)$ is the Legendre symbol. It was computed by A. E. Western [4]
who also verified the substantial correctness of (10) to N = 15,000. Since the
infinite product converges too slowly, Western used a transformation of the product
due to Littlewood. The following related formula is simpler:
(10b) l-|Mk.&.(,+irh)I
where $L(4) = \sum_{n=0}^{\infty} (-1)^n (2n + 1)^{-4}$. With this improvement, the first two p's,
i.e., 5 and 13, already suffice to yield the five decimal places shown.
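The slow convergence of the untransformed product can be seen numerically. The Python sketch below multiplies the factors $1 - (-1/p)/(p - 1)$ over the odd primes up to a bound and includes the factor $\tfrac{1}{2}$; it uses the standard fact that $(-1/p) = +1$ for $p \equiv 1 \pmod 4$ and $-1$ for $p \equiv 3 \pmod 4$. The function names are hypothetical, and this is the raw product only, not the accelerated formula (10b).

```python
def odd_primes_upto(x):
    """Odd primes <= x by the sieve of Eratosthenes."""
    flags = bytearray([1]) * (x + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(x ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [p for p in range(3, x + 1) if flags[p]]


def truncated_constant(x):
    """One half of the product of 1 - (-1/p)/(p - 1) over odd primes p <= x."""
    value = 0.5
    for p in odd_primes_upto(x):
        legendre = 1 if p % 4 == 1 else -1   # Legendre symbol (-1/p)
        value *= 1.0 - legendre / (p - 1)
    return value


for bound in (10, 100, 1000, 10000, 100000):
    print(bound, truncated_constant(bound))
```

The printed values approach 0.68641 only gradually and oscillate with the irregular distribution of primes between the two residue classes, which is why a transformed product is preferable in practice.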
Among the Gaussian integers, $a + bi$, the Gaussian primes on the positive real
axis are those of the form $4m - 1$, and thus are those counted by $\pi_-(N)$. On the
other hand, $n + i$ is a Gaussian prime if and only if $n^2 + 1$ is a rational prime.
Therefore, the column headed "$P(N)/\pi_-(N)$" shows that the $+1$ horizontal
line in the Gauss plane (and therefore also the $-1$) has about a 37 percent greater
density of primes than the axis has. (For an attractive picture of these primes see
van der Pol's Tea Cloth, [5].)
It is of interest to mention briefly the "Gaussian twin" primes, i.e., those where
$(n - 1)^2 + 1$ and $(n + 1)^2 + 1$ are both prime. They are not at all rare. In the
last block of 1000 numbers with $N = 184,500$, we find no fewer than 9 pairs, namely:
$n$ = 183585; 183635; 183685; 184055; 184075; 184145; 184185; 184325; and 184495.
The last pair, 34038036037 and 34038774017, were the largest primes obtained
by this program. The conjecture suggests itself that there are infinitely many
such "twins."
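The definition of a "Gaussian twin" can be checked directly for small n. The sketch below is a plain trial-division test in Python, used here only for illustration (the sieve itself needs no trial division); the function names are hypothetical.

```python
from math import isqrt


def is_prime(m):
    """Primality by trial division; adequate for the sizes discussed here."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    for d in range(3, isqrt(m) + 1, 2):
        if m % d == 0:
            return False
    return True


def is_gaussian_twin(n):
    """True when (n - 1)^2 + 1 and (n + 1)^2 + 1 are both prime."""
    return is_prime((n - 1) ** 2 + 1) and is_prime((n + 1) ** 2 + 1)


print([n for n in range(2, 31) if is_gaussian_twin(n)])   # [3, 5, 15, 25]
```

The same function can be applied to the nine values of n listed above, at the cost of trial divisions up to about 184,500 for each candidate.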
It may also be mentioned that (although they were not particularly sought)
the program yielded a very large collection of "large" primes, i.e., those over 10
digits. Not only are 4830 of the $n^2 + 1$ "large" primes, but many others are equal
to twice a "large" prime. For example, $184499^2 + 1 = 2 \cdot 17019940501$.
If no such linear combination is possible, the number is irreducible [6]. For example,
since
$\tan^{-1} 3 = 3\tan^{-1} 1 - \tan^{-1} 2$,
$\tfrac{1}{2}\int_2^N \frac{dx}{\ln x}$.
Now in fact we have seen that $P(N)$ remains persistently and significantly higher.
And this greater prevalence of primes is consistent with the smaller fraction of
reducibles (.293 instead of .307) which is observed. It is, of course, not precluded
that $\delta_R(N)$ will rise to .307 for very much larger $N$. Its slow growth, mentioned
above, seems to be of order of $1/\ln N$ and to be associated with the falling off in
the mean local density of the excess primes, $P(N) - \pi_-(N)$. This (average)
increase of $\delta_R(N)$ hardly shows in the Table since $\ln N$ changes so slowly and the
fluctuations are almost of the same size. Nonetheless it is there and may carry
$\delta_R(N)$ nearly up to 0.307.
The gross facts on which any attempted assessment of the "good sample"
concept must be based are these. The $n^2 + 1$ numbers have only about $\tfrac{1}{2}$ of all
primes, [9], as possible factors, but in "compensation" these factors occur twice
as often (see (6) above). While this may suggest a factorability of the same order
of magnitude, certainly no more exact equivalence is implied.
Assuming the future establishment of $R(N) \sim \delta_R \cdot N$, the (deeper?) question
of $O(R(N) - \delta_R \cdot N)$ will arise. Concerning this question, that of the uniformity
of the distribution of the reducibles, the following table and figure are informative.
Each of the 1800 intervals of 100 numbers:
$100m < n \le 100(m + 1)$  $(m = 0, 1, \ldots, 1799)$
has at least 17 reducible numbers and at most 42. The 52,837 reducibles $\le 180,000$
are distributed as follows:
r, no. of reducibles    17   18   19   20   21   22   23   24   25   26   27   28   29
ν, no. of intervals      5    6    5   13   22   29   69   82   93  127  154  147  167

r, no. of reducibles    30   31   32   33   34   35   36   37   38   39   40   41   42
ν, no. of intervals    179  140  147  130   93   72   41   31   16   16   11    1    4
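The table can be checked against the totals quoted in the text. The short Python sketch below enters the tabulated counts by hand (the dictionary name `intervals_by_count` is hypothetical) and sums them.

```python
# r (reducibles in an interval of 100) -> number of intervals with that r
intervals_by_count = {
    17: 5, 18: 6, 19: 5, 20: 13, 21: 22, 22: 29, 23: 69, 24: 82, 25: 93,
    26: 127, 27: 154, 28: 147, 29: 167, 30: 179, 31: 140, 32: 147, 33: 130,
    34: 93, 35: 72, 36: 41, 37: 31, 38: 16, 39: 16, 40: 11, 41: 1, 42: 4,
}

total_intervals = sum(intervals_by_count.values())
total_reducibles = sum(r * v for r, v in intervals_by_count.items())
mean_density = total_reducibles / 180000

print(total_intervals)            # 1800
print(total_reducibles)           # 52837
print(round(mean_density, 4))     # 0.2935
```

The counts account for all 1800 intervals and all 52,837 reducibles, and the resulting mean density agrees with the value .293 quoted earlier.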
In the Figure a bar graph of this distribution is compared with a binomial distribution,
$\nu(r)$:
$\nu(r) = 1800\binom{100}{r}(0.2935)^r(0.7065)^{100 - r}$.
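The binomial comparison can be sketched as follows, taking the parameters from the annotation of Fig. 1: 1800 intervals, 100 numbers per interval, and density 0.2935. The function name `expected_intervals` and its keyword arguments are hypothetical.

```python
from math import comb


def expected_intervals(r, trials=100, density=0.2935, intervals=1800):
    """Expected number of intervals holding exactly r reducibles under a binomial model."""
    return intervals * comb(trials, r) * density ** r * (1 - density) ** (trials - r)


for r in range(17, 43):
    print(r, round(expected_intervals(r), 1))
```

Under this model the most probable count is r = 29, whereas the observed maximum in the table falls at r = 30; the comparison with the bar graph shows how closely the observed counts follow the binomial shape elsewhere.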
7. P-adic Numbers and Degeneracy. The sequence of the partial sums of the
infinite series:
(15)  $A_1 + \sum_{k=1}^{\infty} C_k p^k$,
where p is a prime of the form $4m + 1$ and $A_1$ and the $C_k$ are determined by (1)
through (4), is a convergent sequence in the p-adic sense. (See [10] for an elementary
account of p-adic valuation.) Since it converges, it represents a p-adic number,
namely, one of the two values of $\sqrt{-1}$. This sequence, it is seen from (5), is the
sequence $A_k$. The other sequence, $B_k$, converges in the p-adic sense to the other
$\sqrt{-1}$ and their sum, $A_k + B_k$, converges in the p-adic sense to 0.
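The p-adic convergence can be made visible by writing the $A_k$ and $B_k$ in base p. The Python sketch below does this for $p = 5$, $A_1 = 2$, recomputing the sequences from relations (2) through (5); the helper names are hypothetical.

```python
def base_digits(n, base, width):
    """Digits of n in the given base, most significant first, padded to width places."""
    out = []
    for _ in range(width):
        out.append(str(n % base))
        n //= base
    return "".join(reversed(out))


def partial_sums(p, a1, kmax):
    """Pairs (A_k, B_k) for k = 1..kmax from relations (2) through (5)."""
    h = ((p + 1) // 2) * a1 % p
    pairs, a, pk = [], a1, p
    for _ in range(kmax):
        pairs.append((a, pk - a))
        a += (h * ((a * a + 1) // pk) % p) * pk
        pk *= p
    return pairs


for k, (a, b) in enumerate(partial_sums(5, 2, 6), start=1):
    print(k, base_digits(a, 5, k), base_digits(b, 5, k))
```

Each $A_k$ repeats the digits of $A_{k-1}$ and adds one more place on the left, ending with 431212 at $k = 6$; the $B_k$ digits are the complements, ending with 013233.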
As an example, take $p = 5$, and $A_k = 2, 7, 57, 182$, etc., as above. If we write
them in the quinary system we see that they represent the sequence obtained
from the 5-adic number
[Fig. 1. Distribution of the reducible numbers between 1 and 180,000 into the 1800 intervals of 100, compared with the binomial curve $\nu = 1800\binom{100}{r}(0.2935)^r(0.7065)^{100 - r}$.]
by starting at the quinary point and taking more and more places to the left. If
we take k places and carry the computation to at least k places we find that
$A_k^2 + 1 = \ldots 000\cdots00.$  (k zeroes)
so that $A_k^2 + 1$ is divisible by $p^k$ and $A_k$ is an approximation to $\sqrt{-1}$ correct
(p-adic sense) to k places. Similarly the $B_k$ sequence gives the complement:
8. The Difficulty of the Unsettled Questions. It has often been remarked that
questions like (13) are "very difficult." The intent of this section is to assess this
difficulty. We do this by comparing the very simple Eratosthenes sieve for the
ordinary prime sequence with the present one for $n^2 + 1$ and note that the latter
is more complicated in the following three ways:
1.) Instead of one linear form, $mp$, for each prime, we have a double infinity:
$A_k + mp^k$, $B_k + mp^k$.
2.) Instead of a zero origin we have $A_k$ and $B_k$ origins, which are not related to
p in any simple fashion. While $A_2, A_3, \ldots$ and $B_1, B_2, \ldots$ can be computed by
the more or less complicated relations (2) through (6), $A_1$ can arise at any n
satisfying $\sqrt{p - 1} \le n \le (p - 1)/2$. Further, we have the complications of
occasional degeneracy.
3.) Finally, whereas in the Eratosthenes sieve it is not necessary to divide, but
it suffices to scratch the cells in the linear form, here we must divide the p out.
Otherwise, we would not obtain the new prime hidden in each $A_1^2 + 1$ which is
not itself prime.
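The irregular position of the origin $A_1$ mentioned in item 2.) can be illustrated by brute force for a few small primes of the form $4m + 1$. The Python sketch below is only a check of the stated bounds; `smallest_root` is a hypothetical helper and plays no part in the sieve, where $A_1$ is found without any search.

```python
def smallest_root(p):
    """Smallest positive n with n^2 + 1 divisible by p (p a prime of the form 4m + 1)."""
    for n in range(1, p):
        if (n * n + 1) % p == 0:
            return n
    raise ValueError("no root: p is not a prime of the form 4m + 1")


for p in (5, 13, 17, 29, 37, 41):
    a1 = smallest_root(p)
    # sqrt(p - 1) <= A_1 is the same as A_1^2 + 1 >= p; A_1 <= (p - 1)/2 follows from (8)
    print(p, a1, a1 * a1 + 1 >= p, 2 * a1 <= p - 1)
```

For these primes $A_1$ is 2, 5, 4, 12, 6, 9, respectively, which shows that the origin can lie at either end of the permitted range: at the lower bound for $p = 17$ and $p = 37$, and at the upper bound for $p = 5$.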