Now that we have nn_mullow_classical, we can add nn_div_hensel. As explained,
this will take an integer a of length n and divide by an integer d of length
m modulo B^n, returning a quotient q and an "overflow".
The overflow will be two words which agree with the overflow from mullow(q*d).
As per the euclidean division, the dividend a will be destroyed.
The algorithm is somewhat simpler than the euclidean algorithm. If d1 is the
least significant word of d then we use an inverse mod B of d1 (dinv say) and
multiply it by the current word of the dividend being considered (working from
right to left) to get a quotient word q1.
We then subtract q1*d (appropriately shifted) from the dividend. There is no
adjustment to do as the inverse mod B is unique (so long as d is odd).
Any borrows and overflows from the subtractions are accumulated in the two
overflow and returned.
In our test code, we check a few things. Firstly, for an exact division, we
want that the quotient is really the exact quotient of a by d. As the
quotient returned is m words, which may be larger than the actual quotient,
we check that any additional words of q are zero. We do this by normalising q.
The second test we do is for an inexact division. We check that the the
overflow words turn out to be the same as the overflow from mullow(q*d).
Note that if we wish to make Hensel division efficient for doing an exact
division, say of a 2m - 1 by m division, we merely pass m words of the
dividend in instead of all 2m - 1 words, so that the quotient is also m words.
Once again we don't allow an overflow-in to Hensel division. This wouldn't
give us any kind of chaining property anyway.
Instead, we'd have to do a mulhigh(q, d) and subtract that from the high
part of the chain before continuing, and the mulhigh will accept our overflow
words from the low Hensel division.
In fact, we add a chaining test that does precisely this. We do a Hensel
division on the low n words of a chain, subtract a mulhigh from the high
m words of the chain, then compute the high m words of the quotient using
a second Hensel division.
To make our test code work, we add an ODD flag to randoms_of_len so that only
odd divisors are used with Hensel division.
It isn't clear if it makes sense to allow a carry-in at the top of div_hensel
or not. On the one hand, it might seem logical to allow a carry-in on account
of the way mul_classical works. On the other hand, divrem_hensel1 took a
carry-in, but at the bottom. This was for chaining rather than a read-in of
an extra word. We choose the latter convention, as it seems to make more
sense here and stops the code from becoming horribly complex.
The code for today's post is here: v0.17
Previous article: v0.16 - mullow_classical
Next article: v0.18 - printx_word, nn_printx