Today's post is a discussion only, without accompanying code. This is because
the topic is divrem using the "school boy" algorithm, and this is not as
straightforward as one might imagine. The discussion below is informal, and
may contain errors. Please let me know if so and I can make some adjustments
before releasing the code for the next article, where we implement this.
As a consolation, I have released a failed attempt at using C89 macros to make
even more simplifications to our test code. In particular, I tried to make the
chain tests more "automatic". Unfortunately, this didn't work out. It ended up
making the test code too complicated to use and it was going to be way too much
work to convert all of it over to the new test system.
This test code "simplification" was completely abandoned and will not be merged.
But if you are curious you can see it here: gentest
In particular, take a look at generic.c/h and the test code in t-nn.c.
Anyhow, back to today's discussion:
Firstly, what exactly do we mean by the school boy division algorithm?
Certainly I leaned how to do 39 divide 7 at school, and even 339 divide 7.
But how about 10615 divide 1769?
The first step is to divide 1769 into 1061. It doesn't go, so we grab another
digit. So it becomes 1769 into 10615. I have no idea how many times it goes!
Either I missed something at school when I studied "long division", or I only
think I learned an algorithm suitable for problems like this.
In fact, the only way I know to do this division is trial and error, i.e.
to go through all the likely quotient candidates from one to nine.
Using a calculator I find it goes 6 times.
I think I intuitively know how to proceed from here. We place the digit 6
into our quotient, multiply 1769 by 6 and subtract to form a remainder.
But now we have a problem. What if we are dividing 17699949 into 106150000.
Does it still go 6 times? In fact it does not. But how would I know this
without being able to divide 17699949 into 106150000 in the first place!
So I have two problems. Firstly, if I simply truncate the numerator and
denominator to a certain number of digits, even the first digit of my
quotient may be wrong, and secondly, if I use more digits in the first place
my intiuitive "algorithm" doesn't help. It basically says, guess and refine.
That's all very well, but when my digits are 64 bit words, I may not be so
lucky with the guessing thing.
Now an interesting thing to note in the example above is that no matter how
many digits we add to the dividend, 1769 will always go into the first 5
digits 10615 of the dividend, 6 times (with some remainder), no more, no
To see this, we note that 6*1769 is less than 10615 and 7*1765 is greater.
So the only way we could get 1765 to go in 7 times would be to increase
those first five digits of the dividend. Any following digits are irrelevant
to that first digit, 6, of the quotient.
This is a general feature of integer division, and also applies for the
word based (base B) integer division problem.
However, adding digits to the divisor is not the same, as the example above
shows. In fact 17699949 only goes 5 times into 106150000.
Now the question is, how far off can it be if I truncate the dividend and
divisor? How bad can it get?
We can answer this as follows. Note that 17699949 < 17700000. So 10615xxxx
divided by 1769yyyy is greater than or equal to 10615 divided by 1770, which
happens to be 5. So the answer has to be either 5 or 6 times.
What I have done here is simply add 1 to the first four digits of the
divisor, to get a lower bound on the first digit of the quotient. In this
way, I can get a small range of possible values for the first digit of the
quotient. Then I can simply search through the possibilities to find the
correct quotient digit. This matches my intuition of the long division
algorithm at least.
It seems that *perhaps* a quotient digit obtained in this way will be at
most out by 1. In other words, roughly speaking, if X is the first few digits
of the dividend and Y the appropriate number of digits of the divisor (in the
example above, X = 10615 and Y = 1769), then perhaps X / Y >= X / (Y + 1)
>= (X / Y) - 1, where by division here, I mean integer division. Let's call
this condition R.
Let's suppose now that we are working on integers written as binary words
instead of decimal digits. Let's also suppose that X / Y < B. Let's call
this condition S.
Well, it is easy to generate a counterexample to condition R. Let Y = 2,
X = B. Clearly R is not satisfied, even though S is.
But this seems to only be a problem because Y is so small. So, what if we
constrain Y to be at least B/2?
It turns out that condition R still doesn't hold. A counterexample is
Y=B/2, X = B*Y - 2.
However, I claim that X / (Y + 1) >= (X / Y) - 2 does hold, so long as
condition S holds. Let us call this inequality T.
Clearly there does not exist a counterexample to T with X <= Y.
Let us suppose that for a certain Y >= B/2, there is a counterexample to T.
Clearly if X0 is the first such counterexample, then X0 is a multiple of Y,
but not of Y + 1. Nor is X0 - 1 a multiple of Y + 1 (else X0 - 2 would have
been a counterexample).
It is also clear that every multiple of Y from X0 on is a counterexample, as
the left hand side can never "catch up" again.
Thus we have the following result: if there exists a counterexample to T for
some Y >= B/2 and X < BY, then X = (B - 1)*Y is a counterexample.
Substituting this value of X into T yields that T holds if and only if
(B - 1)*Y / (Y + 1) >= B - 3 for all Y >= B/2. Thus, if we can show that this
last inequality holds for all Y >= B/2 then we have proven T.
Note that this inequality is equivalent to (B - 1)*Y >= (Y + 1)*(B - 3), i.e.
2*Y >= B - 3. However, as Y >= B/2, this holds under our hypotheses.
So putting all of the above together, suppose we are dividing A by D and that
Y >= B/2 is the leading word of the divisor D and B*Y > X >= Y is the first
word or two words of the numerator A (whichever is appropriate). Then if Q0
is the leading word of the quotient Q = A / D, then we have shown that
X / Y >= Q0 >= (X / Y) - 2.
In other words, X / Y can be no more than 2 away from the first word of the
quotient Q = A / D that we are after.
This leads us immediately to an algorithm for computing a quotient of two
(i) Let Y be the first WORD_BITS bits of the divisor D, so that
B > Y >= B/2.
(ii) Let X be the first word or two words (appropriately shifted as per Y) of
A such that B*Y > X >= Y.
(iii) Let Q0 = X/Y.
(iv) Set R = A - D*Q0 * B^m (for the appropriate m), to get a "remainder".
While R < 0, subtract 1 from Q0 and add D * B^m to R.
(v) Write down Q0 as the first word of the quotient A / D and continue on by
replacing A by R and returning to step (i) to compute the next word of the
Another algorithm can be derived as follows. Instead of using Y >= B/2 in the
above algorithm, let's choose Y >= B^2/2. A similar argument to the above
shows that X / (Y + 1) >= (X / Y) - 1 for Y >= B^2/2 and B*Y > X >= Y. It boils
down to showing that (B - 1)*Y / (Y + 1) >= B - 2 for Y >= B^2/2, i.e. that
Y >= B - 2, which is clearly satisfied under our hypotheses.
The algorithm is precisely the same, but at step (iv) we can replace the while
statement with an if statement and perform at most one adjustment to our
quotient Q0 and to R.
Now we return to step (v), in which we said that we could just continue
on from step (i) to compute the next word of the quotient Q. If we do this
and set A = R then compute the new X, what we would like is something like
the divrem1 algorithm we implemented, where, (after possibly some kind of
initial iteration that we handled specially), it is always true that the new
X is two words and has its first word reduced mod Y.
However, this does not follow from 0 <= R < D*B^m, and it may be that the
first word of the remainder is equal to Y! This is due to the truncation of
R to get Y.
It is clear from 0 <= R < D*B^m that if X is the first two words of the
remainder that X <= B*Y. So to make the algorithm continue in the way we'd
like, we only have to deal with the special case where X = B*Y.
We know that in the first algorithm above, the next word of the quotient may
be B - 1 or B - 2, since we know already that it is not B. We must multiply
out and check which it is.
In the second algorithm above, where Y >= B^2/2, the only possibility for
the next quotient word is B - 1, as we know it is not B.
In the next article we will look at implementing the first of the two
algorithms above, leveraging the code we already have for computing and using
a precomputed inverse.
Previous article: v0.13 - muladd1, muladd
Next article: v0.14 - divrem_classical