It's time for our first division function. This will again be a linear function

in that it divides a multiprecision integer by a single word and returns the

(single word) remainder.

This is the first time we can do substantially better than a simple for loop

(other than when we coded nn_cmp and other functions, which allowed early exit).

To get us going, we'll write a really dumb division function first. It'll just

use C's "/" and "%" operators in a for loop.

C is supposed to optimise simultaneous occurrences of / and % together in the

same piece of code to use a single processor instruction where available.

However, even with that optimisation, our function will be exceedingly slow.

We'll see why that is when we implement a not-stupid divrem1, next time around.

To emphasise its stupidity, we'll call our exceedingly dumb division function

nn_divrem1_simple. Perhaps we can use it in our test code to compare against

when we write the more sophisticated version.

Hopefully our naming conventions won't be defeated by this function, which

proceeds left to right (or most significant word first).

One doesn't usually need to write the remainder down immediately following

the quotient (though one could imagine a function operating that way).

However, it is convenient to be able to accept a "remainder-in", essentially

thought of as an extra limb of the dividend, so that divrem can be chained.

In this way, divrem1 is most similar to right shift in semantics (the latter

is just division by a power of two, after all).

Now, if we blindly try to implement this function, we quickly discover a

problem. Even if we work with double words, a division of a two limb quantity

by a single limb can be problematic.

We can illustrate this using decimal digits, instead of machine words. Suppose

we divided 94 by 7. The resulting quotient is 13 with remainder 3 (I hope!).

The point is, the quotient takes up two "words" 1 and 3, not just a single

word.

We can get around it by reducing the top word first. We have 9 divide 7 is 1

remainder 2. Thus the first word of our quotient is 1. Now we are left with

24 divide 7. This time the quotient is 3 with remainder 3 and everything is

fine. Note the remainder, 3, is less than 7.

The critical thing to note is that once we get things started, everything

works fine from then on. The top word in our double words are always reduced

modulo the divisor after the first iteration.

So the almost trivial algorithm almost works, except for the lead-in, where

we have to get things right. After that, we can divide our dividend word on

word until we reach the bottom, where we'll have a single limb remainder.

Now as mentioned, it is also convenient to supply a "carry-in". This acts like

an additional limb of the dividend. Assuming this carry-in was reduced mod

the divisor, we find that we end up with as many words in our quotient as we

had in our original dividend.

Clearly however, if the carry-in is not reduced, we will end up with even one

more quotient limb! However, there is an easy way to avoid this problem.

We simply decide to restrict the carry-in to be reduced modulo the divisor!

This restriction is not a problematic restriction because we will still be

able to chain our division functions together with this restriction (so long

as we use the same divisor throughout).

Anyhow, with this restriction, the quotient will have precisely the same number

of limbs as the dividend, and since we'll start with the carry-in, not the top

limb of the dividend, and that carry-in will already be reduced, there is no

bootstrapping iteration required, i.e. the problem of a non-reduced first limb

as explained above, simply doesn't exist.

In this way we avoid any complex feed-in and wind-down code, and as for all

the functions so far, the length m can be zero. The result is satisfyingly

symmetric, even if it is terribly slow.

In the test code, if q = a / c with remainder r, we can check that a = q*c + r.

We can use the functions we have already implemented, including our mul1_c code

to verify this.

We also check that passing in a reduced carry-in results in a correct result,

by chaining two divisions together.

This will do for today, as a more serious implementation of divrem1 will

require much more serious code. We'll take a look at that next time.

The github branch here v0.7 is for this post.

Previous article: bsdnt v0.6 - addmul1, submul1

Next article: bsdnt assembly language

## No comments:

## Post a Comment