I likely won't get much more done on BSDNT until early next year, but here is what I am planning:
1) I am tremendously grateful to Dr. Brian Gladman for his work on an MSVC version of the library. However, I started to struggle to keep up with this side of things more than I thought. Microsoft's MSVC doesn't support inline assembler in 64 bit x86. This means the entire plan of the MSVC version has to be different. It seems like far too much effort to combine both sets of code into a single library. I've therefore decided (sorry Brian) to ditch the MSVC code from my copy of BSDNT.
2) I wasn't happy with the interface of the division code. There are a few issues to consider.
The first issue is chaining. Obviously carry-in occurs at the left rather than the right. But for general division functions should the carry-in be a single limb or multiple limbs. It seems like the remainder after division is going to be m limbs and so the carry-in should be also. It is not clear what is better here. Internally, the algorithms deal with just a single carry-in limb because they use 2x1 divisions to compute the quotient digits. Perhaps chaining just means that we consider the first digit of the remainder to be carry-in for the next division.
Another problem associated with this is that when reading the carry-in from the array, if the carry-in happens to be zero then the array entry may not exist in memory. This means the code has to always check if the carry-in should be zero or not before proceeding.
The other issue to consider is shifting for normalisation. One assumes that the precomputed inverse is computed from a normalised value (the first couple of limbs of the divisor). Now, it is not necessary to shift either dividend or divisor ahead of time. One can still perform the subtractions that occur in division, on unshifted values. One does need to shift limbs of the dividend in turn however, as the algorithm proceeds, in order to compute the limbs of the quotient. But this shifting can occur in registers and need not be written out anywhere. This is implemented, but currently every limb gets shifted twice. This can be cut down to a single shift.
3) The PRNGs are currently quite hard to read. They have numerous macros to access their context objects. They are extremely flexible, but possibly overengineered. I'd like to simplify their implementations somewhat.
4) The configure script is a little overengineered. The idea of supporting lots of compilers is nice. But in reality GCC should exist almost everywhere. The original concept of BSDNT was to use inline assembly for architecture support. This gets around issues with global symbol prefixes and wotnot. It also makes the library really simple to read. Even on Windows 64 there is MinGW64 and this is the only setup I aim to target in that direction.
I hope to deal with all of these issues before proceeding with development of BSDNT. Give me some time as I am busy until about the end of the year. However, I do plan to continue development of BSDNT after sorting out these issues, because I think that fundamentally what we have is very solid.
Previous article: v0.24 = nn_bitset/clear/test and nn_test_random