Date: Thu, 8 Aug 1996 20:01:02 -0700
To: tghack-list@cpac.washington.edu
From: daves@interlog.com (David Shadoff)
X-Software: MLF v2.3, Copyright 1995, 1996 by Bt
X-Original-Id: <199608090253.WAA24696@smtp.interlog.com>
Subject: More Math Subroutines !

Hi Gang...

Today, we have several more math subroutines documented from
the CD system card...

$E0C0 = 8-bit Signed Multiply
$E0C6 = 16-bit Signed Divide
$E0CC = Square Root
$E0CF = sin()
$E0D2 = cos()


Multiply 8-bit (signed):
- see previous posts this week; same inputs/outputs used, but are interpreted
  as signed values, rather than unsigned
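
For anyone who wants to check results by hand, here is a rough Python model of
the signed interpretation.  It is only a sketch: the function names are mine,
and it assumes the 16-bit result in $FC/$FD is ordinary two's complement,
which I have not confirmed against the card's code.

```python
def to_signed8(byte):
    """Interpret a byte (0-255) as a two's-complement value (-128..127)."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def mul8s(f8, fa):
    """Model of the signed 8-bit multiply: returns ($FC, $FD) = (LSB, MSB).

    Assumption: the product is stored as a 16-bit two's-complement number.
    """
    product = (to_signed8(f8) * to_signed8(fa)) & 0xFFFF
    return product & 0xFF, product >> 8


# $FE is -2 when read as signed, so -2 * 3 = -6 = $FFFA
print(mul8s(0xFE, 0x03))   # (250, 255), i.e. $FC = $FA, $FD = $FF
```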

Divide 16-bit (signed):
- see previous posts this week; same inputs, except division-by-0 does not
  appear to be checked; same outputs, except that the signs are changed
  appropriately


Square Root:

Method of operation: This is a very interesting algorithm, which shifts the
                     input number by two bits for every one-bit shift of the
                     'work number'.  A compare/subtract is performed at each
                     bit shift position.  It took me a long time to figure
                     out what the actual function of this routine was.

Input values:        $F8/$F9, a 16-bit integer
Output values:       $FC, an 8-bit integer
Registers Modified:  A, Y
Flags set:           None
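
The description above matches the usual bit-by-bit square root, which can be
modelled in Python roughly as follows.  This is a sketch of the general
technique, not a transcription of the card's code; the name isqrt16 is mine,
and I am assuming the result is rounded down.

```python
def isqrt16(n):
    """Bit-by-bit square root of a 16-bit value, giving an 8-bit result.

    Assumption: the result is the floor of the true square root.
    """
    n &= 0xFFFF
    remainder = 0
    root = 0                      # the 'work number'
    for _ in range(8):            # one pass per result bit
        # Two shifts of the input: bring its top two bits into the remainder.
        remainder = (remainder << 2) | ((n >> 14) & 0x03)
        n = (n << 2) & 0xFFFF
        # One shift of the work number.
        root <<= 1
        # Compare/subtract: can the next result bit be a 1?
        trial = (root << 1) | 1
        if remainder >= trial:
            remainder -= trial
            root |= 1
    return root


print(isqrt16(10000))   # 100
```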


SIN:

Method of operation: Simple table lookup.

Input values:        Register A = degrees of arc
                     (range = 0-90 degrees)
Output values:       Register A = 256 * (sine of angle), represented as an
                     unsigned integer value (e.g. sin(86) = $FF)
Registers modified:  A, X
Flags set:           Carry set if:
                       (1) input value out of range (i.e. < 0 or > 90), or
                       (2) output value out of range
                           (e.g. 256*sin(87) = 255.649 -> 256, which is
                              out of range)

COS:

Method of operation: Simple table lookup.

Input values:        Register A = degrees of arc
                     (range = 0-90 degrees)
Output values:       Register A = 256 * (cosine of angle), represented as an
                     unsigned integer value (e.g. cos(4) = $FF)
Registers modified:  A, X
Flags set:           Carry set if:
                       (1) input value out of range (i.e. < 0 or > 90), or
                       (2) output value out of range
                           (e.g. 256*cos(3) = 255.649 -> 256, which is
                              out of range)
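
To make the carry behaviour of both lookups concrete, here is a small Python
model.  It is a sketch under assumptions: the table is rebuilt by rounding
256*sin(angle) to the nearest integer (which agrees with the examples above),
the function names are mine, and since I have not documented what Register A
holds when carry is set, the model returns None for the value in that case.

```python
import math

# Assumed table contents: 256 * sin(angle), rounded, for 0..90 degrees.
SIN_TABLE = [round(256 * math.sin(math.radians(d))) for d in range(91)]


def sin_lookup(angle):
    """Return (value, carry) for an angle in degrees."""
    if angle < 0 or angle > 90:
        return None, True          # input out of range
    value = SIN_TABLE[angle]
    if value > 0xFF:
        return None, True          # 256 does not fit in one byte
    return value, False


def cos_lookup(angle):
    """cos(x) = sin(90 - x), so the same table serves both functions."""
    if angle < 0 or angle > 90:
        return None, True
    return sin_lookup(90 - angle)


print(sin_lookup(86))   # (255, False)
print(sin_lookup(87))   # (None, True)
print(cos_lookup(4))    # (255, False)
```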

-- Dave


----------------------------------------------------------------------------


Date: Tue, 6 Aug 1996 18:38:12 -0700
To: tghack-list@cpac.washington.edu
From: daves@interlog.com (David Shadoff)
X-Software: MLF v2.3, Copyright 1995, 1996 by Bt
X-Original-Id: <199608070131.VAA15401@smtp.interlog.com>
Subject: More CD system internals

Hi Gang...

More musings on the internals of the CD system card.  Today we have a
couple of math routines, and a page-swapping technique to discuss.

As we previously noted, the first (roughly) $100 bytes of the CD card
are a jump table to an API of commonly-used routines.

We already knew that $E009 was a 'read data sectors from disk' function.

Today, I will show how to use the following routines:

Address   Name used by develo   Function
$E0BD     'ma_mul8u'            Multiply 8-bit by 8-bit, unsigned
$E0C3     'ma_mul16u'           Multiply 16-bit by 16-bit, unsigned
$E0C9     'ma_div16u'           Divide 16-bit into 16-bit, unsigned

First of all, it is interesting to note that (at least in my 'reference'
version of the CD card) the jump-table entries point to small wrapper
subroutines of the following form:

         BSR   SWAPOUT
         JSR   FUNCTION
         BRA   SWAPIN

The first and last subroutines 'swap out' and 'swap in' an MMR register,
so that a different code page can be accessed.  The overall function ends
with a branch (BRA) to SWAPIN rather than a subroutine call, so the RTS at
the end of SWAPIN returns directly to the original caller.

Here is a sample of swap-out/swap-in code:

SWAPOUT:
         PHA                 ; preserve accumulator
         TMA   #$40          ; obtain original MMR #6 ($C000-$DFFF)
         STA   TEMPMMR
         LDA   #ALT_PAGE
         TAM   #$40          ; replace MMR with new value
         PLA                 ; restore accumulator
         RTS 

SWAPIN:
         PHP                 ; preserve flags register
         PHA                 ; preserve accumulator
         LDA   TEMPMMR       ; get original MMR value
         TAM   #$40          ; replace MMR with original value
         PLA                 ; restore accumulator
         PLP                 ; restore flags register
         RTS 
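
The save/replace/restore pattern can also be modelled in a few lines of
Python.  This is only an illustration of the control flow: the Mmu class, the
trampoline function and the ALT_PAGE value of $01 are all made up for the
example and do not come from the card.

```python
ALT_PAGE = 0x01        # hypothetical bank number, for illustration only


class Mmu:
    """Toy model of the eight memory mapping registers."""

    def __init__(self):
        self.mpr = [0] * 8
        self.temp_mmr = 0      # plays the role of TEMPMMR


def trampoline(mmu, function, *args):
    """Model of BSR SWAPOUT / JSR FUNCTION / BRA SWAPIN."""
    mmu.temp_mmr = mmu.mpr[6]      # SWAPOUT: remember original MMR #6
    mmu.mpr[6] = ALT_PAGE          # ...and map in the alternate page
    result = function(*args)       # FUNCTION runs with the new mapping
    mmu.mpr[6] = mmu.temp_mmr      # SWAPIN: put the original mapping back
    return result


mmu = Mmu()
mmu.mpr[6] = 0x80
seen = trampoline(mmu, lambda: mmu.mpr[6])
print(seen, mmu.mpr[6])            # 1 128
```

The caller never sees the alternate page: by the time control returns, MMR #6
holds its original value again.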


One last note, then on to the math functions...
Since these three functions all use a 'rotate-and-add' algorithm, they
use a countdown loop over the number of bits to rotate.  So, the Z flag
will always be set (N reset) upon completion (because of the countdown).
Other flags will be undefined.


8-bit multiply:

This is a pretty nifty piece of work.  It uses a shift-and-add algorithm
for the multiply, and appears to be VERY speedy, since it does all of the
work in the A and X registers, rather than in memory.

Entry:             operands in $F8 and $FA on zero-page
Exit:              result in $FC/$FD on zero-page ($FC = LSB)
Registers changed: All
Flags set:         Z always set (N reset), others undefined
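
As a rough illustration of shift-and-add with a countdown, here is a Python
model of the 8-bit multiply.  It is a sketch of the technique rather than the
card's actual register usage, and the name mul8u is borrowed from the develo
label only for convenience.

```python
def mul8u(f8, fa):
    """Shift-and-add multiply of two bytes; returns ($FC, $FD) = (LSB, MSB)."""
    multiplicand = f8 & 0xFF
    multiplier = fa & 0xFF
    product = 0
    count = 8                       # countdown: one pass per multiplier bit
    while count != 0:
        if multiplier & 1:          # low bit set: add the shifted multiplicand
            product += multiplicand
        multiplicand <<= 1
        multiplier >>= 1
        count -= 1                  # hitting zero is what leaves Z set
    return product & 0xFF, (product >> 8) & 0xFF


print(mul8u(12, 10))     # (120, 0)
print(mul8u(255, 255))   # (1, 254), i.e. $FE01 = 65025
```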


16-bit multiply:

This is simply an extension of the above algorithm, but should be
significantly slower on small numbers (assuming speed is a consideration),
since it uses zero-page memory for scratchpad work, instead of registers.

Entry:             operands in $F8/$F9, and $FA/$FB (1st byte = LSB)
Exit:              result in $FC/$FD/$FE/$FF ($FC = LSB, $FF = MSB)
Registers changed: A, X
Flags set:         Z always set (N reset), others undefined
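
The main thing to watch with the 16-bit version is the byte order of the
four-byte result.  The Python sketch below models it; the function name and
the list return value are my own choices for the example, and the loop is
just the same shift-and-add idea widened to 16 bits.

```python
def mul16u(a, b):
    """16x16 shift-and-add multiply.

    Returns the 32-bit product as four bytes in the order $FC, $FD, $FE, $FF
    (least significant byte first).
    """
    a &= 0xFFFF
    b &= 0xFFFF
    product = 0
    for bit in range(16):
        if (b >> bit) & 1:          # this multiplier bit is set
            product += a << bit     # add the correspondingly shifted operand
    return [(product >> (8 * i)) & 0xFF for i in range(4)]


# $1234 * $0100 = $00123400
print(mul16u(0x1234, 0x0100))   # [0, 52, 18, 0]
```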


16-bit divide:

This is an even more ingenious algorithm, where double-use of the result
field (which doubles as a temporary dividend) shifts the dividend into
the remainder field, and subtracts the divisor where applicable.
The ingenious part is that the carry flag from a successful subtract
(divisor from remainder) is then automatically picked up by the next
rotate of the result field, gradually replacing the dividend with the
result, bit-by-bit.  I guess you've got to see it to appreciate it.

Entry:             - dividend ("large number") in $F8/$F9
                   - divisor ("divided by") in $FA/$FB
Exit:              - result in $FC/$FD, remainder in $FE/$FF (1st byte = LSB)
                   - in the 
Registers changed: A, X
Flags set:         Z always set (N reset), others undefined
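
Since you've got to see it to appreciate it, here is a Python model of the
carry trick.  It is a sketch of the general shift-and-subtract technique, not
the card's code.  In it, 'dividend' means the number being divided (the
$F8/$F9 input) and 'divisor' the number divided by ($FA/$FB).  Raising an
error on a zero divisor is my own choice for the model; I am not describing
what the card does in that case.

```python
def div16u(dividend, divisor):
    """Shift-and-subtract divide; returns (result, remainder)."""
    divisor &= 0xFFFF
    if divisor == 0:
        raise ValueError("divisor must be non-zero (model's choice)")
    result = dividend & 0xFFFF      # result field starts out holding the dividend
    remainder = 0
    carry = 0
    for _ in range(16):
        # Rotate the result field left: the previous carry enters at the
        # bottom, and the next dividend bit leaves from the top.
        top = (result >> 15) & 1
        result = ((result << 1) | carry) & 0xFFFF
        remainder = (remainder << 1) | top
        # Compare/subtract: carry records whether the subtract succeeded.
        if remainder >= divisor:
            remainder -= divisor
            carry = 1
        else:
            carry = 0
    # One final rotate picks up the last carry as the lowest result bit.
    result = ((result << 1) | carry) & 0xFFFF
    return result, remainder


print(div16u(1000, 7))   # (142, 6)
```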

-- Dave