Examples

fp64lib comes with build-in examples, directly available in the Arduino IDE via File/Examples/fp64lib. Some of the examples are explained below.

1. Leibniz Series, calculating PI/4

One of the most famous series to approximate Pi was created by Gottfried Leibniz [see Wikipedia]. It states that

PI/4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - ...

That series is nice looking but has a serious drawback: it converges extremely slowly: calculating PI to 10 correct decimal places requires summation of about 5 billion terms.

However, that slow confergence is a great example to show the difference between regular float mathematics and fp64lib. Implementation is straight forward:

/* Example program for using fp64lib.
* Leibniz formular to approximate PI
*/

#include <fp64lib.h>

#define MAX_COUNT 1000 // # of iterations before printing another approximation

float64_t x, n, two, pi, pi4, prev;
float xf, nf, pif, pi4f, prevf;
bool add = false;

void setup() {
prev = x = fp64_sd(1.0);
n = fp64_sd(3.0);
two = fp64_sd(2);

prevf = xf = 1.0;
nf = 3.0;

pi = float64_NUMBER_PI;
pi4 = fp64_ldexp( pi, -2 ); // fast division by 4
pif = M_PI;
pi4f = ldexp( pif, -2 );

Serial.begin(57600);
}

char *fixedLength( char *s, float64_t x ) {
// return a string always with the same length, including trailing zeros
char *t = fp64_to_string( x, 17, 15 );
sprintf( s, "%s%s00000000000000000", t[0] == '-' ? " " : " ", t );
s[18] = '\0';
return s;
}

void printLine( float nf, float64_t x, float xf, const char* s, bool usePi ) {
char buf[40];

float64_t delta = fp64_sub( x, usePi ? pi : pi4 );
float deltaf = xf - (usePi ? pif : pi4f );

char *t = fp64_to_string( x, 17, 15 );
Serial.print( nf, 0 ); Serial.print( s ); Serial.print( ":" );
Serial.print( fixedLength( buf, x ) );
Serial.print( fixedLength( buf, delta ) );
Serial.print( "\t" );

Serial.print( " " ); Serial.print( xf, 7 );
Serial.print( " " );
if( deltaf >= 0.0 )
Serial.print( " ");
Serial.println( deltaf, 7 );
}

void loop() {
int cnt = 0;

while( 1 ) {
prev = x;
prevf = xf;

if( add ) {
float64_t t = fp64_inverse(n);
x = fp64_add( x, t );
xf = xf + 1.0/nf;
} else {
float64_t t = fp64_inverse(n);
x = fp64_sub( x, t );
xf = xf - 1.0/nf;
}
add = !add;

++cnt;
if( cnt == MAX_COUNT ) {
float64_t approx = fp64_ldexp( fp64_add( prev, x ), 1);
float approxf = ldexp( xf + prevf, 1 );
printLine( nf-2, prev, prevf, " low", false );
printLine( nf, x, xf, " high", false );
printLine( nf, approx, approxf," avg", true );
printLine( nf, pi, pif, " pi", true );
Serial.println();
cnt = 0;
}

n = fp64_add( n, two );
nf = nf + 2.0;
}
}

Let the Arduino run for a few minutes and take a look on the output. Here every 50000 iterations were calculated and the output shows the following:

 99999  low:  0.78539316339745 -0.00000500000000 0.7853940 -0.0000042
100001 high: 0.78540316329745 0.00000499990000 0.7854040 0.0000058
100001 avg: 3.14159265338980 -0.00000000019999 3.1415958 0.0000031
100001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

199999 low: 0.78539566339744 -0.00000250000001 0.7853965 -0.0000017
200001 high: 0.78540066337244 0.00000249997499 0.7854015 0.0000033
200001 avg: 3.14159265353976 -0.00000000005003 3.1415958 0.0000031
200001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

299999 low: 0.78539649673077 -0.00000166666667 0.7853974 -0.0000008
300001 high: 0.78539983005299 0.00000166665554 0.7854008 0.0000026
300001 avg: 3.14159265356753 -0.00000000002225 3.1415963 0.0000036
300001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

The Leibniz series always creates a value that is either lower than PI/4 (indicated with “low” in the output) or higher than Pi/4 (indicated with “high”). So, ideally taking the average of the low and the high value, multiplying that by 4, should give a good approximation of PI, listed in the line with “avg”. And finally, the value to compare with is in the line tagged “pi”.

Already after 50000 iterations (n=100001), the value of Pi for fp64lib is 3.14159265338980, precise for  9 digits after the decimal point. The low and the high value are nicely distributed on both sides of Pi/4.

On the other side, our float calculation results in PI being 3.1415958, which is accurate to 5 digits only.  And you can see, that all the small errors due to the reduced precisions are adding up, as both the low and the high value are above Pi/4, so no chance to come to a good approximation of PI.

This can be verified by taking a look on the result ofter after 500000 iterations:


999999 low: 0.78539766339745 -0.00000050000000 0.7853985 0.0000003
1000001 high: 0.78539866339645 0.00000049999900 0.7853995 0.0000013
1000001 avg: 3.14159265358781 -0.00000000000198 3.1415958 0.0000031
1000001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

Precision for fp64lib has improved from 9 digits to 11 digits, but it stayed at 5 digits for float.

Having said that, I just can repeat my inital comment: the above algorithm is not very well suited to calculate PI. And, fp64lib will also face its challenges as float did, only some million calculations later:

2043999  low:  0.78539791877916 -0.00000024461828 0.7853989  0.0000007
2044001 high: 0.78539840801571 0.00000024461827 0.7853993 0.0000011
2044001 avg: 3.14159265358976 -0.00000000000002 3.1415963 0.0000036
2044001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

This is the point where the best precision is achieved, with PI being accurately approximated to 13 digits. After that, rounding errors in fp64lib will also sum up, slowly, but noticeably:

4999999  low:  0.78539806339805 -0.00000009999939 0.7853991  0.0000009
5000001 high: 0.78539826339801 0.00000010000056 0.7853993 0.0000011
5000001 avg: 3.14159265359213 0.00000000000234 3.1415967 0.0000041
5000001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

5999999 low: 0.78539808006491 -0.00000008333254 0.7853991 0.0000009
6000001 high: 0.78539824673155 0.00000008333410 0.7853993 0.0000011
6000001 avg: 3.14159265359292 0.00000000000313 3.1415967 0.0000041
6000001 pi: 3.14159265358979 0000000000000000 3.1415927 0.0000000

2. Double: A small C++-Wrapper

Some will not like that fp64lib is not integrated with gcc, resulting in some code that does not reflect the “natural” mathematical notation with +, -, * and /, but create statements in it like

float64_t t = fp64_inverse(n);
x = fp64_add( x, t );

How much “easier” would it be if we could just write

x = x + 1/n

There is a way. As the Arduino platform supports C++, you could wrap fp64lib with some operator functions. The following code provides a minimal wrapper:

#include <fp64lib.h>

class Double {
public:
Double() { x = 0ULL; }
Double( double f ) { x = fp64_sd( f ); }
Double( float f ) { x = fp64_sd( f ); }
Double( const Double& d ) { x = d.x; }
const char* toString() {
return fp64_to_string( x, 17, 15);
}
Double operator+( const Double& y ) {
Double res;
res.x = fp64_add( x, y.x );
return( res );
}
Double operator-( const Double& y ) {
Double res;
res.x = fp64_sub( x, y.x );
return( res );
}
Double operator*( const Double& y ) {
Double res;
res.x = fp64_mul( x, y.x );
return( res );
}
Double operator/( const Double& y ) {
Double res;
res.x = fp64_div( x, y.x );
return( res );
}
private:
float64_t x;
};

With the above wrapper, the Leibniz series example from above can be rewritten into:

#define MAX_COUNT 1000 // # of iterations before printing another approximation
Double x = Double( 1.0 );
Double n = Double( 3.0 );
Double delta = Double( 2.0 );
bool add = false;

void setup() {
Serial.begin(57600);
}

void loop() {
int cnt = 0;

while(1) {
if( add )
x = x + Double(1.0)/n;
else
x = x - Double(1.0)/n;
add = !add;
if( ++cnt == MAX_COUNT ) {
Serial.print( n.toString() );
Serial.print(": ");
Serial.print( x.toString() );
Serial.print("\t");
Double pi = x * Double(4.0);
Serial.println( pi.toString() );
cnt = 0;
}
n = n + Double(2.0);
}
}

Looks nice! But as in real life, beauty comes with a price. And here it comes with a double price: you need more memory (2328 instead of 2212 bytes of flash memory) and you need more execution time: 98,475s instead of 91,682s until n=100001.

If you think, that this is not a big issue, be aware that the above wrapper is only a minimum implementation. You certainly would like to include more of fp64lib functions (like conversion from/to int) and more operators (like comparision) – and with that your overhead will increase, violating Design principle #3: Minimize code size! Therefore, a “fatter” C++ wrapper will not be supplied with the library.