On the precision attainable with various floating-point number systems

On the precision attainable with various floating-point number systems, Computing Research Repository, Richard P. Brent

On the precision attainable with various floating-point number systems   (Citations: 20)
For scientific computations on a digital computer the set of real numbers is usually approximated by a finite set F of "floating-point" numbers. We compare the numerical accuracy possible with different choices of F having approximately the same range and requiring the same word length. In particular, we compare different choices of base (or radix) in the usual floating-point systems. The emphasis is on the choice of F, not on the details of the number representation or the arithmetic, but both rounded and truncated arithmetic are considered. Theoretical results are given, and some simulations of typical floating-point computations (forming sums, solving systems of linear equations, finding eigenvalues) are described. If the leading fraction bit of a normalized base-2 number is not stored explicitly (saving a bit), and the criterion is to minimize the mean square roundoff...
Journal: Computing Research Repository - CORR , vol. abs/1004.3, 2010
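The kind of comparison the abstract describes can be illustrated with a small simulation. The following is only a rough sketch, not the paper's method: it assumes round-to-nearest, logarithmically distributed operands, and fraction widths of t, t+1 and t+2 bits for bases 2, 4 and 16 (on the assumption that a larger base needs fewer exponent bits for the same range). The names round_to_format and rms_relative_error are hypothetical.

```python
import math
import random


def round_to_format(x, base, frac_bits):
    """Round x > 0 to the nearest normalized base-`base` number whose
    fraction f in [1/base, 1) is stored in `frac_bits` bits."""
    # Exponent e such that base**(e-1) <= x < base**e.
    e = math.floor(math.log(x, base)) + 1
    f = x / float(base) ** e
    # Correct e if math.log was off by one near a power of the base.
    if f >= 1.0:
        e += 1
    elif f < 1.0 / base:
        e -= 1
    f = x / float(base) ** e
    scale = 2 ** frac_bits
    return round(f * scale) / scale * float(base) ** e


def rms_relative_error(base, frac_bits, samples=20000, seed=0):
    """Estimate the root-mean-square relative rounding error for operands
    drawn log-uniformly from [1, 16) (an assumed distribution)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = 16.0 ** rng.random()
        rel = (round_to_format(x, base, frac_bits) - x) / x
        total += rel * rel
    return math.sqrt(total / samples)


if __name__ == "__main__":
    t = 12  # fraction bits for base 2; chosen only for illustration
    for base, bits in [(2, t), (4, t + 1), (16, t + 2)]:
        print(base, bits, rms_relative_error(base, bits))
```

The interval [1, 16) is used because it covers a whole number of exponent periods for each of the three bases, so no base is favoured by the sampling range.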
    • ...Radices bigger than 2 may offer a minuscule speed advantage during normalization because the leading few significant bits can sometimes remain zeros, but this advantage is more than offset by penalties in the range/precision tradeoff [25] and by "wobbling precision" [19, p.7]...

    W. Kahan. Why do we need a floating-point arithmetic standard?

    • ...We use R*-rounding [15, 33] after postnormalization, with four guard digits...

    RICHARD P. BRENT. A Fortran Multiple-Precision Arithmetic

    • ...(The latter condition is complicated, as will be shown in Section IV.) This paper examines the methods under consideration with respect to three indicators of rounding scheme effectiveness, viz., 1) average relative representation error (ARRE); 2) rounding scheme bias; 3) statistical tests in actual computation...
    • ...7 low bits. Fig. 1. Eight-bit ROM rounding. ... 4, and 16 arithmetic systems. They defined the ARRE of...
    • ...PTruncation(e = 60) Ia-1(3- 1)/ln A, {(½1_t-1)/1n1,...
    • ...PRounding((= 60) It-1(2 - 1)/ln A, ( _ t-1)I/n A,...
    • ...Note that without guard digits the difference 1.000 - 0.9999 (or any similar computation) cannot be carried out without serious loss of accuracy and huge relative error...
    • ...Note that in subtraction, one of three possible things can occur: either 1) there is no alignment shift, in which case the result of the subtraction is the exact difference of the two operands and no guard digits are needed; 2) an alignment shift of one digit occurs; or 3) an alignment shift of more than one digit occurs...
    • ...If y is the random variable assigning to the mantissas in [1/β, 1) a probability of occurrence as a result of some computation with t-digit floating-point operands, then δ is the expected value...
    • ...Brent [1] points out that the logarithmic law is only an approximation, but is a considerably better approximation to the distribution of mantissas than the assumption of uniformity...
    • ...Thus we can compile Table II for any radix β = 2^n. Here s and t refer to the number of radix-β digits being used, whereas l is the ROM length in bits (1 < l < tn + 1)...
    • ...[3, table 1]; the "average" reflects Sweeney's result that...
    • ...in [1], the problems typically consisted just of repeated...
    • ...Fig. 2 also seems to indicate that as the radix grows the optimal ROM length grows (here only l = 2, 5, and 9 were checked)...
    • ...agrees well with [1]. It would be easy to pick a problem that...
    • ...a suitably large ROM length l, is a viable rounding scheme in radix-2 and apparently (Fig. 2) for radices as big as 16...

    David J. Kuck et al. Analysis of Rounding Methods in Floating-Point Arithmetic
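The guard-digit remark quoted above (the difference 1.000 - 0.9999) can be checked with a short sketch. It is an illustration under stated assumptions, not the rounding scheme analysed in the cited paper: operands are four-digit decimal numbers stored as an integer mantissa and an exponent, the smaller operand is shifted right for alignment, and digits shifted past the guard positions are simply truncated. The function name subtract_aligned and its parameters are hypothetical.

```python
from fractions import Fraction


def subtract_aligned(a_mant, a_exp, b_mant, b_exp, digits=4, guard=0):
    """Return a - b as an exact Fraction for a >= b > 0.

    Each operand has the value mant * 10**(exp - digits). The smaller
    operand is aligned to the larger exponent, and any digits shifted
    beyond `guard` extra positions are dropped (assumed truncation).
    """
    shift = a_exp - b_exp
    assert shift >= 0, "this sketch expects the larger operand first"
    a_ext = a_mant * 10 ** guard
    b_ext = (b_mant * 10 ** guard) // 10 ** shift  # alignment shift
    return Fraction(a_ext - b_ext) * Fraction(10) ** (a_exp - digits - guard)


# 1.000 is mantissa 1000 with exponent 1; 0.9999 is mantissa 9999 with exponent 0.
no_guard = subtract_aligned(1000, 1, 9999, 0, guard=0)
one_guard = subtract_aligned(1000, 1, 9999, 0, guard=1)
exact = Fraction(1, 10000)

print(no_guard, one_guard)
print("relative error without guard digit:", abs(no_guard - exact) / exact)
```

This is the one-digit alignment shift case from the list of three cases quoted above: a single guard digit is enough to recover the exact difference, whereas without it the computed result is ten times too large.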

    • ...Note that our assumptions rule out exotic number representations (for example, logarithmic [4] or modular [33,34] representations) in which it is possible to perform some (but probably not all) of the basic operations faster than with the standard representation...

    Richard P. Brent. The Complexity of Multiple-Precision Arithmetic

    • ...These studies have dealt with the representational error [2]-[5], analysis and simulation to determine the effects of roundoff [2], [3], [6], [7], [13], [15], and the determination of rigorous bounds for the relative error in the evaluation of mathematical functions and the machine design necessary to obtain these bounds [8]-[12], [14]...
    • ...and the maximum relative representational error (MRRE) over all normalized fractions is MRRE(t, β) = 2^(-t-1) β...
    • ...The results support an assertion of Brent [3] that base 4 is never less accurate than the corresponding binary representation...
    • ...Brent [3] uses a root-mean-square (rms) measure for the relative error to compare different base β = 2^n normalized floating-point number systems with nearly identical ranges...
    • ...Brent [3] also shows the superiority of the unbiased R mode over the T mode...

    Harvey L. Garner. A Survey of Some Recent Contributions to Computer Arithmetic
