Regional Information Center for Science and Technology (RICeST) - On the precision attainable with various floating-point number systems

DocumentCode :

3344883

Title :

On the precision attainable with various floating-point number systems

Author :

Brent, R.P.

Author_Institution :

Math. Sci. Dept., IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

1972

fDate :

15-16 May 1972

Firstpage :

Lastpage :

Abstract :

For scientific computations on a digital computer the set of real numbers is usually approximated by a finite set F of “floating-point numbers”. We compare the numerical accuracy possible with different choices of F having approximately the same range and requiring the same word length. In particular, we compare different choices of base (or radix) in the usual floating-point systems. The emphasis is on the choice of F, not on the details of the number representation or the arithmetic, but both rounded and truncated arithmetic are considered. Theoretical results are given, and some simulations of typical floating-point computations (forming sums, solving systems of linear equations, finding eigenvalues) are described. If the leading fraction bit of a normalized base-2 number is not stored explicitly (saving a bit), and the criterion is to minimize the mean square roundoff error, then base 2 is best. If unnormalized numbers are allowed, so the first bit must be stored explicitly, then base 4 (or sometimes base 8) is the best of the usual systems.
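
The comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's own simulation. It is a minimal example under stated assumptions: inputs are log-uniformly distributed, arithmetic is rounded (not truncated), and moving from base 2 to base 2^k frees log2(k) exponent bits at equal range, which are given to the fraction. The names `round_to_system`, `rms_relative_error`, `SYSTEMS`, and the fraction width `T` are hypothetical and chosen only for illustration.

```python
import math
import random


def round_to_system(x, k, frac_bits):
    """Round x > 0 to the nearest value m * (2**k)**e, where the fraction m
    satisfies 2**-k <= m < 1 and is held in frac_bits bits."""
    _, e2 = math.frexp(x)        # x = m2 * 2**e2 with 0.5 <= m2 < 1
    e = -((-e2) // k)            # ceiling division: smallest e with k*e >= e2
    shift = frac_bits - k * e    # scaling by 2**shift makes one ulp equal to 1
    return math.ldexp(round(math.ldexp(x, shift)), -shift)


def rms_relative_error(k, frac_bits, samples=20000, seed=0):
    """Root mean square relative rounding error over log-uniform inputs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Assumption: log-uniform inputs spanning 16 powers of two.
        x = 2.0 ** rng.uniform(-8.0, 8.0)
        err = (round_to_system(x, k, frac_bits) - x) / x
        total += err * err
    return math.sqrt(total / samples)


# Hypothetical bit budget: T fraction bits for base 2 with an explicit
# leading bit. A hidden bit adds one effective bit; base 4 and base 16 are
# assumed to free 1 and 2 exponent bits respectively at the same range.
T = 12
SYSTEMS = {
    "base 2, hidden bit": (1, T + 1),
    "base 2, explicit bit": (1, T),
    "base 4": (2, T + 1),
    "base 16": (4, T + 2),
}

if __name__ == "__main__":
    for name, (k, f) in SYSTEMS.items():
        print(f"{name:22s} {rms_relative_error(k, f):.3e}")
```

Under these assumptions the sketch should rank base 2 with a hidden bit first and base 4 ahead of base 2 with an explicit leading bit, in line with the abstract's conclusions. Base 8 is omitted because log2(3) is not an integer, so its bit budget depends on the word layout.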

Keywords :

digital computers; floating point arithmetic; mean square error methods; number theory; roundoff errors; set theory; digital computer; finite set; floating-point number systems; leading fraction bit; mean square roundoff error; normalized base-2 number; number representation; numerical accuracy; real numbers; rounded arithmetic; scientific computations; truncated arithmetic; Computers

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Arithmetic (ARITH), 1972 IEEE 2nd Symposium on

Conference_Location :

New York, NY

Type :

conf

DOI :

10.1109/ARITH.1972.6153914

Filename :

6153914

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3344883