README - OpenGrok cross reference for /linux-4.4.14/arch/x86/math-emu/README

Lines Matching refs:the
9  |    it under the terms of the GNU General Public License version 2 as      |
10  |    published by the Free Software Foundation.                             |
12  |    This program is distributed in the hope that it will be useful,        |
13  |    but WITHOUT ANY WARRANTY; without even the implied warranty of         |
14  |    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the          |
17  |    You should have received a copy of the GNU General Public License      |
18  |    along with this program; if not, write to the Free Software            |
28 DJ Delorie for djgpp.  The interface to the Linux kernel is based upon
29 the original Linux math emulator by Linus Torvalds.
31 My target FPU for wm-FPU-emu is that described in the Intel486
33 facets of the functioning of the FPU are not well covered in the
34 Reference Manual. The information in the manual has been supplemented
36 possible to be sure that all of the peculiarities of the 80486 have
38 in the detailed behaviour of the emulator and a real 80486.
40 wm-FPU-emu does not implement all of the behaviour of the 80486 FPU,
48 For more information on the emulator and on floating point topics, see
61     is not the obvious one which most people seem to use, but is designed
62     to take advantage of the characteristics of the 80386. I expect that
68     upon the properties of Newton's method, and the code is once again
69     structured taking account of the 80386 characteristics.
73 (5) The argument reducing code for the trig functions effectively uses
75     the reduced argument is accurate to more than 64 bits for arguments up
80 The code of the emulator is complicated slightly by the need to
81 account for a limited form of re-entrancy. Normally, the emulator will
83 However, it may happen that when the emulator is accessing the user
84 memory space, swapping may be needed. In this case the emulator may be
86 another process may use the emulator, thereby perhaps changing static
94 As from version 1.12 of the emulator, no static variables are used
95 (apart from those in the kernel's per-process tables). The emulator is
96 therefore now fully re-entrant, rather than having just the restricted
97 form of re-entrancy which is required by the Linux kernel.
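
The hazard described above can be sketched in a few lines. This is only an illustration, not the emulator's code: Python generators stand in for an emulator call that sleeps part-way through (for example while a page is swapped in), and the names `emulate_static`, `emulate_reentrant` and `run_interleaved` are hypothetical.

```python
# Hypothetical illustration: shared "static" scratch state versus
# state that is private to each call.

_scratch = {}  # module-level state shared by every caller


def emulate_static(operand):
    """Keeps its working value in shared static storage."""
    _scratch["value"] = operand
    yield  # the call sleeps here; another process may now run
    return _scratch["value"] * 2


def emulate_reentrant(operand):
    """Keeps its working value local to this call."""
    value = operand
    yield  # sleeping is harmless: nothing shared can be overwritten
    return value * 2


def run_interleaved(emulate, a, b):
    """Start call A, let call B start while A sleeps, then finish both."""
    first, second = emulate(a), emulate(b)
    next(first)   # process A enters the emulator and sleeps
    next(second)  # process B uses the emulator in the meantime
    results = []
    for task in (first, second):
        try:
            next(task)
        except StopIteration as done:
            results.append(done.value)
    return results
```

With the shared variable, the first caller resumes and reads the value left behind by the second caller; with local state, each caller gets its own result.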
101 There are a number of differences between the current wm-FPU-emu
102 (version 2.01) and the 80486 FPU (apart from bugs).  The differences
103 are fewer than those which applied to the 1.xx series of the emulator.
104 Some of the more important differences are listed below:
106 The Roundup flag does not have much meaning for the transcendental
110 In a few rare cases the Underflow flag obtained with the emulator will
111 be different from that obtained with an 80486. This occurs when the
113 (a) the operands have a higher precision than the current setting of the
115 (b) the underflow exception is masked.
116 (c) the magnitude of the exact result (before rounding) is less than 2^-16382.
117 (d) the magnitude of the final result (after rounding) is exactly 2^-16382.
118 (e) the magnitude of the exact result would be exactly 2^-16382 if the
119     operands were rounded to the current precision before the arithmetic
121 If all of these apply, the emulator will set the Underflow flag but a real
125 unsupported by the 80486. They are the Pseudo-NaNs, Pseudoinfinities,
126 and Unnormals. None of these will be generated by an 80486 or by the
128 detail from the way an 80486 does.
130 Self modifying code can cause the emulator to fail. An example of such
134 The FPU instruction may be (usually will be) loaded into the prefetch
135 queue of the CPU before the mov instruction is executed. If the
136 destination of the 'movl' overlaps the FPU instruction then the bytes
137 in the prefetch queue and memory will be inconsistent when the FPU
139 able to find the instruction which caused the device-not-present
140 exception. For this case, the emulator cannot emulate the behaviour of
143 Handling of the address size override prefix byte (0x67) has not been
148 check the addressing, and which runs successfully in real mode,
150 protection fault message when run under the MS-DOS prompt of Windows
155 write a few bytes below the lowest address of the stack.  The emulator
157 allowed to write outside the bounds set by the protection.
164 The speed of floating point computation with the emulator will depend
165 upon instruction mix. Relative performance is best for the instructions
167 affected by the FPU instruction trap overhead.
170 Timing: Some simple timing tests have been made on the emulator functions.
173 MS-DOS, the next two columns are for emulators running with the djgpp
194 The performance under Linux is improved by the use of look-ahead code.
195 The following results show the improvement which is obtained under
196 Linux due to the look-ahead code. Also given are the times for the
197 original Linux emulator with the 4.1 'soft' lib.
199  [ Linus' note: I changed look-ahead to be the default under linux, as
221 progressively slower for most functions as more of the 80486 features
228 The accuracy of the emulator is in almost all cases equal to or better
231 The results of the basic arithmetic functions (+,-,*,/), and fsqrt
232 match those of an 80486 FPU. They are the best possible; the error for
237 The following table compares the emulator accuracy for the sqrt(),
238 trig and log functions against the Turbo C "emulator". For this table,
241 arguments greater than pi/4 can be thought of as being related to the
242 precision of the argument x; e.g. an argument of pi/2-(1e-10) which is
260 ** The accuracy for exp() and log() is low because the FPU (emulator)
264 The emulator passes the "paranoia" tests (compiled with gcc 2.3.3 or
268 properly performing FPU cannot pass the 'paranoia' tests for 'double'
271 The code for reducing the argument for the trig functions (fsin, fcos,
274 consequence, the accuracy of these functions for large arguments has
277 for operands close to pi/2. Measured results are (note that the
278 definition of accuracy has changed slightly from that used for the
289 give much degraded precision. For example, the integer number
291 is within about 10e-7 of a multiple of pi. To find the tan (for
294 emulator computes the result to about 42.6 bits precision (the correct
295 result is about -9.739715e-8). On the other hand, an 80486 FPU returns
299 pi/2) the emulator is more accurate than an 80486 FPU. For very large
300 arguments, the emulator is far more accurate.
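
Why the precision of pi matters for large arguments can be shown with a small experiment. This is a minimal sketch under stated assumptions, not the emulator's reduction code: it uses Python's `decimal` module, a 50-digit constant for pi, and the hypothetical helper names `reduce_mod` and `reduction_error`. It reduces an argument modulo pi twice, once with the 53-bit double nearest to pi and once with the 50-digit value, and returns the difference.

```python
from decimal import Decimal, ROUND_FLOOR, getcontext
import math

getcontext().prec = 60  # enough working digits for the products below

# pi to 50 decimal places, and the exact value of the double nearest to pi
PI_50 = Decimal("3.14159265358979323846264338327950288419716939937510")
PI_DOUBLE = Decimal(math.pi)


def reduce_mod(x, pi_value):
    """Return x - k*pi_value, where k = floor(x / pi_value)."""
    x = Decimal(x)
    k = (x / pi_value).to_integral_value(rounding=ROUND_FLOOR)
    return x - k * pi_value


def reduction_error(x):
    """Error in the reduced argument caused by using a 53-bit pi."""
    return reduce_mod(x, PI_DOUBLE) - reduce_mod(x, PI_50)
```

For an argument of 10**15 the multiple k is about 3.2e14, so the error of roughly 1.2e-16 in the 53-bit value of pi is magnified to a few hundredths in the reduced argument. For arguments smaller than pi, k is zero and no error is introduced.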
303 Prior to version 1.20 of the emulator, the accuracy of the results for
304 the transcendental functions (in their principal range) was not as
305 good as the results from an 80486 FPU. From version 1.20, the accuracy
307 worst-case results which are better than the worst-case results given
310 The following table gives the measured results for the emulator. The
312 million.  The group of three columns gives the frequency of the given
313 accuracy in number of times per million, thus the second of these
316 The results show that the fsin, fcos and fptan instructions return
317 results which are in error (i.e. less accurate than the best possible
321 the worst accuracy which was found (in bits) and the approximate value
322 of the argument which produced it.
339 following table gives the results which were obtained with an AMD
341 identical results).  The tests were basically the same as those used
342 to measure the emulator (the values, being random, were in general not
343 the same).  The total number of tests for each instruction is given
344 at the end of the table; in each case about 100k tests were performed.
345 Another line of figures at the end of the table shows that most of the
347 percent of the arguments tested.
349 The numbers in the body of the table give the approximate number of times a
350 result of the given accuracy in bits (given in the left-most column)
351 was obtained per one million arguments. For three of the instructions,
353 the number of cases where the results of the first column were for a
355 results for positive arguments than it does for negative.  * In the
356 cases of fcos and fptan, the first column gives the results after all
357 cases with arguments greater than 1.5 were removed from the results
358 given in the second column. Unlike the emulator, an 80486 FPU returns
359 results of relatively poor accuracy for these instructions when the
360 argument approaches pi/2. The table does not show those cases when the
361 accuracy of the results was less than 62 bits, which occurs quite
362 often for fsin and fptan when the argument approaches pi/2. This poor
363 accuracy is discussed above in relation to the Turbo C "emulator", and
364 the accuracy of the value of pi.
397 A number of people have contributed to the development of the