Richard Kreckel
17 years ago
1 changed files with 45 additions and 43 deletions
-
88src/TUNING
@ -1,50 +1,52 @@ |
|||||
Tips for performance tuning on a specific architecture: |
Tips for performance tuning on a specific architecture: |
||||
|
|
||||
1. Choose the optimal limb size (intDsize). This is fundamental. On 32-bit |
|
||||
platforms intDsize=32 is best. On 64-bit platforms intDsize=64 may be |
|
||||
better, especially if there is a 64x64-bit multiplication in hardware. |
|
||||
|
1a. Choose the optimal digit size (intDsize). This is fundamental. On 32-bit |
||||
|
platforms intDsize=32 is best. On 64-bit platforms intDsize=64 may be |
||||
|
better, especially if there is a 64x64-bit multiplication in hardware. |
||||
|
|
||||
2. Tune GMP. |
|
||||
|
1b. Alternatively, tune GMP. When GMP is used, CLN's digit size (intDsize) has |
||||
|
to match GMP's limb size (sizeof(mp_limb_t)). There is nothing to do at the |
||||
|
CLN side: The configure script will take care of intDsize automatically. |
||||
|
|
||||
3. The break-even points between several algorithms for the same task |
|
||||
have to be determined experimentally, in the order given below: |
|
||||
|
2. The break-even points between several algorithms for the same task |
||||
|
have to be determined experimentally, in the order given below: |
||||
|
|
||||
multiplication: |
|
||||
cl_DS_mul.cc karatsuba_threshold |
|
||||
cl_DS_mul.cc function cl_fftm_suitable |
|
||||
division: |
|
||||
cl_DS_div.cc function cl_recip_suitable |
|
||||
2-adic reciprocal: |
|
||||
cl_2DS_recip.cc recip2adic_threshold |
|
||||
2-adic division: |
|
||||
cl_2DS_div.cc function cl_recip_suitable |
|
||||
square root: |
|
||||
cl_DS_sqrt.cc function cl_recipsqrt_suitable |
|
||||
cl_LF_sqrt.cc "if (len > ...)" |
|
||||
gcd: |
|
||||
cl_I_gcd.cc cl_gcd_double_threshold |
|
||||
binary->decimal conversion: |
|
||||
cl_I_to_digits.cc cl_digits_div_threshold |
|
||||
pi: |
|
||||
cl_LF_pi.cc best of 4 algorithms |
|
||||
exp, log: |
|
||||
cl_F_expx.cc factor limit_slope of isqrt(d) |
|
||||
cl_R_exp.cc inside function exp |
|
||||
cl_R_ln.cc inside function ln |
|
||||
eulerconst: |
|
||||
cl_LF_eulerconst.cc function compute_eulerconst |
|
||||
sin, cos, sinh, cosh: |
|
||||
cl_F_sinx.cc factor limit_slope of isqrt(d) |
|
||||
cl_R_sin.cc inside function sin |
|
||||
cl_R_cos.cc inside function cos |
|
||||
cl_R_cossin.cc inside function cl_cos_sin |
|
||||
cl_F_sinhx.cc factor limit_slope of isqrt(d) |
|
||||
cl_R_sinh.cc inside function sinh |
|
||||
cl_R_cosh.cc inside function cosh |
|
||||
cl_R_coshsinh.cc inside function cl_cosh_sinh |
|
||||
cl_F_atanx.cc factor limit_slope of isqrt(d) |
|
||||
cl_F_atanx.cc inside function atanx |
|
||||
cl_F_atanhx.cc factor limit_slope of isqrt(d) |
|
||||
cl_F_atanhx.cc inside function atanhx |
|
||||
|
multiplication: |
||||
|
cl_DS_mul.cc karatsuba_threshold |
||||
|
cl_DS_mul.cc function cl_fftm_suitable |
||||
|
division: |
||||
|
cl_DS_div.cc function cl_recip_suitable |
||||
|
2-adic reciprocal: |
||||
|
cl_2DS_recip.cc recip2adic_threshold |
||||
|
2-adic division: |
||||
|
cl_2DS_div.cc function cl_recip_suitable |
||||
|
square root: |
||||
|
cl_DS_sqrt.cc function cl_recipsqrt_suitable |
||||
|
cl_LF_sqrt.cc "if (len > ...)" |
||||
|
gcd: |
||||
|
cl_I_gcd.cc cl_gcd_double_threshold |
||||
|
binary->decimal conversion: |
||||
|
cl_I_to_digits.cc cl_digits_div_threshold |
||||
|
pi: |
||||
|
cl_LF_pi.cc best of 4 algorithms |
||||
|
exp, log: |
||||
|
cl_F_expx.cc factor limit_slope of isqrt(d) |
||||
|
cl_R_exp.cc inside function exp |
||||
|
cl_R_ln.cc inside function ln |
||||
|
eulerconst: |
||||
|
cl_LF_eulerconst.cc function compute_eulerconst |
||||
|
sin, cos, sinh, cosh: |
||||
|
cl_F_sinx.cc factor limit_slope of isqrt(d) |
||||
|
cl_R_sin.cc inside function sin |
||||
|
cl_R_cos.cc inside function cos |
||||
|
cl_R_cossin.cc inside function cl_cos_sin |
||||
|
cl_F_sinhx.cc factor limit_slope of isqrt(d) |
||||
|
cl_R_sinh.cc inside function sinh |
||||
|
cl_R_cosh.cc inside function cosh |
||||
|
cl_R_coshsinh.cc inside function cl_cosh_sinh |
||||
|
cl_F_atanx.cc factor limit_slope of isqrt(d) |
||||
|
cl_F_atanx.cc inside function atanx |
||||
|
cl_F_atanhx.cc factor limit_slope of isqrt(d) |
||||
|
cl_F_atanhx.cc inside function atanhx |
||||
|
|
||||
|
|
Write
Preview
Loading…
Cancel
Save
Reference in new issue