Tips for performance tuning on a specific architecture:

1. Choose the optimal limb size (intDsize). This is fundamental. On 32-bit
   platforms intDsize=32 is best. On 64-bit platforms intDsize=64 may be
   better, especially if there is a 64x64-bit multiplication in hardware.

2. Tune GMP.

3. The break-even points between several algorithms for the same task
   have to be determined experimentally, in the order given below, since
   later routines generally build on earlier ones (for example, division
   relies on multiplication):

   multiplication:
     cl_DS_mul.cc           karatsuba_threshold
     cl_DS_mul.cc           function cl_fftm_suitable
   division:
     cl_DS_div.cc           function cl_recip_suitable
   2-adic reciprocal:
     cl_2DS_recip.cc        recip2adic_threshold
   2-adic division:
     cl_2DS_div.cc          function cl_recip_suitable
   square root:
     cl_DS_sqrt.cc          function cl_recipsqrt_suitable
     cl_LF_sqrt.cc          "if (len > ...)"
   gcd:
     cl_I_gcd.cc            cl_gcd_double_threshold
   binary->decimal conversion:
     cl_I_to_digits.cc      cl_digits_div_threshold
   pi:
     cl_LF_pi.cc            best of 4 algorithms
   exp, log:
     cl_F_expx.cc           factor limit_slope of isqrt(d)
     cl_R_exp.cc            inside function exp
     cl_R_ln.cc             inside function ln
   eulerconst:
     cl_LF_eulerconst.cc    function compute_eulerconst
   sin, cos, sinh, cosh:
     cl_F_sinx.cc           factor limit_slope of isqrt(d)
     cl_R_sin.cc            inside function sin
     cl_R_cos.cc            inside function cos
     cl_R_cossin.cc         inside function cl_cos_sin
     cl_F_sinhx.cc          factor limit_slope of isqrt(d)
     cl_R_sinh.cc           inside function sinh
     cl_R_cosh.cc           inside function cosh
     cl_R_coshsinh.cc       inside function cl_cosh_sinh
     cl_F_atanx.cc          factor limit_slope of isqrt(d)
     cl_F_atanx.cc          inside function atanx
     cl_F_atanhx.cc         factor limit_slope of isqrt(d)
     cl_F_atanhx.cc         inside function atanhx
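
Determining a break-even point experimentally, as item 3 asks, could be
sketched as follows. This is only an illustration, not the library's own
tuning tool: the helper names seconds_for and find_break_even are
hypothetical, and the cost callbacks are assumed to time the two competing
routines on operands of the given length.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: wall-clock seconds spent on `reps` calls of `work`.
inline double seconds_for(const std::function<void()>& work, int reps)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i)
        work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Hypothetical helper: given candidate operand sizes in ascending order and
// a cost function for each algorithm, return the smallest size from which
// the "fast" algorithm is cheaper at that size and at every larger
// candidate. Returns 0 if the fast algorithm does not win at the largest
// candidate.
inline std::size_t find_break_even(
    const std::vector<std::size_t>& sizes,
    const std::function<double(std::size_t)>& cost_simple,
    const std::function<double(std::size_t)>& cost_fast)
{
    assert(std::is_sorted(sizes.begin(), sizes.end()));
    std::size_t threshold = 0;
    // Scan from the largest size downwards so that noisy wins at tiny
    // sizes cannot pull the threshold below the real crossover.
    for (auto it = sizes.rbegin(); it != sizes.rend(); ++it) {
        if (cost_fast(*it) < cost_simple(*it))
            threshold = *it;
        else
            break;
    }
    return threshold;
}
```

In practice each cost callback would wrap seconds_for around one of the two
routines being compared, and the resulting size would be a starting value
for the corresponding threshold in the table above, to be confirmed by
re-running the measurement after the earlier entries have been fixed.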
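
As a side note to item 1, the following sketch shows why a hardware
64x64-bit multiplication matters for intDsize=64. Without it, one limb
product has to be assembled from four 32x32-bit products plus carry
handling. The function names are hypothetical and the code is not taken
from the library; the wide path assumes a compiler that provides
unsigned __int128 (a GCC/Clang extension).

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Portable 64x64 -> 128-bit product built from four 32x32 -> 64-bit
// products. Returns {high word, low word}.
inline std::pair<std::uint64_t, std::uint64_t>
mul_64x64_portable(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t mask = 0xFFFFFFFFu;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

    const std::uint64_t p00 = a_lo * b_lo;
    const std::uint64_t p01 = a_lo * b_hi;
    const std::uint64_t p10 = a_hi * b_lo;
    const std::uint64_t p11 = a_hi * b_hi;

    // Middle column: three values below 2^32 each, so no 64-bit overflow.
    const std::uint64_t mid = (p00 >> 32) + (p01 & mask) + (p10 & mask);
    const std::uint64_t low = (mid << 32) | (p00 & mask);
    const std::uint64_t high = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return {high, low};
}

#if defined(__SIZEOF_INT128__)
// With a widening multiply available, the compiler can typically emit a
// single multiplication instruction for this.
inline std::pair<std::uint64_t, std::uint64_t>
mul_64x64_wide(std::uint64_t a, std::uint64_t b)
{
    const unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return {static_cast<std::uint64_t>(p >> 64),
            static_cast<std::uint64_t>(p)};
}
#endif

// Sanity check: both paths must agree wherever the wide type exists.
inline void check_agreement(std::uint64_t a, std::uint64_t b)
{
#if defined(__SIZEOF_INT128__)
    assert(mul_64x64_portable(a, b) == mul_64x64_wide(a, b));
#else
    (void)a;
    (void)b;
#endif
}
```

Timing the two functions on the target machine gives a rough indication of
whether intDsize=64 is likely to pay off there.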