1 changed file with 45 additions and 43 deletions

src/TUNING (88 lines changed)
@@ -1,50 +1,52 @@
 Tips for performance tuning on a specific architecture:
 
-1. Choose the optimal limb size (intDsize). This is fundamental. On 32-bit
-   platforms intDsize=32 is best. On 64-bit platforms intDsize=64 may be
-   better, especially if there is a 64x64-bit multiplication in hardware.
+1a. Choose the optimal digit size (intDsize). This is fundamental. On 32-bit
+    platforms intDsize=32 is best. On 64-bit platforms intDsize=64 may be
+    better, especially if there is a 64x64-bit multiplication in hardware.
 
-2. Tune GMP.
+1b. Alternatively, tune GMP. When GMP is used, CLN's digit size (intDsize) has
+    to match GMP's limb size (sizeof(mp_limb_t)). There is nothing to do at the
+    CLN side: The configure script will take care of intDsize automatically.
 
-3. The break-even points between several algorithms for the same task
-   have to be determined experimentally, in the order given below:
+2.  The break-even points between several algorithms for the same task
+    have to be determined experimentally, in the order given below:
 
-   multiplication:
-     cl_DS_mul.cc          karatsuba_threshold
-     cl_DS_mul.cc          function cl_fftm_suitable
-   division:
-     cl_DS_div.cc          function cl_recip_suitable
-   2-adic reciprocal:
-     cl_2DS_recip.cc       recip2adic_threshold
-   2-adic division:
-     cl_2DS_div.cc         function cl_recip_suitable
-   square root:
-     cl_DS_sqrt.cc         function cl_recipsqrt_suitable
-     cl_LF_sqrt.cc         "if (len > ...)"
-   gcd:
-     cl_I_gcd.cc           cl_gcd_double_threshold
-   binary->decimal conversion:
-     cl_I_to_digits.cc     cl_digits_div_threshold
-   pi:
-     cl_LF_pi.cc           best of 4 algorithms
-   exp, log:
-     cl_F_expx.cc          factor limit_slope of isqrt(d)
-     cl_R_exp.cc           inside function exp
-     cl_R_ln.cc            inside function ln
-   eulerconst:
-     cl_LF_eulerconst.cc   function compute_eulerconst
-   sin, cos, sinh, cosh:
-     cl_F_sinx.cc          factor limit_slope of isqrt(d)
-     cl_R_sin.cc           inside function sin
-     cl_R_cos.cc           inside function cos
-     cl_R_cossin.cc        inside function cl_cos_sin
-     cl_F_sinhx.cc         factor limit_slope of isqrt(d)
-     cl_R_sinh.cc          inside function sinh
-     cl_R_cosh.cc          inside function cosh
-     cl_R_coshsinh.cc      inside function cl_cosh_sinh
-     cl_F_atanx.cc         factor limit_slope of isqrt(d)
-     cl_F_atanx.cc         inside function atanx
-     cl_F_atanhx.cc        factor limit_slope of isqrt(d)
-     cl_F_atanhx.cc        inside function atanhx
+    multiplication:
+      cl_DS_mul.cc          karatsuba_threshold
+      cl_DS_mul.cc          function cl_fftm_suitable
+    division:
+      cl_DS_div.cc          function cl_recip_suitable
+    2-adic reciprocal:
+      cl_2DS_recip.cc       recip2adic_threshold
+    2-adic division:
+      cl_2DS_div.cc         function cl_recip_suitable
+    square root:
+      cl_DS_sqrt.cc         function cl_recipsqrt_suitable
+      cl_LF_sqrt.cc         "if (len > ...)"
+    gcd:
+      cl_I_gcd.cc           cl_gcd_double_threshold
+    binary->decimal conversion:
+      cl_I_to_digits.cc     cl_digits_div_threshold
+    pi:
+      cl_LF_pi.cc           best of 4 algorithms
+    exp, log:
+      cl_F_expx.cc          factor limit_slope of isqrt(d)
+      cl_R_exp.cc           inside function exp
+      cl_R_ln.cc            inside function ln
+    eulerconst:
+      cl_LF_eulerconst.cc   function compute_eulerconst
+    sin, cos, sinh, cosh:
+      cl_F_sinx.cc          factor limit_slope of isqrt(d)
+      cl_R_sin.cc           inside function sin
+      cl_R_cos.cc           inside function cos
+      cl_R_cossin.cc        inside function cl_cos_sin
+      cl_F_sinhx.cc         factor limit_slope of isqrt(d)
+      cl_R_sinh.cc          inside function sinh
+      cl_R_cosh.cc          inside function cosh
+      cl_R_coshsinh.cc      inside function cl_cosh_sinh
+      cl_F_atanx.cc         factor limit_slope of isqrt(d)
+      cl_F_atanx.cc         inside function atanx
+      cl_F_atanhx.cc        factor limit_slope of isqrt(d)
+      cl_F_atanhx.cc        inside function atanhx
 
 
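
Note on item 2: the file says the break-even points have to be determined experimentally but does not say how. The following is a minimal sketch of one possible measurement harness, not code taken from CLN. All names in it (Cost, median_seconds, timed, find_break_even) are hypothetical, and it assumes the two algorithms cross over exactly once in the tested range, which real timings may violate because of noise.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical type: maps a problem size (e.g. a digit count) to a cost in seconds.
using Cost = std::function<double(std::size_t)>;

// Runs `run` several times and returns the median wall-clock time in seconds.
// The median is used because it is less sensitive to outliers than the mean.
template <class F>
double median_seconds(F&& run, int repeats = 5) {
    std::vector<double> samples;
    for (int i = 0; i < repeats; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        run();
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}

// Wraps a routine that takes a problem size into a Cost by timing it.
template <class F>
Cost timed(F routine) {
    return [routine](std::size_t n) {
        return median_seconds([&] { routine(n); });
    };
}

// Returns the smallest n in [lo, hi] at which `large_algo` is no slower than
// `small_algo`, found by bisection. Assumes a single crossover in the range.
// Returns hi + 1 if `large_algo` is still slower at n == hi.
std::size_t find_break_even(const Cost& small_algo, const Cost& large_algo,
                            std::size_t lo, std::size_t hi) {
    assert(lo <= hi);
    if (large_algo(hi) > small_algo(hi)) return hi + 1;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (large_algo(mid) <= small_algo(mid))
            hi = mid;        // crossover is at mid or below
        else
            lo = mid + 1;    // crossover is above mid
    }
    return lo;
}
```

To use it, one would wrap the two competing routines with timed() and pass them to find_break_even(); the result is a candidate value for a threshold such as karatsuba_threshold. Since later entries in the list build on earlier ones (for example, division relies on multiplication), earlier thresholds should be fixed before later ones are measured, which is presumably why the file prescribes an order.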