Quadratic sieve tuning #2712

Merged: fredrik-johansson merged 1 commit into flintlib:main from fredrik-johansson:qsieve, Jun 7, 2026
fredrik-johansson (Collaborator) commented Jun 7, 2026

Summary:

  • We add a tuning program for the quadratic sieve (qsieve/tune/tune-qsieve). This takes a specific bit size as input, starts with the current tuning parameters for this bit size, and does a local search for improvements. The tuning program was coded with Claude.

  • To simplify the tuning implementation, we add the function qsieve_factor_with_tune, which takes tuning parameters as input. qsieve_factor is now just a wrapper around it that looks up the default parameters.

  • We update the default tuning table up to 160 bits with parameters found by tune-qsieve. The old tuning values were highly suboptimal; the new ones speed up fmpz_factor by roughly a factor of two below 128 bits (from about 78 bits upward in the benchmark below).

  • Another minor speedup to qsieve_factor for small factorisations: avoiding an unnecessary memset.
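The wrapper arrangement described above can be sketched as follows. This is only an illustration in Python, not the FLINT C code: the table values, the five-tuple layout, and the names `TUNE_TABLE`, `lookup_default_params`, `factor_with_tune` and `factor` are all hypothetical, and the "factoring" function is a stand-in that merely reports which parameters it was given.

```python
from bisect import bisect_left

# Hypothetical tuning table: (max_bits, five tuning parameters).
# The numbers are invented for illustration and are NOT FLINT's values.
TUNE_TABLE = [
    (80, (1, 2, 3, 4, 5)),
    (120, (2, 4, 6, 8, 10)),
    (160, (3, 6, 9, 12, 15)),
]


def lookup_default_params(bits):
    """Return the parameters of the first row whose max_bits >= bits.

    Inputs larger than the last row reuse the last row.
    """
    keys = [max_bits for max_bits, _ in TUNE_TABLE]
    i = min(bisect_left(keys, bits), len(TUNE_TABLE) - 1)
    return TUNE_TABLE[i][1]


def factor_with_tune(n, params):
    """Stand-in for the tunable entry point: takes explicit parameters."""
    return (n, params)


def factor(n):
    """Stand-in for the default entry point: look up parameters, then delegate."""
    return factor_with_tune(n, lookup_default_params(n.bit_length()))
```

With this split, a tuning program can call the explicit-parameter entry point directly with candidate values, while ordinary callers keep using the default entry point unchanged.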

Some notes about the tuning process:

  • In practice, it seems that there are many local optima, and the tuning program will typically arrive at very different final parameters when started from very different initial values. Finding global optima could be quite expensive as there are five independent parameters. Fortunately, different local optima seem to be pretty close to each other in performance. You may note that the new tuning table has a sharp discontinuity in values at the crossover point between the old and new tuning values (160 bits), but either set of parameters yields essentially the same performance (within 10%) at the crossover point.

  • I didn't continue tuning for larger bit sizes since this starts to take a lot of time. Also, from this size on, further tuning should probably distinguish between single-threaded and multi-threaded use (the current values seem OK for either).
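To illustrate why different starting points can end at different local optima, here is a minimal coordinate-wise local search in Python. It is a sketch under assumptions, not the algorithm used by tune-qsieve: the function `local_search`, the step sizes, and the toy cost function are hypothetical. In real tuning the cost would be a measured factoring time rather than a formula.

```python
def local_search(cost, start, steps, max_rounds=100):
    """Greedy coordinate search: try +/- step on each parameter in turn
    and keep any change that lowers the cost; stop when a full round
    over all parameters yields no improvement."""
    best = tuple(start)
    best_cost = cost(best)
    for _ in range(max_rounds):
        improved = False
        for i, step in enumerate(steps):
            for delta in (-step, step):
                cand = best[:i] + (best[i] + delta,) + best[i + 1:]
                c = cost(cand)
                if c < best_cost:
                    best, best_cost = cand, c
                    improved = True
        if not improved:
            break
    return best, best_cost


def toy_cost(params):
    """Toy cost with two basins per parameter (near 10 and near 30)."""
    return sum(min((x - 10) ** 2, (x - 30) ** 2 + 1) for x in params)


# Two different starting points for the five parameters.
low_start = local_search(toy_cost, (0,) * 5, (1,) * 5)
high_start = local_search(toy_cost, (40,) * 5, (1,) * 5)
```

In this toy setup the two runs settle in different basins with different final costs, which mirrors the observation that the final parameters depend strongly on the initial values.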

Performance benchmark: total time to factor N random semiprimes of the given bit size with fmpz_factor (speedup = old / new):

N = 1000
                        old        new    speedup
      75 bits         2.479      2.479    1.000
      76 bits         3.676      3.674    1.001
      77 bits         3.675      3.696    0.994
      78 bits         5.466      2.943    1.857
      79 bits         5.255      2.939    1.788
      80 bits          8.13       3.17    2.565
      81 bits         7.194      3.107    2.315
      82 bits         8.622      3.524    2.447
      83 bits         8.676      3.531    2.457
      84 bits        10.094      3.604    2.801
      85 bits         9.937      3.541    2.806
      86 bits         9.061      2.513    3.606
      87 bits         9.256       2.64    3.506
      88 bits         7.995      2.571    3.110
      89 bits         8.209      2.564    3.202
      90 bits         7.398      2.735    2.705
      91 bits         7.465       2.74    2.724
      92 bits         7.422      2.967    2.502
      93 bits         7.377      2.956    2.496
      94 bits         8.013      3.549    2.258
      95 bits         8.013      3.204    2.501
      96 bits          8.72      4.442    1.963
      97 bits         8.682      3.756    2.312
      98 bits         9.576      5.738    1.669
      99 bits         9.569      5.557    1.722
     100 bits         9.407      4.941    1.904
     101 bits         9.377      4.958    1.891
     102 bits         9.524       4.69    2.031
     103 bits         9.611        4.7    2.045
     104 bits        10.494      5.158    2.035
     105 bits        10.468      5.129    2.041
     106 bits        11.253       5.52    2.039
     107 bits        11.336      5.563    2.038
     108 bits         12.33      6.213    1.985
     109 bits         12.38      6.245    1.982
     110 bits        13.587      7.084    1.918
     111 bits         13.62      7.104    1.917
     112 bits        14.987       7.86    1.907
     113 bits        15.017      7.849    1.913
     114 bits        16.863      8.724    1.933
     115 bits        16.966      8.741    1.941
     116 bits        19.103      9.365    2.040
     117 bits        18.971      9.342    2.031
     118 bits        21.398     10.559    2.027
N = 100
     119 bits         2.083      1.041    2.001
     120 bits         2.098       1.24    1.692
     121 bits          2.08      1.234    1.686
     122 bits         2.164      1.402    1.544
     123 bits         2.125      1.388    1.531
     124 bits         2.565      1.619    1.584
     125 bits         2.602      1.671    1.557
     126 bits         2.866      1.883    1.522
     127 bits         2.821      1.828    1.543
     128 bits          3.13      2.335    1.340
     129 bits         3.051      2.216    1.377
     130 bits         3.004      2.486    1.208
     131 bits         2.983      2.418    1.234
     132 bits         2.933      2.582    1.136
     133 bits         2.903      2.581    1.125
     134 bits         3.371      3.043    1.108
     135 bits         3.416      3.086    1.107
     136 bits         3.839      3.125    1.228
     137 bits         3.824      3.134    1.220
     138 bits         4.453      3.772    1.181
     139 bits         4.552      3.918    1.162
     140 bits         4.866      4.211    1.156
     141 bits         4.897      4.211    1.163
     142 bits         5.283      4.674    1.130
     143 bits         5.299      4.716    1.124
     144 bits         6.026      5.354    1.126
     145 bits         5.824      5.317    1.095
     146 bits         7.005      5.619    1.247
     147 bits          6.97      6.547    1.065
     148 bits         8.115      6.306    1.287
     149 bits         8.326      6.445    1.292
     150 bits         9.076      7.471    1.215
     151 bits         9.103      7.865    1.157
     152 bits         9.879      8.813    1.121
     153 bits         9.704      8.842    1.097
     154 bits        11.052     11.014    1.003
N = 10
     155 bits         1.107      1.085    1.020
     156 bits         1.312      1.205    1.089
     157 bits         1.296      1.128    1.149
     158 bits         1.436      1.283    1.119
     159 bits         1.541      1.354    1.138
     160 bits         1.699       1.62    1.049
     161 bits         1.826      1.783    1.024
     162 bits         2.244      2.241    1.001
     163 bits         2.136      2.127    1.004
     164 bits         2.323      2.314    1.004
     165 bits         2.395      2.391    1.002
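As a reading aid for the table, the speedup column is the ratio old / new, and timings from blocks with different N are only comparable after dividing by N. The short Python check below recomputes a few rows copied from the table; the helper names `speedup` and `per_input` are introduced here for illustration, and the time unit is left unspecified because the table does not state it.

```python
# (bits, N, old_time, new_time) copied from the benchmark table above.
rows = [
    (78, 1000, 5.466, 2.943),
    (86, 1000, 9.061, 2.513),
    (118, 1000, 21.398, 10.559),
    (119, 100, 2.083, 1.041),
    (154, 100, 11.052, 11.014),
    (155, 10, 1.107, 1.085),
]


def speedup(old, new):
    """Ratio of old to new total time, rounded like the table column."""
    return round(old / new, 3)


def per_input(total, n):
    """Average time per factored input, in the table's (unstated) unit."""
    return total / n


for bits, n, old, new in rows:
    print(bits, n, speedup(old, new), per_input(new, n))
```

Dividing by N shows, for example, that the drop in raw numbers between 118 bits (N = 1000) and 119 bits (N = 100) comes from the change in N rather than from the larger inputs being cheaper by a factor of ten.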

@fredrik-johansson fredrik-johansson merged commit 010d93b into flintlib:main Jun 7, 2026
13 checks passed