Ok, so the original plan was to make mpi_inv_mod() the smallest block that
could not be divided. Updated plan is that the smallest block will be either:
- ecp_normalize_jac_many() (one mpi_inv_mod() + a number or mpi_mul_mpi()s)
- or the second loop in ecp_precompute_comb()
With default settings, the minimum non-restartable sequence is:
- for P-256: 222M
- for P-384: 341M
This is within a 2-3x factor of originally planned value of 120M. However,
that value can be approached, at the cost of some performance, by setting
ECP_WINDOW_SIZE (w below) lower than the default of 6. For example:
- w=4 -> 166M for any curve (perf. impact < 10%)
- w=2 -> 130M for any curve (perf. impact ~ 30%)
My opinion is that the current state with w=4 is a good compromise, and the
code complexity need to attain 120M is not warranted by the 1.4 factor between
that and the current minimum with w=4 (which is close to optimal perf).
We'll need to store MPIs and other things that allocate memory in this
context, so we need a place to free it. We can't rely on doing it before
returning from ecp_mul() as we might return MBEDTLS_ERR_ECP_IN_PROGRESS (thus
preserving the context) and never be called again (for example, TLS handshake
aborted for another reason). So, ecp_group_free() looks like a good place to
do this, if the restart context is part of struct ecp_group.
This means it's not possible to use the same ecp_group structure in different
threads concurrently, but:
- that's already the case (and documented) for other reasons
- this feature is precisely intended for environments that lack threading
An alternative option would be for the caller to have to allocate/free the
restart context and pass it explicitly, but this means creating new functions
that take a context argument, and putting a burden on the user.
The plan is to count basic operations as follows:
- call to ecp_add_mixed() -> 11
- call to ecp_double_jac() -> 8
- call to mpi_mul_mpi() -> 1
- call to mpi_inv_mod() -> 120
- everything else -> not counted
The counts for ecp_add_mixed() and ecp_double_jac() are based on the actual
number of calls to mpi_mul_mpi() they they make.
The count for mpi_inv_mod() is based on timing measurements on K64F and
LPC1768 boards, and are consistent with the usual very rough estimate of one
inversion = 100 multiplications. It could be useful to repeat that measurement
on a Cortex-M0 board as those have smaller divider and multipliers, so the
result could be a bit different but should be the same order of magnitude.
The documented limitation of 120 basic ops is due to the calls to mpi_inv_mod()
which are currently not interruptible nor planned to be so far.
As noted in #557, several functions use 'index' resp. 'time'
as parameter names in their declaration and/or definition, causing name
conflicts with the functions in the C standard library of the same
name some compilers warn about.
This commit renames the arguments accordingly.