Fixed a bug that the little-endian Microblaze does not work when MBEDTLS_HAVE_ASM is defined.
Signed-off-by: Kazuyuki Kimura <kim@wing.ocn.ne.jp>
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Fixes#1910
With ebx added to the MULADDC_STOP clobber list to fix#1550, the inline
assembly fails to build with GCC < 5 in PIC mode with the following error:
include/mbedtls/bn_mul.h:46:13: error: PIC register clobbered by ‘ebx’ in ‘asm’
This is because older GCC versions treated the x86 ebx register (which is
used for the GOT) as a fixed reserved register when building as PIC.
This is fixed by an improved register allocator in GCC 5+. From the release
notes:
Register allocation improvements: Reuse of the PIC hard register, instead of
using a fixed register, was implemented on x86/x86-64 targets. This
improves generated PIC code performance as more hard registers can be used.
https://www.gnu.org/software/gcc/gcc-5/changes.html
As a workaround, detect this situation and disable the inline assembly,
similar to the MULADDC_CANNOT_USE_R7 logic.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Add memory constraints to the aarch64 inline assembly in MULADDC_STOP.
This fixes an issue where Clang 12 and 13 were generating
non-functional code on aarch64 platforms. See #4962, #4943
for further details.
Signed-off-by: David Horstmann <david.horstmann@arm.com>
MULADDC_CORE reads from (%%rsi) and writes to (%%rdi). This fragment is
repeated up to 16 times, and %%rsi and %%rdi are s and d on entry
respectively. Hence the complete asm statement reads 16 64-bit words
from memory starting at s, and writes 16 64-bit words starting at d.
Without any declaration of modified memory, Clang 12 and Clang 13 generated
non-working code for mbedtls_mpi_mod_exp. The constraints make the unit
tests pass with Clang 12.
Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
These macros were moved into a header and now check-names.sh is failing.
Add an MBEDTLS_ prefix to the macro names to make it pass.
Signed-off-by: Janos Follath <janos.follath@arm.com>
As a result, the copyright of contributors other than Arm is now
acknowledged, and the years of publishing are no longer tracked in the
source files.
Also remove the now-redundant lines declaring that the files are part of
MbedTLS.
This commit was generated using the following script:
# ========================
#!/bin/sh
# Find files
find '(' -path './.git' -o -path './3rdparty' ')' -prune -o -type f -print | xargs sed -bi '
# Replace copyright attribution line
s/Copyright.*Arm.*/Copyright The Mbed TLS Contributors/I
# Remove redundant declaration and the preceding line
$!N
/This file is part of Mbed TLS/Id
P
D
'
# ========================
Signed-off-by: Bence Szépkúti <bence.szepkuti@arm.com>
x0-x3 are skipped such that function parameters to not have to be moved.
MULADDC_INIT and MULADDC_STOP are mostly empty because it is more
efficient to keep everything in registers (and that should easily be
possible). I considered a MULADDC_HUIT implementation, but could not
think of something that would be more efficient than basically 8
consecutive MULADDC_CORE. You could combine the loads and stores, but
it's probably more efficient to interleave them with arithmetic,
depending on the specific microarchitecture. NEON allows to do a
64x64->128 bit multiplication (and optional accumulation) in one
instruction, but is not great at handling carries.
Resolve conflicts by performing the following actions:
- Reject changes to ChangeLog, as Mbed Crypto doesn't have one
- Reject changes to tests/compat.sh, as Mbed Crypto doesn't have it
- Reject changes to programs/fuzz/onefile.c, as Mbed Crypto doesn't have
it
- Resolve minor whitespace differences in library/ecdsa.c by taking the
version from Mbed TLS upstream.
* origin/development:
Honor MBEDTLS_CONFIG_FILE in fuzz tests
Test that a shared library build produces a dynamically linked executable
Test that the shared library build with CMake works
Add a test of MBEDTLS_CONFIG_FILE
Exclude DTLS 1.2 only with older OpenSSL
Document the rationale for the armel build
Switch armel build to -Os
Add a build on ARMv5TE in ARM mode
Add changelog entry for ARM assembly fix
bn_mul.h: require at least ARMv6 to enable the ARM DSP code
Adapt ChangeLog
ECP restart: Don't calculate address of sub ctx if ctx is NULL
Commit 16b1bd89326e "bn_mul.h: add ARM DSP optimized MULADDC code"
added some ARM DSP instructions that was assumed to always be available
when __ARM_FEATURE_DSP is defined to 1. Unfortunately it appears that
the ARMv5TE architecture (GCC flag -march=armv5te) supports the DSP
instructions, but only in Thumb mode and not in ARM mode, despite
defining __ARM_FEATURE_DSP in both cases.
This patch fixes the build issue by requiring at least ARMv6 in addition
to the DSP feature.
To help the build system find the correct include files, paths starting
with "mbedtls/" or "psa/" must be used. Otherwise, you can run into
build failures like the following when building Mbed Crypto as a
submodule.
In file included from chachapoly.c:31:0:
../../include/mbedtls/chachapoly.h:43:10: fatal error: poly1305.h: No such file or directory
#include "poly1305.h"
^~~~~~~~~~~~
compilation terminated.
Includes for ALT implementations are not modified, as the alt headers
are provided by system integrators and not Mbed TLS or Mbed Crypto.
Add inclusion to configration file in header files,
instead of relying on other header files to include
the configuration file. This issue resolves#1371
The Cortex M4, M7 MCUs and the Cortex A CPUs support the ARM DSP
instructions, and especially the umaal instruction which greatly
speed up MULADDC code. In addition the patch switched the ASM
constraints to registers instead of memory, giving the opportunity
for the compiler to load them the best way.
The speed improvement is variable depending on the crypto operation
and the CPU. Here are the results on a Cortex M4, a Cortex M7 and a
Cortex A8. All tests have been done with GCC 6.3 using -O2. RSA uses a
RSA-4096 key. ECDSA uses a secp256r1 curve EC key pair.
+--------+--------+--------+
| M4 | M7 | A8 |
+----------------+--------+--------+--------+
| ECDSA signing | +6.3% | +7.9% | +4.1% |
+----------------+--------+--------+--------+
| RSA signing | +43.7% | +68.3% | +26.3% |
+----------------+--------+--------+--------+
| RSA encryption | +3.4% | +9.7% | +3.6% |
+----------------+--------+--------+--------+
| RSA decryption | +43.0% | +67.8% | +22.8% |
+----------------+--------+--------+--------+
I ran the whole testsuite on the Cortex A8 Linux environment, and it
all passes.
Remove the trailing whitespace from the inline assembly for AMD64 target, to
overcome a warning in Clang, which was objecting to the string literal
generated by the inline assembly being greater than 4096 characters specified
by the ISO C99 standard. (-Woverlength-strings)
This is a cosmetic change and doesn't change the logic of the code in any way.
This change only fixes the problem for AMD64 target, and leaves other targets as
they are.
Fixes#482.
Yotta is no longer supported by Mbed TLS, so has been removed. Specifically, the
following changes have been made:
* references to yotta have been removed from the main readme and build
instructions
* the yotta module directory and build script has been removed
* yotta has been removed from test scripts such as all.sh and check-names.sh
* yotta has been removed from other files that that referenced it such as the
doxyfile and the bn_mul.h header
* yotta specific configurations and references have been removed from config.h
We don't compile in the assembly code if compiler optimisations are disabled as
the number of registers used in the assembly code doesn't work with the -O0
option. Also anyone select -O0 probably doesn't want to compile in the assembly
code anyway.
This fix adds the ebx register to the clobber list for the i386 inline assembly
for the multiply helper function.
ebx was used but not listed, so when the compiler chose to also use it, ebx was
getting corrupted. I'm surprised this wasn't spotted sooner.
Fixes Github issues #1550.
On x32, pointers are only 4-bytes wide and need to be loaded using the "movl"
instruction instead of "movq" to avoid loading garbage into the register.
The MULADDC routines for x86-64 are adjusted to work on x32 as well by getting
gcc to load all the registers for us in advance (and storing them later) by
using better register constraints. The b, c, D and S constraints correspond to
the rbx, rcx, rdi and rsi registers respectively.