A formalization and analysis of the security model established by the ppad-secp256k1 library.
ppad-secp256k1 is a Haskell library supporting primitives and digital signature schemes on the elliptic curve secp256k1, in the spirit of the well-known libsecp256k1 or noble-secp256k1 libraries. Its focus is on achieving the maximum security, speed, and ergonomics of use possible in the context of a garbage-collected language with an optimizing compiler.
This analysis builds on those performed for ppad-fixed, ppad-sha{256,512}, and ppad-hmac-drbg. Please refer to those for additional explanatory details on methodology, etc.
This analysis assumes the same adversary as was supposed for ppad-fixed, viz. one capable of:
As a purely software implementation, a trusted execution environment at the physical layer is assumed; considered out-of-scope are adversaries who can perform attacks over power or electromagnetic emanation-related side channels, speculative execution attacks, and so on.
Of note: the final capability granted to the adversary in the analysis for ppad-hmac-drbg, namely the ability to read arbitrary post-hoc heap state, is withheld here. That capability was granted specifically in that analysis to make guarantees about the effectiveness of wiping DRBG internal state, but an adversary who could read anything that was ever allocated on the heap would trivially be able to recover secret key material in this setting, since the 256-bit secret keys used here are, in general, boxed and heap-allocated.
The security property to be demonstrated here is the same as was defended in the analysis for ppad-fixed, adapted to the secp256k1 setting: namely that elliptic curve arithmetic, digital signature construction, and Diffie-Hellman operations run in constant time unless marked otherwise.
(N.b., in some situations, e.g. in signature verification, only public data is operated on, and so the use of variable-time EC scalar multiplication is perfectly acceptable.)
Again, this property can be stated formally: the observable execution trace t(f, x) procured by an attacker when evaluating a sensitive function f on inputs x, where the trace consists of the sequence of instructions executed, memory addresses accessed, and the size and count of allocations made, must be indistinguishable with respect to secret inputs. For secret inputs s0 and s1, and any public input p, the claim is, up to renaming of base pointers and stack addresses, and events of negligible probability, that:
t(f, (s0, p)) == t(f, (s1, p)).
The analysis performed here is specifically for ppad-secp256k1 v0.5.4, compiled with optimizations using the LLVM backend on GHC 9.10.3 and LLVM 19.1.7, targeting aarch64-darwin.
To be examined are the elliptic curve primitives, signature construction functions, and elliptic curve Diffie-Hellman function, first by way of representation and source examination, and then by LLM-assisted STG and assembly analysis. Empirical benchmarks and allocation measurements will then be shown to be unable to refute the constant-time claim.
Note that whilst the security property to be defended is stated formally, this analysis purposefully argues for it as more of an “exceptionally rigorous sanity check” than an exhaustive proof, which would require considerably more effort to construct.
Points on secp256k1 are represented in terms of both affine and projective coordinates, where individual coordinates in both cases are 256-bit Montgomery-form words provided by ppad-fixed. Each is boxed and heap-allocated:
-- curve point, affine coordinates
data Affine = Affine !C.Montgomery !C.Montgomery
deriving stock (Show, Generic)
-- curve point, projective coordinates
data Projective = Projective {
px :: !C.Montgomery
, py :: !C.Montgomery
, pz :: !C.Montgomery
}
deriving stock (Show, Generic)
The functions in the library’s exported API only work on these boxed point representations, but, to avoid heap allocation, most internal computations are carried out on raw unboxed tuples:
-- Unboxed Wider/Montgomery synonym.
type Limb4 = (# Limb, Limb, Limb, Limb #)
-- Unboxed Projective synonym.
type Proj = (# Limb4, Limb4, Limb4 #)
At the algorithmic level, elliptic curve primitives are carefully
written so as not to depend on inputs, unless marked otherwise. Point
addition is carried out in terms of “Algorithm 7” of Renes et al., 2015,
which runs in constant time, and scalar multiplication is implemented
using a straightforward constant-time double-and-add algorithm. A
constant-time 'select_proj#' function is the CT workhorse
used in the latter, itself wrapping constant-time utilities supplied by
ppad-fixed:
select_proj# :: Proj -> Proj -> CT.Choice -> Proj
select_proj# (# ax, ay, az #) (# bx, by, bz #) c =
(# W.select# ax bx c, W.select# ay by c, W.select# az bz c #)
{-# INLINE select_proj# #-}
The library also supports wNAF windowed scalar multiplication, which involves using a precomputed “context” containing a number of multiples of the secp256k1 generator point as a lookup table. Its implementation, which is relatively complex, requires particular care, as we require that memory access patterns not depend on secrets. It makes use of the following low-level indexing utility that ensures that lookups in the provided ByteArray context are performed in constant time, and with input-independent access:
-- Constant-time table lookup within a window.
--
-- Unconditionally scans all entries from 'base' to 'base + size - 1',
-- selecting the one where 'index' equals 'target'.
ct_index_proj#
:: ByteArray
-> Exts.Word# -- ^ base index
-> Exts.Word# -- ^ size of window
-> Exts.Word# -- ^ target index
-> Proj
ct_index_proj# arr base size target = loop 0## (# Z, Z, Z #) where
loop i acc
| Exts.isTrue# (i `Exts.geWord#` size) = acc
| otherwise =
let !idx = Exts.plusWord# base i
!pt = index_proj# arr (Exts.word2Int# idx)
!eq = CT.from_word_eq# idx target
!nacc = select_proj# acc pt eq
in loop (Exts.plusWord# i 1##) nacc
{-# INLINE ct_index_proj# #-}
In the same way as was described in the analysis for ppad-fixed, the STG and ASM can be audited to verify that no variable time patterns or instructions are present in the IR or compile targets. The compiled STG and ASM for this library are both very large and complex (the STG is 100k lines, and the ASM over 200k), and so are productively examined via an LLM with manual spot checks. A pass made via Opus 4.5 produces the following, for example:
Function: add_proj
STG Pattern: No secret-dependent branching
ASM Pattern: Arithmetic only
Status: CT
────────────────────────────────────────
Function: double
STG Pattern: No secret-dependent branching
ASM Pattern: Arithmetic only
Status: CT
────────────────────────────────────────
Function: neg
STG Pattern: No secret-dependent branching
ASM Pattern: Arithmetic only
Status: CT
────────────────────────────────────────
Function: mul
STG Pattern: xor#/and#/xor# branchless select
ASM Pattern: csel, cset, csinc
Status: CT
────────────────────────────────────────
Function: mul_wnaf
STG Pattern: Branchless CT table lookup via ct_index_proj#
ASM Pattern: Masked indexing
Status: CT
The ‘mul’ STG, for example, is approximately 5500 lines of very heavily inlined code, but is easily seen to match the claim, consisting almost entirely of expressions like the following:
[..]
case plusWord2# [s469 sat] of wild781 {
(#,#) _ s470 ->
case mul# s464 s466 s468 s470 s429 s431 s433 s435 of y3b {
(#,,,#) ipv174 ipv175 ipv176 ipv177 ->
case plusWord2# [ipv168 ipv174] of wild782 {
(#,#) c471 s471 ->
case plusWord2# [ipv169 ipv175] of wild783 {
(#,#) c472 s472 ->
case plusWord2# [s472 c471] of wild784 {
(#,#) c473 s473 ->
case plusWord2# [ipv170 ipv176] of wild785 {
(#,#) c474 s474 ->
case or# [c472 c473] of sat {
__DEFAULT ->
case plusWord2# [s474 sat] of wild786 {
(#,#) c475 s475 ->
case plusWord2# [ipv171 ipv177] of wild787 {
(#,#) c476 s476 ->
case or# [c474 c475] of sat {
[..]
Its CT selection machinery in terms of bitwise XOR/AND operations is also clearly visible, though large and inlined:
[..]
case or# [sat sat] of sat {
__DEFAULT ->
case xor# [s351 us] of sat {
__DEFAULT ->
case and# [ipv182 sat] of sat {
__DEFAULT ->
case xor# [s351 sat] of sat {
__DEFAULT ->
case xor# [s349 us] of sat {
__DEFAULT ->
case and# [ipv182 sat] of sat {
__DEFAULT ->
[..]
The same patterns are visible in the aarch64 assembly, where
branchless selection occurs via csel/cset/eor/and sequences. On the
other hand, Opus 4.5 is able to catch that variable-time scalar
multiplication, for example, does not run in constant time
based on its ASM, pointing out the presence of the ‘cbz’ (for compare and branch
on zero) and ‘tbz’ (for test bit and branch
if zero) instructions in the following loop body of
'mul_vartime':
__blk_cGeg$def: ; @"_blk_cGeg$def"
; %bb.0: ; %nGQY
stp x22, x20, [sp]
ldp x8, x9, [x20, #192]
str x8, [sp, #88]
ldp x10, x11, [x20, #208]
orr x10, x10, x11
orr x8, x8, x9
orr x8, x8, x10
cbz x8, LBB905_3
; %bb.1: ; %cGex
ldr x11, [x20, #96]
ldr x8, [sp, #8]
ldp x9, x10, [x8, #104]
stp x9, x11, [sp, #72]
ldr x9, [x8, #120]
stp x9, x10, [sp, #56]
ldrb w9, [sp, #88]
tbz w9, #0, LBB905_4
Opus 4.5 also audits the 'select_proj' primitive as
running in constant time:
Function: select_proj
STG Pattern: Calls W.select#
ASM Pattern: Branchless via ppad-fixed
Status: CT
ppad-secp256k1 supports both BIP340 Schnorr signatures and RFC6979 ECDSA (the latter with support for both unrestricted and low-S signatures). Hashing functionality is provided by ppad-sha256, and DRBG functionality (for deterministic nonce generation) is provided by ppad-hmac-drbg.
Schnorr signature creation is implemented via several wrappers over the following core:
_sign_schnorr
:: (Wider -> Maybe Projective) -- partially-applied multiplication function
-> Wider -- secret key
-> BS.ByteString -- message
-> BS.ByteString -- 32 bytes of auxiliary random data
-> Maybe BS.ByteString
_sign_schnorr _mul _SECRET m a = do
p <- _mul _SECRET
let Affine (C.retr -> x_p) (C.retr -> y_p) = affine p
s = S.to _SECRET
d = S.select s (negate s) (W.odd y_p)
bytes_d = unroll32 (S.retr d)
bytes_p = unroll32 x_p
t = xor bytes_d (hash_aux a)
rand = hash_nonce (t <> bytes_p <> m)
k' = S.to (unsafe_roll32 rand)
guard (not (S.eq_vartime k' 0)) -- negligible probability
pt <- _mul (S.retr k')
let Affine (C.retr -> x_r) (C.retr -> y_r) = affine pt
k = S.select k' (negate k') (W.odd y_r)
bytes_r = unroll32 x_r
rand' = hash_challenge (bytes_r <> bytes_p <> m)
e = S.to (unsafe_roll32 rand')
bytes_ked = unroll32 (S.retr (k + e * d))
sig = bytes_r <> bytes_ked
pure $! sig
{-# INLINE _sign_schnorr #-}
Note that secret keys are represented as four-limb, 256-bit ‘Wider’
values from ppad-fixed. The implementation is straightforward,
following the BIP340 spec, and uses the selection primitive ‘S.select’
from ppad-fixed for constant-time operations. The ‘unroll32’
helper used to convert to a bytestring representation is implemented by
way of a fixed, unrolled loop that runs in constant time, and
'unsafe_roll32' performs the inverse operation in the same
fashion. Note that a variable-time equality check is performed only for
a negligible-probability degenerate case that does not occur in practice
(equivalent to producing an all-zero 256-bit hash).
ECDSA signature creation similarly proceeds like so:
_sign_ecdsa
:: (Wider -> Maybe Projective) -- partially-applied multiplication function
-> SigType
-> HashFlag
-> Wider
-> BS.ByteString
-> Maybe ECDSA
_sign_ecdsa _mul ty hf _SECRET m = runST $ do
-- RFC6979 sec 3.3a
let entropy = int2octets _SECRET
nonce = bits2octets h
drbg <- DRBG.new entropy nonce mempty
-- RFC6979 sec 2.4
sign_loop drbg
where
d = S.to _SECRET
hm = S.to (bits2int h)
h = case hf of
Hash -> SHA256.hash m
NoHash -> m
sign_loop g = do
k <- gen_k g
let mpair = do
kg <- _mul k
let Affine (S.to . C.retr -> r) _ = affine kg
ki = S.inv (S.to k)
s = (hm + d * r) * ki
pure $! (S.retr r, S.retr s)
case mpair of
Nothing -> do
DRBG.wipe g
pure Nothing
Just (r, s)
| W.eq_vartime r 0 -> sign_loop g -- negligible probability
| otherwise -> do
DRBG.wipe g
let !sig = Just $! ECDSA r s
pure $ case ty of
Unrestricted -> sig
LowS -> fmap low sig
{-# INLINE _sign_ecdsa #-}
-- RFC6979 sec 3.3b
gen_k :: DRBG.DRBG s -> ST s Wider
gen_k g = loop g where
loop drbg = do
bytes <- DRBG.gen drbg mempty (fi _CURVE_Q_BYTES)
case bytes of
Left {} -> error "ppad-secp256k1: internal error (please report a bug!)"
Right bs -> do
let can = bits2int bs
case W.cmp_vartime can _CURVE_Q of
LT -> pure can
_ -> loop drbg -- 2 ^ -128 probability
{-# INLINE gen_k #-}
This is standard RFC6979 ECDSA, with timing guarantees provided primarily by way of ppad-fixed’s constant-time field arithmetic and the elliptic curve primitives described previously. The RFC6979 helpers ‘bits2int’, ‘bits2octets’ and ‘int2octets’ are all implemented as fixed-schedule unrolled loops, and note again that variable-time operations are only used in negligible-probability degenerate cases that do not occur in practice.
Opus 4.5’s audit finds both the Schnorr and ECDSA signature creation functions to be constant time by way of STG/ASM analysis:
Function: sign_schnorr
STG Pattern: Uses mul internally
ASM Pattern: Inherits CT
Status: CT
────────────────────────────────────────
Function: sign_schnorr'
STG Pattern: Uses mul_wnaf internally
ASM Pattern: Inherits CT
Status: CT
────────────────────────────────────────
Function: sign_ecdsa
STG Pattern: Uses mul internally
ASM Pattern: Inherits CT
Status: CT
────────────────────────────────────────
Function: sign_ecdsa'
STG Pattern: Uses mul_wnaf internally
ASM Pattern: Inherits CT
Status: CT
Note that the apostrophe-suffixed versions use
'mul_wnaf' internally, requiring a precomputed context to
be passed. Signature verification functions are not required to
run in constant time, operating only on public inputs:
Function: verify_schnorr
Pattern: Uses mul_vartime for public point
Notes: Non-secret data
────────────────────────────────────────
Function: verify_schnorr'
Pattern: Uses mul_wnaf for G, mul_vartime for pubkey
Notes: Non-secret
────────────────────────────────────────
Function: verify_ecdsa
Pattern: Uses mul_vartime for public point
Notes: Non-secret data
────────────────────────────────────────
Function: verify_ecdsa'
Pattern: Uses mul_wnaf for G, mul_vartime for pubkey
Notes: Non-secret
The ECDH implementation is straightforward, its constant-time properties established primarily by the use of the CT scalar multiplication function:
ecdh
:: Projective -- ^ public key
-> Wider -- ^ secret key
-> Maybe BS.ByteString -- ^ shared secret
ecdh pub _SECRET = do
pt@(P _ _ (C.Montgomery -> z)) <- mul pub _SECRET
let !(Affine (C.retr -> x) _) = affine pt
!result = SHA256.hash (unroll32 x)
if CT.decide (C.eq z 0) then Nothing else Just result
It clearly has constant time semantics for nondegenerate inputs, and Opus 4.5 audits it as running in constant-time by way of STG and ASM analysis for good measure:
Function: ecdh
STG Pattern: Uses mul internally
ASM Pattern: Inherits CT
Status: CT
ppad-secp256k1 has a Criterion benchmark suite that produces
statistical execution time benchmarks on varying-sized inputs, such that
variation in wall-clock time should be readily attributable to noise.
For brevity, only derive_pub/wnaf, which is a synonym for
mul_wnaf, is shown of the elliptic curve primitives:
benchmarking derive_pub/wnaf, sk = 2
time 29.67 μs (29.64 μs .. 29.71 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 29.68 μs (29.64 μs .. 29.71 μs)
std dev 121.1 ns (83.80 ns .. 177.2 ns)
benchmarking derive_pub/wnaf, sk = 2 ^ 255 - 19
time 29.68 μs (29.65 μs .. 29.72 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 29.71 μs (29.68 μs .. 29.75 μs)
std dev 106.7 ns (71.74 ns .. 174.7 ns)
benchmarking schnorr/sign_schnorr' (small)
time 76.27 μs (76.21 μs .. 76.32 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 76.44 μs (76.40 μs .. 76.50 μs)
std dev 162.3 ns (123.1 ns .. 246.7 ns)
benchmarking schnorr/sign_schnorr' (large)
time 76.35 μs (76.31 μs .. 76.38 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 76.37 μs (76.35 μs .. 76.40 μs)
std dev 84.10 ns (67.03 ns .. 112.7 ns)
benchmarking ecdsa/sign_ecdsa' (small)
time 43.95 μs (43.89 μs .. 44.02 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 44.04 μs (43.96 μs .. 44.15 μs)
std dev 310.2 ns (259.9 ns .. 384.6 ns)
benchmarking ecdsa/sign_ecdsa' (large)
time 44.02 μs (44.00 μs .. 44.03 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 43.99 μs (43.97 μs .. 44.01 μs)
std dev 58.86 ns (46.66 ns .. 72.16 ns)
benchmarking ecdh/ecdh (small)
time 143.6 μs (143.4 μs .. 143.7 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 143.7 μs (143.3 μs .. 144.6 μs)
std dev 2.022 μs (846.9 ns .. 3.402 μs)
benchmarking ecdh/ecdh (large)
time 143.8 μs (143.7 μs .. 143.9 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 143.8 μs (143.7 μs .. 143.9 μs)
std dev 385.2 ns (265.9 ns .. 544.5 ns)
A ‘weigh’ benchmark suite similarly measures heap allocation performed by each function, and verifies that allocation is held constant across input sizes for all sensitive operations:
derive_pub
Case Allocated GCs
wnaf, sk = 2 312 0
wnaf, sk = 2 ^ 255 - 19 312 0
schnorr
Case Allocated GCs
sign_schnorr' (small) 14,416 0
sign_schnorr' (large) 14,416 0
ecdsa
Case Allocated GCs
sign_ecdsa' (small) 1,560 0
sign_ecdsa' (large) 1,560 0
ecdh
Case Allocated GCs
ecdh (small) 616 0
ecdh (large) 616 0
ppad-secp256k1’s test suite exercises the Schnorr signature implementation against the official BIP340 vectors, and the ECDSA and ECDH implementations against the relevant Wycheproof vectors (the former also being tested against noble-secp256k1’s vectors).
The implementations are highly-optimized, so the full test suite executes very quickly:
All 3757 tests passed (0.62s)
Test suite secp256k1-tests: PASS
By our appraisal, all sensitive functions in this library will run in constant time on aarch64-darwin when compiled with optimizations with GHC 9.10.3 and LLVM 19.1.7, unless marked otherwise.
The following prompt was used to have Opus 4.5 audit the library via a Claude Code slash command:
Perform a constant-time audit of the Haskell library at: $ARGUMENTS
## Steps
1. **Identify exports**: Find all exported functions from the
library's public modules.
2. **Build with diagnostics**:
- Enter a Nix shell for the target library by running `nix develop`
in its subdirectory.
- `cabal clean && cabal -f+llvm build
--ghc-options="-ddump-stg-final -ddump-llvm -ddump-to-file
-dsuppress-all -dno-suppress-type-signatures -dsuppress-uniques"`
- Locate the `.dump-stg-final` and `.dump-llvm` files in the
dist-newstyle subdirectory
3. **Generate assembly**:
- Strip GHC headers from the `.dump-llvm` files with e.g. `sed -i
's/^====.*$//g'`
- Run results through `llc -O3 -mcpu=native -filetype=asm`
4. **Analyze each exported function**:
**STG**: Check for input-dependent branching:
- Single-branch `case` = sequencing (OK)
- Multi-branch `case` on loop counter = OK
- Multi-branch `case` on input data = FAIL
- `let` bindings = allocation (note but OK if constant)
**ASM (aarch64)**: Flag variable-time patterns:
- `b.eq`, `b.ne`, `b.hs`, etc. on secret data = FAIL
- `cbz`, `cbnz`, `tbnz`, `tbz` on secret data = FAIL
- `adds`/`cset`/`orr`/`and`/`bic`/`sbfx` = OK (branchless)
**ASM (aarch64)**: Flag secret-dependent indexing patterns:
- e.g. `stp x25, x26, [sp, #40]` = OK (public stack pointer plus
fixed immediate offset)
- e.g. `ldr x0, [x1, x2, lsl #3]` = FAIL (if x2 is secret-derived,
implies cache timing leak)
5. **Generate report**: For each function, report CT status and any
issues found.
Functions suffixed `_vartime` are expected to be variable-time; note
but don't flag.