Static Constant-Time Assembly Analysis

The security analyses for ppad-fixed and ppad-secp256k1 defended the constant-time security properties of those two libraries, seeking to rule out that any variable timing-based exploits were present in the ultimate compile targets. This was demonstrated by analysis of data representation & algorithm choices, as well as STG IR & assembly examination.

The assembly analysis in particular relied on manual and LLM-assisted examination of the LLVM backend-produced assembly; as I described in the ppad-secp256k1 analysis, this makes for a good “particularly rigorous sanity check,” but it doesn’t constitute an exhaustive, deterministic, mechanical proof. Though LLMs are capable of analyzing large amounts of complicated structured data, I don’t trust them to pick up every last detail in hundreds of thousands of lines of dense assembly.

One wants to sleep soundly at night, knowing his stuff is not exploitable by a determined adversary, at least in commonly-vulnerable ways. So to construct something closer to the aforementioned proofs, I developed a tool, ppad-auditor, for static constant-time analysis of GHC-produced ARM (aarch64) assembly, produced via either the LLVM or NCG (i.e. native code generator) backends.

ppad-auditor works with the compiled assembly in blocks, scanning for the presence of data-dependent-time instructions while filtering out fixed, known GHC runtime patterns that would otherwise introduce substantial false-positive noise. It still does not construct proofs, but it makes assembly analysis much more surgical and focused. The result is the ability to very reliably audit GHC-produced assembly for the presence of non-constant-time patterns (and thus to sleep easier at night).

Design

The basic idea, as was done in the ppad-{fixed,secp256k1} security analyses, is to flag instructions or patterns that are known to be non-constant-time, e.g. conditional or indirect data-dependent branches, division instructions, and loads & stores from/to data-dependent addresses. So, to reuse an example from the ppad-fixed analysis, the following assembly should be flagged due to the presence of the conditional branch (b.ne) instructions:

  ; %bb.0:
    cmp x8, x9
    b.ne  LBB31_4
  ; %bb.1:
    cmp x8, x9
    b.ne  LBB31_4
  ; %bb.2:
    cmp x8, x9
    b.ne  LBB31_4
  ; %bb.3:
    add x20, x8, #40

Similarly, the following “store pair” instruction should be fine, since it stores its arguments to a public stack pointer plus public offset:

  stp x25, x26, [sp, #40]

.. but the following should be flagged, since the store address depends on a register with potentially unknown provenance:

  stp x25, x26, [x1, #16]

In many cases instructions like this are actually benign, but require additional knowledge of GHC’s calling convention and runtime patterns to filter out. The ppad-fixed analysis also provided this example of loading from an address that depends on the ‘x20’ register:

    ldp x8, x9, [x20]

While this generally fits the “unsafe” pattern, this particular addressing scheme is safe, as the runtime uses ‘x20’, which is part of GHC’s calling convention, to track STG stack pointers. There are other similar patterns as well: tag and arity checks, closure entry, CAF (constant applicative form) initialization, and so on. ppad-auditor encodes knowledge of these patterns, which can vary between LLVM and NCG backend-produced assembly, and filters them out of the flagged result set.

Static analysis of ppad-fixed’s assembly

Running ppad-auditor on the 51k-line LLVM-backend-produced assembly of ppad-fixed’s Montgomery ‘Curve’ module yields the following:

  $ cabal run auditor -- --scan-nct -i etc/Curve.s
  __blk_cFhR$def: 1
    48086: cond-branch: Cbz X8 "LBB1650_2"
  _cBhg_info$def: 1
    46987: cond-branch: BCond "ne" "LBB1617_2"
  _cF0U_info$def: 1
    47226: cond-branch: Tbnz W8 0 "LBB1625_2"
  _crHv_info$def: 3
    3944: cond-branch: BCond "ne" "LBB86_4"
    3950: cond-branch: BCond "ne" "LBB86_4"
    3956: cond-branch: BCond "ne" "LBB86_4"

So, six flagged instructions in total. Each of these can then be analysed with reference to the source code itself; Opus 4.5 identifies them as follows (Opus’s analysis):

  1. _crHv_info$def (3 findings at lines 3944, 3950, 3956)

  b.ne LBB86_4           ; branch if not equal
  ; ... repeated for remaining limbs

  This is eq_vartime - comparing two 256-bit Montgomery values
  limb-by-limb. Each b.ne exits early if any limb differs (short-circuit
  equality), which is the variable-time behavior.

  2. _cBhg_info$def (line 46987)

  b.ne LBB1617_2         ; branch based on equality check result

  This is the verification step within sqrt# - comparing r² == input
  via XOR/OR of all 4 limbs to determine if the computed square root is
  valid. This comparison produces the Choice value that sqrt_vartime
  later uses.

  3. _cF0U_info$def (line 47226)

  tbnz w8, #0, LBB1625_2 ; test bit 0, branch if set

  This is odd_vartime - checking the least significant bit and
  returning True/False. The tbnz (test bit and branch if non-zero) is the
  variable-time operation.

  4. __blk_cFhR$def (line 48086)

  cbz x8, LBB1650_2      ; branch if Choice == 0 (sqrt failed)

  This is part of sqrt_vartime - checking the C.Choice success flag
  returned by sqrt#. The function returns (# Limb4, C.Choice #) where
  the Choice indicates whether a square root exists. The cbz branches on
  this unboxed discriminant.

We can check this by calling ppad-auditor with the specific symbols corresponding to 'eq_vartime', 'sqrt_vartime', and so on, to see all instructions reachable from the symbol root. For example:

  $ eqvartime='_ppadzmfixedzm0zi1zi3zminplace_NumericziMontgomeryziSecp256k1ziCurve_eqzuvartime_info$def'
  $ cabal run auditor -- -i etc/Curve.s --symbol $eqvartime --scan-nct
  _crHv_info$def: 3
    3944: cond-branch: BCond "ne" "LBB86_4"
    3950: cond-branch: BCond "ne" "LBB86_4"
    3956: cond-branch: BCond "ne" "LBB86_4"

  Root symbol:       _ppadzmfixedzm0zi1zi3zminplace_NumericziMontgomeryziSecp256k1ziCurve_eqzuvartime_info$def
  Reachable symbols: 3
  With findings:     1
  NCT findings:      3

This confirms Opus’s diagnosis; '_crHv_info$def' is the only symbol with flagged instructions that is reachable from the one corresponding to 'eq_vartime'.

The NCG-produced assembly is more human-readable; ppad-auditor produces the following results from it:

  $ cabal run auditor -- --scan-nct -i etc/CurveNCG.s
  LnpoA: 1
    36064: cond-branch: Cbnz X4 "LcpnU"
  _Numeric.Montgomery.Secp256k1.Curve.$wsqrt_vartime_info: 1
    34883: cond-branch: BCond "ne" "Lcp81"
  _Numeric.Montgomery.Secp256k1.Curve.eq_vartime_info: 3
    6065: cond-branch: BCond "ne" "Lcmdy"
    6070: cond-branch: BCond "ne" "Lcmdy"
    6075: cond-branch: BCond "ne" "Lcmdy"
  _Numeric.Montgomery.Secp256k1.Curve.odd_vartime_info: 1
    35076: cond-branch: Cbnz X17 "LcphH"

Three of the four symbols identified as using non-constant-time instructions are clearly the various vartime functions. The last, ‘LnpoA’, can be traced to the ‘case n of..’ conditional branch in this exponentiation loop:

  exp# b e =
    let !o = L4 0x1000003D1## 0## 0## 0##
        loop !r !m !ex n = case n of
          0 -> r
          _ ->
            let !(# ne, bit #) = WW.shr1_c# ex
                !candidate = mul# r m
                !nr = select# r candidate bit
                !nm = sqr# m
            in  loop nr nm ne (n - 1)
    in  loop o b e (256 :: Word)

This doesn’t appear in the LLVM backend-produced assembly; LLVM unrolled the loop here, eliminating it entirely.

Static analysis of ppad-secp256k1’s assembly

ppad-secp256k1’s NCG-produced assembly is several times larger than that of the Montgomery ‘Curve’ module from ppad-fixed. It’s about 215k lines, and contains many more intentionally variable-time patterns, mostly around parsing functionality and bytestring manipulation. ppad-auditor flagged 212 instructions on its run, highlighting a lot of stuff like this:

  _$w$cshowsPrec1_info: 1
    5941: cond-branch: BCond "lt" "Lcta2"
  _$w$cshowsPrec_info: 1
    6734: cond-branch: BCond "lt" "LctfP"
  _$w_precompute_info: 3
    88941: cond-branch: Cbnz X23 "LcG8V"
    88988: cond-branch: BCond "lt" "LcG91"
    89001: div: Sdiv X15 X15 X23
  _$wparse_int256_info: 1
    9538: cond-branch: BCond "ne" "LctNL"
  _$wparse_point_info: 13
    168044: cond-branch: BCond "lt" "LuSWq"
    168048: cond-branch: BCond "lt" "LcSEh"
    168052: cond-branch: BCond "ne" "LcSEf"
    168058: cond-branch: BCond "ne" "LcSEf"
    168508: cond-branch: Cbnz X7 "LcSCK"
    168811: cond-branch: Cbnz X17 "LuSWt"
    168818: cond-branch: BCond "ne" "LcSO4"
    168842: cond-branch: BCond "ne" "LuSWx"
    168860: cond-branch: BCond "lt" "LcSEf"
    168874: cond-branch: BCond "eq" "LsoYe"
    168878: cond-branch: BCond "ne" "LcSEf"
    169743: cond-branch: Cbnz X12 "LcSUv"
    170268: cond-branch: Cbnz X17 "LcSUg"
  _$wparse_sig_info: 1
    9343: cond-branch: BCond "ne" "LctM2"

This is all expected to be variable time, so there is no issue here. But it also flags e.g. the following:

  _$w_sign_ecdsa_no_hash'_slow: 4
  71116: cond-branch: BCond "ne" "Lsif5"
  71120: cond-branch: BCond "ne" "Lsif5"
  71124: cond-branch: BCond "ne" "Lsif5"
  71128: cond-branch: BCond "ne" "Lsif5"
  _$wecdh_info: 2
  176128: cond-branch: BCond "eq" "LcU3k"
  176161: cond-branch: BCond "eq" "LcU7e"
  _$wmul_info: 1
  101961: cond-branch: BCond "ne" "LcJgl"
  _$wmul_wnaf_info: 6
  34495: cond-branch: BCond "lt" "Lcx6P"
  34500: cond-branch: BCond "lt" "Lcx6P"
  34502: cond-branch: Cbnz X23 "Lcx7c"
  34537: div: Sdiv X17 X17 X23
  34724: cond-branch: BCond "ne" "Lcx8J"
  34954: cond-branch: BCond "ne" "Lcy90"
  _$wsign_ecdsa'_slow: 4
  48319: cond-branch: BCond "ne" "LsgzH"
  48323: cond-branch: BCond "ne" "LsgzH"
  48327: cond-branch: BCond "ne" "LsgzH"
  48331: cond-branch: BCond "ne" "LsgzH"
  _$wsign_schnorr'_info: 8
  206710: cond-branch: Cbnz X17 "LcZcf"
  206730: cond-branch: Cbnz X15 "LcZcc"
  206900: cond-branch: BCond "gt" "LcZby"
  206906: cond-branch: BCond "ge" "LcYWj"
  206958: cond-branch: BCond "lt" "LcZbs"
  207680: cond-branch: Cbnz X8 "LcZbh"
  207687: cond-branch: Cbnz X13 "LcZb5"
  207699: cond-branch: Cbnz X14 "LcZaT"

All these symbols are things that are supposed to run in constant time, so the flagged instructions here warrant examination. They ultimately correspond only to intentionally-variable-time code paths, for example of negligible-probability branches (e.g. in RFC6979 nonce generation) or those involving invalid inputs (e.g. handling a “secret key” that’s outside the secp256k1 scalar group).

The LLVM backend-generated assembly is similar in size, about 225k lines; ppad-auditor flagged 212 instructions on its pass over that. The patterns are identical, but are harder to analyze at a glance, as this assembly contains much more indirection via tiny continuations with unrecognizable names. Again, we can zoom in on a particular symbol of interest, e.g.:

$ schnorr='_ppadzmsecp256k1zm0zi5zi4zminplace_CryptoziCurveziSecp256k1_signzuschnorrzq_info$def'
$ cabal run auditor -- -i etc/Secp256k1_HEAD.s --scan-nct --symbol $schnorr
_cEb9_info$def: 1
  32646: cond-branch: BCond "eq" "LBB858_2"
_cEdM_info$def: 1
  32897: cond-branch: BCond "ne" "LBB861_4"
_ppadzmsecp256k1zm0zi5zi4zminplace_CryptoziCurveziSecp256k1_zdwmulzuwnaf_info$def: 4
  32498: cond-branch: Tbnz X23 63 "LBB857_7"
  32502: cond-branch: BCond "mi" "LBB857_7"
  32504: cond-branch: Cbz X23 "LBB857_8"
  32520: div: Sdiv X10 X10 X23

Root symbol:       _ppadzmsecp256k1zm0zi5zi4zminplace_CryptoziCurveziSecp256k1_signzuschnorrzq_info$def
Reachable symbols: 9
With findings:     3
NCT findings:      6

Examining the flagged instructions, '_cEb9_info$def' can be traced to the variable-time equality check on a negligible-probability outcome, '_cEdM_info$def' the fixed-size loop termination check in wNAF scalar multiplication, and the remaining instructions under 'mul_wnaf' itself are all checks, or a division, on the public wNAF window width parameter. So, all good, as expected.

Conclusion

This sort of analysis is obviously dependent on particular source and tool versions. The above was performed using GHC 9.10.3 and LLVM 19.1.7, both from Nixpkgs, using the same optimization flags as were given in the ppad-{fixed,secp256k1} security analyses, for particular commits of the respective libraries (v0.1.3 for ppad-fixed; the ppad-secp256k1 source is slightly ahead of v0.5.4).

The beauty of this tool is that it is fast and reusable, making more rigorous constant-time analysis of assembly artifacts much simpler going forward. I’ve also done some work on implementing a so-called “taint analysis” mode, by which one can prove that manually-seeded secrets do not flow into (i.e., taint) unexpected registers. This would be a very valuable addition.

None of this stuff is necessarily specific to Haskell or GHC targets; the same techniques, adapted to whichever runtime or calling convention, can really work on any instruction set one is interested in.

If this is the sort of assurance you think you could profit from (perhaps you also like to sleep soundly at night), feel free to get in touch.

(/)

(/about)

(/articles)

(/email)

(/git)

(/docs)