Security Analysis, ppad-sha{256,512}.

Abstract

A formalization and analysis of the security model established by the ppad-sha256 and ppad-sha512 libraries.

Overview

ppad-sha256 and ppad-sha512 implement SHA-{256,512} and HMAC-SHA{256,512}, as specified by RFC 6234 and RFC 2104, for both strict and lazy bytestring. The implementations for strict bytestrings use primitive machine instructions from the ARM cryptographic extensions by default, if they’re available, and fall back to highly-optimised pure Haskell implementations otherwise; the variants for lazy bytestrings use only the pure implementations.

Threat Model

This analysis assumes an adversary capable of invoking library functions with chosen inputs an arbitrary number of times, and that is essentially it.

Security Property

The essential security property here is that the SHA2 and HMAC-SHA2 implementations are correct, as the only conceivable attack of interest would be for an attacker to somehow produce a desired incorrect digest or message authentication code (MAC) from a chosen input, based on a bug or flaw in the library itself.

Timing, allocation, or cache-related side-channels are not of concern when producing hash digests, as hash functions are intended to operate on variable-length inputs, which are also public and known to the attacker, and the SHA2 functions consist solely of atomic data-independent-time operations. For producing MACs, things are somewhat more subtle, in that the text is to be considered public, but the key is not. Variable-time execution cannot leak meaningful key material, however, as the key passes through several exclusive-or and SHA2 functions, each of which runs in constant time.

We also assume the security properties of the overall algorithms as given. So while SHA-{256,512} itself may be subject to length extension attacks, for example, we are only concerned with demonstrating that the implementation of SHA-{256,512} found in the libraries is correct.

The implementations should thus:

Analysis

The analysis performed here is specifically for ppad-sha256 v0.3.1 and ppad-sha512 v0.2.1. Analysis is restricted to the source code level, as seems sufficient for demonstration of the correctness properties.

Memory Safety

ppad-sha256 and ppad-sha512 use a number of internal and unsafe operations for performance, e.g. bytestring’s ‘unsafeDrop’, ‘unsafeIndex’, ‘unsafeCreate’, and ‘unsafeForeignPtr’ functions, the ‘unsafeShiftR’ function from Data.Bits, and so on.

These are all used in simple cases for which safety can be deduced by pure source analysis. Previous versions of the libraries included explicit comment-level proofs of safety, e.g.:

(ppad-sha256, v0.2.4)

  hash :: BS.ByteString -> BS.ByteString
  hash bs = cat (go iv (pad bs)) where
    -- proof that 'go' always terminates safely:
    --
    -- let b = pad bs
    -- then length(b) = n * 512 bits for some n >= 0                (1)
    go :: Registers -> BS.ByteString -> Registers
    go !acc b
      -- if n == 0, then 'go' terminates safely                     (2)
      | BS.null b = acc
      -- if n > 0, then
      --
      -- let (c, r) = unsafe_splitAt 64 b
      -- then length(c) == 512 bits                                 by (1)
      --      length(r) == m * 512 bits for some m >= 0             by (1)
      --
      -- note 'unsafe_hash_alg' terminates safely for bytestring    (3)
      -- input of exactly 512 bits in length
      --
      -- length(c) == 512
      --   => 'unsafe_hash_alg' terminates safely                   by (3)
      --   => 'go' terminates safely                                (4)
      -- length(r) == m * 512 bits for m >= 0
      --   => next invocation of 'go' terminates safely             by (2), (4)
      --
      -- then by induction, 'go' always terminates safely           (QED)
      | otherwise = case unsafe_splitAt 64 b of
          SSPair c r -> go (unsafe_hash_alg acc c) r

This works, but requires manual proof verification, which is tedious and error prone, especially when the code changes.

Deductive, logical proofs of safety can be replaced by inferential arguments in order to better-automate this sort of thing. A robust mechanism for automatically producing arguments of this sort is property or fuzz testing, which attempts to refute specified safety properties by testing library functions on thousands of random inputs. Such QuickCheck-based property tests were added to each of ppad-sha256 and ppad-sha512 in the latest versions, testing that, for random inputs:

Outputs are of the form (from the ppad-sha512 suite):

  properties
    hash == hash_lazy:            OK
      +++ OK, passed 1000 tests.
    hmac == hmac_lazy:            OK
      +++ OK, passed 1000 tests.
    hash output is 64 bytes:      OK
      +++ OK, passed 1000 tests.
    hmac output is 64 bytes:      OK
      +++ OK, passed 1000 tests.
    hash_lazy chunking-invariant: OK
      +++ OK, passed 1000 tests.
    hmac_lazy chunking-invariant: OK
      +++ OK, passed 1000 tests.

Haskell’s foreign function interface (FFI) is also used to access native ARM instructions for SHA2 computation if they’re available at runtime. This involves using C shims for the SHA2 block functions that use ARM NEON intrinsics and importing them via FFI, like so:

  foreign import ccall unsafe "sha256_block_arm"
    c_sha256_block :: Ptr Word32 -> Ptr Word8 -> IO ()

  foreign import ccall unsafe "sha256_arm_available"
    c_sha256_arm_available :: IO Int

and then wrapping them in ‘unsafePerformIO’ calls, which is safe, as the imported functions are pure.

By construction, the property tests also assert safety of the FFI code when run on an aarch64 machine supporting the +sha2 and +sha3 cryptographic extensions, but the C code itself can also be productively analysed by ASan (i.e. ‘AddressSanitizer’) and UBSan (‘UndefinedBehaviorSanitizer’), which check for buffer overflows, pointer arithmetic errors, and undefined behaviour in usage of the ARM intrinsics at runtime.

Invoking the libraries’ test suites with the ‘sanitize’ flag runs them with ASan and UBSan enabled, and they pass as expected. Introducing an intentional error in a C shim, for example, produces:

  cbits/sha256_arm.c:37:29: runtime error: index 64 out of bounds for type 'const uint32_t[64]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior cbits/sha256_arm.c:37:29

which helps assure that the sanitizers are running properly and can catch the desired issues, should any be present.

Correctness

In the same manner that property tests are used to assert safety of the implementations, unit tests on standard, high-quality vectors are used to assert their correctness.

The SHA-{256,512} functions are validated against various vectors derived from the NIST SHAVS (retrieved from here), and the HMAC-SHA{256,512} functions are validated against both RFC 4231 vectors and Project Wycheproof vectors (198 in total).

Supplemental Note on Using Produced MACs

The production of message authentication codes by the HMAC-SHA{256,512} functions is not subject to concerns around timing-related side channel attacks, and so we’re not concerned about verifying those sorts of properties here. However, the use of MACs must be performed carefully, as comparing MACs in variable-time can leak information about secrets.

As a sort of safeguard for downstream users, the HMAC-SHA{256,512} functions return message authentication codes in a special type ‘MAC’, which is a simple newtype wrapper over a strict ByteString. This MAC type is supplied with a constant-time ‘Eq’ instance, such that MACs produced by the functions are, by default, comparable in constant time.

Conclusion

By our appraisal, ppad-sha256 and ppad-sha512 correctly implement their respective SHA2 and HMAC-SHA2 functions, per the corresponding specifications.