Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Info

Title of the paper: Building Secure SRAM PUF Key Generators on Resource Constrained Devices

Available at: https://ieeexplore.ieee.org/abstract/document/8730781

...

A securely maintained key is the premise upon which data stored and transmitted by ubiquitously deployed resource limited devices, such as those in the Internet of Things (IoT), are protected. However, many of these devices lack a secure non-volatile memory (NVM) for storing keys because of cost constraints. Silicon physical unclonable functions (PUFs) offering unique device specific secrets to electronic commodities are a low-cost alternative to secure NVM. As a physical hardware security primitive, reliability of a PUF is affected by thermal noise and changes in environmental conditions; consequently, PUF responses cannot be directly employed as cryptographic keys. A fuzzy extractor can turn noisy PUF responses into usable cryptographic keys. However, a fuzzy extractor is not immediately mountable on (highly) resource constrained devices due to its implementation overhead. We present a methodology for constructing a lightweight and secure PUF key generator for resource limited devices. In particular, we focus on PUFs constructed from pervasively embedded SRAM in modern microcontroller units and use a batteryless computational radio frequency identification (CRFID) device as a representative resource constrained IoT device in a case study.

Introduction

  • It has been shown that securing digital key storage is non-trivial in practice due to e.g., technical or cost limitations [1] → Key derivation with physical unclonable functions

  • PUFs can replace the functionality of the NVM based keys while increasing the security level of cryptographic key storage using standard CMOS technology based hardware security primitives. Overall, silicon PUFs offer:

    1. low fabrication costs [1]

    2. uniqueness (an inseparable fingerprint of a hardware instance)

    3. enhanced resistance to attacks, especially, invasive attacks [4]–[6]

  • a PUF based key generator utilizes a fuzzy extractor to turn unstable responses into a stable and uniformly distributed cryptographic key.

...

  • During key enrollment phase,helper data p is computed by using Gen(r), where p = r x HT and H is a
    parity check matrix of a linear error correction code. Key reconstruction described by Rep(r',p)—where r' is the reproduced response that may be slightly different from the enrolled response r—first constructs a syndrome s = (r' x HT) ⊕ p = e ⊕ HT, with e an error vector. Then through an error location algorithm, e is determined. Subsequently, the response r is recovered through r = e ⊕ r'.

  • The recovered PUF response r may not be uniformly distributed. Therefore, an entropy extraction step, using a universal hash function for example, is employed to generate a cryptographic key with full bit entropy.

  • → Although the entropy extractor realized with a hash function can be lightweight, the overhead of the secure sketch is comparatively more dominant.

Panel
panelIconId26a0
panelIcon:warning:
panelIconText⚠️
bgColor#E6FCFF

In a fuzzy extractor (FE) setting, the Gen() function is performed by the server during the enrollment phase to compute helper data while the Rep() function is implemented on the field deployed token.

By recognizing that the computational burden of the Rep() function is significantly more than that of Gen() function, Van Herrewege et al. [19] placed RepGen() on the resource-constraint token while leaving the computationally heavy GenRep() function execution to the resource-rich server; this arrangement is termed the reverse fuzzy extractor (RFE).

→ The authors will focus on the correct recovery in the context of resource limited devices and a RFE where the Gen() function is implemented on the resource limited token in the next sections.

Gen() function

Rep() function → heavier

fuzzy extractor (FE)

on server during enrollment

on resource-constrained device

reverse fuzzy extractor (RFE)

on resource-constrained device

on server

Reducing Reverse Fuzzy Extractor Overhead

...

  • majority voting: given q repeated response measurements under an operating condition, the response value (‘1’/‘0’) enrolled is that which shows no less than floor(q/2) measurements; with q an odd integer.

  • reliable response pre-selection: each PUF response under an operating condition is repeatedly measured i times, only response bits that exhibit 100% reliable regeneration (all ‘1’s/‘0’s) are selected for use and enrollment.

...

We can observe that when the difference between the reference operating condition at enrollment increases with respect to the condition during a regenerated response, the the expected BER increases.

(check the article for more details)

Although the server is unable to control the operating condition under which the response r' is regenerated, we can observe that the server now has the capability of choosing one of multiple reference responses to ensure that the chosen reference response is close to the operating condition under which the r' is regenerated.

...

For a single BCH(n, k, t) block, the complexity of finding the k-bit extracted key from the n-bit response r' is 2k. It is not common to use a single large BCH(n, k, t) block; typically a large block is split into small processing blocks to reduce implementation complexity. For k bits of key material, response r' can be divided into multiple non-overlapping blocks of a BCH(n1, k1, t1) code where n1 < n and k1 < k for a parallel implementation. Now the complexity of finding the k bit secret is 2k1L

(check the article for more details)

Experimental Validations

SRAM PUF Evaluation

...

  • Based on the hash function evaluations in [22], we selected BLAKE2s-128 as it showed the best performance, in term of clock cycles.

  • As for BCH(n1; k1; t1) code, it is imperative to use a small n1 and t1.

    • First, the BCH(n1; k1; t1) code overhead decreases sharply when the codeword length n1 decreases.

    • Second, a small t1 implies a correspondingly large k1. With a large k1, to achieve the security level of k-bit secret key sk, the needed number of BCH(n1; k1; t1) blocks—L = ciel(k/k1)—will reduce, thus, reducing the overall overhead to gain k-bit entropy for the secret key sk.

...

  1. Fuzzy Extractor with Single Reference Response:
    To achieve Pfail < 10-6, nine BCH(127,15,27) blocks are required giving Pfail = 1.66 x 10-7 when the single reference response under 25C is used. One BCH(127,15,27) block decoding consumes 2,102,222 clock cycles. The fuzzy extractor needs to sequentially execute BCH(127,15,27) decoding blocks 9 times and two BLACKE2s-128 hash operations.In total, it consumes up to 19,127,446 clock cycles.

  2. Reverse Fuzzy Extractor with Single Reference Response:
    To achieve Pfail < 10-6, nine BCH(127,15,27) blocks are required giving Pfail = 1.66 x 10-7 when the single reference response under 25C is used. Notably, the token only needs to perform the encoding operation requiring significantly less clock cycles in comparison with decoding. One BCH(127,15,27) block encoding consumes 111,335 clock cycles. This reverse fuzzy extractor needs to sequentially execute BCH(127,15,27) encoding 9 times and two BLACKE2s-128 hash operations. In total, it consumes up to 1,211,461 clock cycles.

  3. Reverse Fuzzy Extractor with Three Reference Response:
    To achieve Pfail < 10-6, eight smaller BCH(63,16,11) blocks are adequate to meet the requirement with Pfail = 9.85 x 10-7. Here, 3MRR under -15C, 25C, 80C are used. One BCH(63,16,11) block encoding consumes 51,003 clock cycles. This reverse fuzzy extractor needs to sequentially execute BCH(63,16,11) encoding 8 times and two BLACKE2s-128 hash operations. In total, it consumes up to 617,470 clock cycles.

The reverse fuzzy extractor overhead is only 6.3% of the fuzzy extractor when a single reference response is deployed to achieve the same key failure rate performance. When the MRR is adopted by the reverse fuzzy extractor, the clock cycle overhead is further reduced by 49% in comparison with a single reference response.

Security Discussion

(Important section: check the article for more details)

The bias of our tested SRAM PUF is 49.87%. Therefore, the extra entropy loss is very small.

...