Marco Peereboom

Dedup and Crypto sitting in a tree!


Who are you?

What is dedup?

Deduplication types

Better example

Better example continued

Why dedup?

We are bored to tears with all this storage stuff! What does this have to do with security?


What is crypto?

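# Save the first 54 bytes (the BMP header) unencrypted
dd if=marco.bmp of=marcohdr.bmp bs=1 count=54
# Extract the pixel data that follows the header
dd if=marco.bmp of=marco.raw bs=1 skip=54
# Encrypt the pixel data with AES-256-CBC (salted, base64-encoded output)
openssl aes-256-cbc -salt -a -e -in marco.raw -out marco.enc
# Append the encrypted data to the saved header
cat marco.enc >> marcohdr.bmp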

Why is crypto hard?

Why are crypto and dedup hard?

Dedup vs. Features

Dedup/Feature      Source    Post-process    In-line
Bandwidth          Low       High            High
Privacy            High      Low             Low
Storage            Medium    Low             Medium
Client Resources   Medium    Low             Low
Server Resources   Low       High            Medium

Design goals





Low bandwidth


Putting it all together

Ok, now what??

Safely Generating Collisions


Retrieving Chunks

1000 words

Hash Collisions

* See section 3.1 of

Vroom Vroom!

Other aspects


Secrets

  1. Create a secrets passphrase key (Ks) with PBKDF2 using a random 1024-bit salt (S), a user supplied secrets passphrase (Ps), and the round count (N) - Ks = PBKDF2(Ps, S, N)
  2. Create three random 256-bit keys using arc4random which are used as a mask key (Km), chunk key (K1), and tweak key (K2)
  3. Encrypt the chunk key (K1) and tweak key (K2) using AES-ECB-256 and the mask key (Km) - e_aeskey = AES-ECB-256(K1); e_ivkey = AES-ECB-256(K2)
  4. Encrypt mask key (Km) using AES-ECB-256 and the secrets passphrase key (Ks) obtained in step 1 - e_maskkey = AES-ECB-256(Km)
  5. HMAC_SHA256 the unencrypted mask key (Km) obtained in step 2 and store the resulting digest - hmac_maskkey = HMAC_SHA256(Km)
  6. HMAC_SHA512 all the inputs and store that digest for integrity reasons - digest = HMAC_SHA512(inputs)
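
The six steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual implementation. The slides do not say which hash PBKDF2 uses, which key the two HMACs use, or how the inputs are laid out, so the sketch assumes PBKDF2-HMAC-SHA256, keys both HMACs with Ks, and concatenates the fields in the order they are created. Python's standard library has no AES, so `xor_stand_in` is a hypothetical placeholder for AES-ECB-256 and is not secure. `create_secrets` is also a made-up name, and `secrets.token_bytes` stands in for arc4random.

```python
import hashlib
import hmac
import secrets


def xor_stand_in(key: bytes, data: bytes) -> bytes:
    """Hypothetical placeholder for AES-ECB-256. XOR is NOT secure."""
    return bytes(d ^ k for d, k in zip(data, key))


def create_secrets(passphrase: bytes, rounds: int = 256000,
                   encrypt=xor_stand_in) -> dict:
    """Build the stored secrets fields from a passphrase (Ps)."""
    # Step 1: Ks = PBKDF2(Ps, S, N) with a random 1024-bit salt.
    # Assumption: HMAC-SHA256 as the PBKDF2 PRF and a 256-bit output.
    salt = secrets.token_bytes(128)
    ks = hashlib.pbkdf2_hmac("sha256", passphrase, salt, rounds, dklen=32)

    # Step 2: three random 256-bit keys: mask (Km), chunk (K1), tweak (K2).
    km, k1, k2 = (secrets.token_bytes(32) for _ in range(3))

    record = {
        "rounds": rounds.to_bytes(4, "big"),
        "salt": salt,
        # Step 3: K1 and K2 are encrypted under Km.
        "e_aeskey": encrypt(km, k1),
        "e_ivkey": encrypt(km, k2),
        # Step 4: Km is encrypted under Ks.
        "e_maskkey": encrypt(ks, km),
        # Step 5: HMAC_SHA256 of the unencrypted Km.
        # Assumption: the HMAC is keyed with Ks.
        "hmac_maskkey": hmac.new(ks, km, hashlib.sha256).digest(),
    }

    # Step 6: HMAC_SHA512 over all stored fields.
    # Assumption: keyed with Ks, fields joined in insertion order.
    stored = b"".join(record.values())
    record["digest"] = hmac.new(ks, stored, hashlib.sha512).digest()
    return record
```

Only the passphrase-derived key Ks ever touches Km, and only Km touches K1 and K2, so changing the passphrase means re-encrypting one 256-bit key rather than any stored chunk.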

Secrets continued

1. Validate the integrity of the secrets by taking the HMAC_SHA512 of all stored encrypted inputs and comparing to stored digest

2. Derive the secrets passphrase key (Ks) using the salt (S), user supplied secrets passphrase (Ps), and round count (N)

3. Decrypt the encrypted mask key (e_maskkey) using the secrets passphrase key (Ks) to obtain the mask key (Km)

4. HMAC_SHA256 the decrypted mask key (Km) - hmac_maskkey = HMAC_SHA256(Km)

5. Compare the digest obtained in step 4 (hmac_maskkey) to the stored digest and, if they differ, abort the secrets file decryption process

6. Decrypt the encrypted tweak key (e_ivkey) using the mask key (Km) obtained in step 3 to obtain the tweak key (K2)

7. Decrypt the encrypted chunk key (e_aeskey) using the mask key (Km) obtained in step 3 to obtain the chunk key (K1)
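
The unlock steps above can be sketched the same way. This is an illustration under assumptions, not the actual implementation: `unlock_secrets`, `FIELDS`, and `xor_stand_in` are hypothetical names, PBKDF2 is assumed to use HMAC-SHA256, and XOR is an insecure placeholder for AES-ECB-256 decryption because the Python standard library has no AES. The slides do not name the HMAC keys, so the sketch keys both with Ks. That assumption forces one deviation from the listed order: Ks (step 2) is derived before the integrity check (step 1).

```python
import hashlib
import hmac

# Assumption: the digest covers these fields, joined in this order.
FIELDS = ("rounds", "salt", "e_aeskey", "e_ivkey", "e_maskkey", "hmac_maskkey")


def xor_stand_in(key: bytes, data: bytes) -> bytes:
    """Hypothetical placeholder for AES-ECB-256. XOR is NOT secure."""
    return bytes(d ^ k for d, k in zip(data, key))


def unlock_secrets(record: dict, passphrase: bytes, decrypt=xor_stand_in):
    """Return (K1, K2), or raise ValueError if any check fails."""
    # Step 2: derive Ks from the salt, passphrase, and round count.
    rounds = int.from_bytes(record["rounds"], "big")
    ks = hashlib.pbkdf2_hmac("sha256", passphrase, record["salt"],
                             rounds, dklen=32)

    # Step 1: HMAC_SHA512 over the stored fields must match the digest.
    stored = b"".join(record[name] for name in FIELDS)
    expected = hmac.new(ks, stored, hashlib.sha512).digest()
    if not hmac.compare_digest(expected, record["digest"]):
        raise ValueError("secrets digest mismatch")

    # Step 3: decrypt the mask key with Ks.
    km = decrypt(ks, record["e_maskkey"])

    # Steps 4 and 5: recompute the mask key HMAC and abort on mismatch.
    check = hmac.new(ks, km, hashlib.sha256).digest()
    if not hmac.compare_digest(check, record["hmac_maskkey"]):
        raise ValueError("mask key HMAC mismatch")

    # Steps 6 and 7: decrypt the tweak key (K2) and chunk key (K1) with Km.
    k2 = decrypt(km, record["e_ivkey"])
    k1 = decrypt(km, record["e_aeskey"])
    return k1, k2
```

`hmac.compare_digest` is used for both comparisons so that the time taken does not depend on how many leading bytes match.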

Secrets visual

version = 00000001
rounds = 0003e800
salt = 2e4cb35773172e8222dbf5103f4822e9c82f1ee5cca34cbf4b08aec0b909cc618152fd68b1406f1684d59b0bb4efda2d2136750b67697a7d1bc898d073e614180fb44727ce83dcacd8d68f2a8d663180d59df2333fca97616164e6943447975bca4cce154ae3b5dd9b954c3bd3a7c795301ec7489ee35da0c38675aaedadd7e4
e_aeskey = 00bbd6e857a12f9a11cb5825ea420b45afd1ab92f409404068abddd5666853a786ebd7e91ff604c7ac69ad695a77a1fb
e_ivkey = 7650f458f6b1bd465ce6693857b4180cd3e76281afafca3c2db1e7eed3173d4086ebd7e91ff604c7ac69ad695a77a1fb
e_maskkey = d8857b3ae9b439fd9d93e426cf28f887b9a4da871b279817caad5a6de55421aa99f568d6efcdb3b1c57ee4399c0b5607
hmac_maskkey = bed32dc7ab984e94244388a0c5b2ad48ab427133a6a0f0cd08d5545d1ba8f48f
digest = dae03eebf7810885c3779c864f95aa79650470eb8d3b4048178193c3f578e107d4dddebf464a58f21eb073a05249e0d414d78f11b19868536f4449a970cc9492
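
A short Python sketch can decode a few fields of the dump above. The hex values are copied from the dump (salt and digest are left out for brevity); reading `rounds` as a big-endian integer is an assumption, as are the variable names.

```python
# Hex values copied from the dump above; salt and digest omitted for brevity.
dump = {
    "version": "00000001",
    "rounds": "0003e800",
    "e_aeskey": "00bbd6e857a12f9a11cb5825ea420b45"
                "afd1ab92f409404068abddd5666853a7"
                "86ebd7e91ff604c7ac69ad695a77a1fb",
    "e_ivkey": "7650f458f6b1bd465ce6693857b4180c"
               "d3e76281afafca3c2db1e7eed3173d40"
               "86ebd7e91ff604c7ac69ad695a77a1fb",
    "e_maskkey": "d8857b3ae9b439fd9d93e426cf28f887"
                 "b9a4da871b279817caad5a6de55421aa"
                 "99f568d6efcdb3b1c57ee4399c0b5607",
    "hmac_maskkey": "bed32dc7ab984e94244388a0c5b2ad48"
                    "ab427133a6a0f0cd08d5545d1ba8f48f",
}

fields = {name: bytes.fromhex(value) for name, value in dump.items()}

# Assumption: the round count is stored big-endian.
rounds = int.from_bytes(fields["rounds"], "big")

# Size of each field in bytes.
sizes = {name: len(raw) for name, raw in fields.items()}

# Compare the final 16-byte AES block of the two keys encrypted under Km.
shared_tail = fields["e_aeskey"][-16:] == fields["e_ivkey"][-16:]

print(rounds)
print(sizes)
print(shared_tail)
```

Under that reading the round count is 256000, and each encrypted key is 48 bytes: a 256-bit key plus one extra 16-byte block. The last block of e_aeskey and e_ivkey is identical, which is consistent with ECB mode encrypting the same padding block under the same mask key, while e_maskkey, encrypted under a different key, ends differently. That interpretation is an inference from the dump, not something the slides state.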



Where are you at?