Cryptopals, Set 1

Last updated: Feb 1, 2025.

This post contains my solutions to the cryptopals cryptography challenges. Most of the time I’ve optimized the Clojure code for readability and not performance. These challenges are a lot of fun and often rely on coming up with the right clever ideas, so I highly encourage you to try them yourself first if you want to avoid spoilers.

(ns set1
  (:require [clojure.data.codec.base64 :as b64]))

We’ll write a little utility function to verify answers:

(defn verify
  "Takes a boolean or sequence of booleans and returns emoji checkmarks or crosses"
  [input]
  (let [to-emoji #(if % "✅" "❌")]
    (if (sequential? input)
      (str/join " " (map to-emoji input))
      (to-emoji input))))

Challenge 1

Convert a hex string to its base64 equivalent.

The key thing to understand here are the following equivalencies:

$6$ hex chars = $3$ bytes = $24$ bits = $4$ base64 chars.

(defn hex->bytes
  "Convert a hex string to a byte array"
  [hex-string]
  (let [hex-pairs (re-seq #".{2}" hex-string)]
    (byte-array (map #(Integer/parseInt % 16) hex-pairs))))

(defn hex->base64
  [input]
  (-> input
      hex->bytes
      b64/encode
      String.))
      
(defn solve-ch1
  []
  (let [CH1-INPUT "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d"
        CH1-ANSWER "SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t"]

    (verify (= CH1-ANSWER
               (hex->base64 CH1-INPUT)))))

(solve-ch1) => ✅

Challenge 2

Write a function that takes two equal-length buffers and produces their XOR combination.

(defn bytes->hex
  "Convert a byte array to a hex string"
  [input]
  (apply str (map #(format "%02x" %1) input)))
  
(defn solve-ch2
  []
  (let [CH2-INPUT1 "1c0111001f010100061a024b53535009181c"
        CH2-INPUT2 "686974207468652062756c6c277320657965"
        CH2-ANSWER "746865206b696420646f6e277420706c6179"]

    (verify (= CH2-ANSWER
               (bytes->hex (map bit-xor
                                (hex->bytes CH2-INPUT1)
                                (hex->bytes CH2-INPUT2)))))))

(solve-ch2) => ✅

Challenge 3

The hex encoded string [input] has been XOR’d against a single character. Find the key, decrypt the message.

You can do this by hand. But don’t: write code to do it for you.

How? Devise some method for “scoring” a piece of English plaintext. Character frequency is a good metric. Evaluate each output and choose the one with the best score.

(def CH3-CIPHER
  (hex->bytes
   "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"))

At first I thought of using something like the KL divergence (see this post) to score the plaintext since I’ve always read that one of the ways to break simple ciphers is “frequency analysis”. But these challenges are about doing whatever it takes to break the crypto, not about following textbooks. So I tried to formalize what I’d do if I was breaking this “by hand”.

If you decrypt the input with different keys, most of the results don’t look anything like English. They look like gibberish. Solving this challenge is therefore just picking the key that results in the least-gibberish plaintext.

We’ll define the gibberish score of a piece of text as the ratio of “non-readable” ASCII bytes to the total number of bytes in the string. A readable ASCII byte is either in the range A-Z, or in the range a-z or a space. A non-readable byte is the negation of that.

(defn non-readable [b]
  (not (or (and (>= b (byte \A)) (<= b (byte \Z)))
           (and (>= b (byte \a)) (<= b (byte \z)))
           (= b (byte \space)))))

The gibberish score:

(defn gibberish-score
  [^bytes text]
  (/ (double (count (filter non-readable text)))
     (count text)))

(defn gibberish-score-str
  [text]
  (gibberish-score (byte-array (map byte text))))

(gibberish-score-str “ABCD?”) => 0.2 (gibberish-score-str “@&#$&@#”) => 1.0

Now we need a function that takes a list of possible keys and computes the gibberish score and the plaintext for each key.

(defn xor-trial-keys-scores
  [ciphertext trial-keys]
  
  (for [key trial-keys
        :let [plain (byte-array (map #(bit-xor key %) ciphertext))]]

    {:key (char key)
     :score (gibberish-score plain)
     :plain (String. plain)}))

Let’s try it on a dummy input:

(xor-trial-keys-scores (hex->bytes “1b3f89af9068”) [(byte \B)]) ({:key \B, :score 0.8333333333333334, :plain “Y}��*“})

To solve the challenge we pick the key with the smallest value of :score.

(defn solve-ch3
  [trial-keys]

  (let [CH3-CIPHER (hex->bytes "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736")]
    (apply min-key :score
         (xor-trial-keys-scores CH3-CIPHER trial-keys))))

What range of keys should we try? We can just try all possible byte values $0...255$.

(solve-ch3 (range 0 255)) {:key \X, :score 0.029411764705882353, :plain “Cooking MC’s like a pound of bacon”}

Challenge 4

One of the 60-character strings in this file has been encrypted by single-character XOR. Find it.

To solve this we just need to repeat the process from challenge 3 on every line from the file and hopefully only one of them will yield an English-looking plaintext.

First we write a helper function that returns the hex-encoded lines of a file as a sequence of byte arrays.

(defn read-hex-lines [file-path]
  (with-open [rdr (io/reader file-path)]
    (doall (map hex->bytes (line-seq rdr)))))

For each line in the file we’ll try the keys in the range $0...255$ and pick the key that yields the smallest gibberish score. We’ll then pick the line from the file whose score is the lowest.

(defn solve-ch4
  []
  (let [ciphers (read-hex-lines "src/data/4.txt")
      trial-keys (range 0 255)
      best-scores (map #(apply min-key :score
                               (xor-trial-keys-scores % trial-keys))
                       ciphers)]
  (apply min-key :score best-scores)))

(solve-ch4) {:key \5, :score 0.03333333333333333, :plain “Now that the party is jumping”}

Challenge 5

Here is the opening stanza of an important work of the English language [lyrics to “Ice Ice Baby”]

In repeating-key XOR, you’ll sequentially apply each byte of the key; the first byte of plaintext will be XOR’d against I, the next C, the next E, then I again for the 4th byte, and so on.

It should come out to: [output]

Encrypt a bunch of stuff using your repeating-key XOR function. Encrypt your mail. Encrypt your password file. Your .sig file. Get a feel for it. I promise, we aren’t wasting your time with this.

Implementing this is straightforward.

(defn repeating-key-xor [text key]
  (map bit-xor
       (map byte text)
       (cycle (map byte key))))

After encrypting a bunch of stuff as suggested, the only thing I notice is that some sequences of bytes seem to be repeated often.

(defn solve-ch5
  []
  (let [CH5-CIPHER "Burning 'em, if you ain't quick and nimble\nI go crazy when I hear a cymbal"
        CH5-ANSWER "0b3637272a2b2e63622c2e69692a23693a2a3c6324202d623d63343c2a26226324272765272a282b2f20430a652e2c652a3124333a653e2b2027630c692b20283165286326302e27282f"]

  (verify (= CH5-ANSWER
             (bytes->hex (repeating-key-xor CH5-CIPHER "ICE"))))))

(solve-ch5) “✅”

Challenge 6

There’s a file here. It’s been base64’d after being encrypted with repeating-key XOR. Decrypt it.

We’re going to follow the steps described on the challenge page.

The Hamming distance (edit distance) between two strings is the number of bits that are different. For two bytes, this is the number of 1 bits after XOR’ing them together. Summing this across all pairs of bytes gives us the edit distance.

(defn hamming-distance [s1 s2]
  (reduce + (map #(Integer/bitCount (bit-xor %1 %2)) s1 s2)))

  (hamming-distance (.getBytes "this is a test")
                    (.getBytes "wokka wokka!!!"))

TODO: explain better

(def TRIAL-KEYSIZES (range 2 40))

(def distances (into (sorted-map)
                     (map (fn [size]
                            (vector (/ (hamming-distance
                                        (take size CH6-CIPHER)
                                        (take size (drop size CH6-CIPHER)))
                                       (double size))
                                    size))
                          TRIAL-KEYSIZES)))

Transposing blocks

(defn block-transpose
  [keysize input]
  (for [i (range keysize)]
    (take-nth keysize (drop i input))))

(apply str (flatten (block-transpose 3 "abcdefghijklmnopqrstuvwxyz")))

“adgjmpsvybehknqtwzcfilorux”