Fixed-length MACs

cr0mll · Sep 5, 2023 · 1ff1dca · 1ff1dca
1 parent 107cc72
commit 1ff1dca

Showing 229 changed files with 1,146 additions and 281 deletions.
diff --git a/Notes/Cryptography/Private-Key Cryptography/Message Authentication Codes (MACs).md b/Notes/Cryptography/Private-Key Cryptography/Message Authentication Codes (MACs).md
@@ -40,18 +40,32 @@ Just how the two communicating parties exchange a particular secret key without
 It is now time to describe what it means for a MAC system to be secure. As it turns out, the most pertinent threat model for MACs is a [chosen-message attack](index.md). The adversary has access to some messages and their corresponding tags and they are even free to choose the messages to be signed. The adversary's goal is to then find an entirely new *valid* message-tag pair without any knowledge of the secret key.
 
 ```admonish danger title="Definition: CMA-Security for Message Authentication Codes"
-A MAC system $(\textit{Sign}, \textit{Verify})$ is *CMA-secure* if for any set of message-tag pairs $(m_1, \tau_1), (m_2,\tau_2), ..., (m_q, \tau_q)$ that were signed with the same key $k \leftarrow_R \mathcal{K}$ and for every efficient adversary $\textit{Eve}$ which has access to those message-tag pairs, the probability that $\textit{Eve}$ can produce a new valid message-tag pair $(m, \tau)$, called an *existential forgery*, is negligble, i.e.
+A MAC system $(\textit{Sign}, \textit{Verify})$ is *CMA-secure* if for every efficient adversary $\textit{Eve}$ and any set of message-tag pairs $(m_1, \tau_1), (m_2,\tau_2), ..., (m_q, \tau_q)$ whose messages were selected by $\textit{Eve}$ and were signed with the same key $k \leftarrow_R \{0,1\}^n$ to obtain their corresponding tags, the probability that $\textit{Eve}$ can produce a new valid message-tag pair $(m, \tau)$, called an *existential forgery*, when given $(m_1, \tau_1), (m_2,\tau_2), ..., (m_q, \tau_q)$, is at most $\frac{1}{|\mathcal{K}|} + \epsilon(n)$ for some negligible $\epsilon$, i.e.
 
-$$\Pr_{k \leftarrow_R \mathcal{K}}[\textit{Verify}(k, m, \tau) = 1] \le \epsilon$$
+$$\Pr_{k \leftarrow_R \mathcal{K}}[\textit{Verify}(k, m, \tau) = 1] \le \frac{1}{2^n} + \epsilon(n)$$
+```
+
+```admonish tip title="Definition Breakdown"
+The adversary $\textit{Eve}$ is free to choose the messages $m_1,m_2,...,m_q$ and is then presented with their tags $\tau_1, \tau_2, ..., \tau_q$ which are produced with the secret key $k$, i.e. $\tau_i \leftarrow \textit{Sign}(k, m_i)$. The attacker then produces a new candidate pair $(m, \tau)$, called an *existential forgery*, with the goal that this pair fools $\textit{Verify}$ when checked with the secret key $k$. The MAC system is secure if the existential forgery can fool $\textit{Verify}$ with only a negligible advantage over $\frac{1}{2^n}$. The reason for $\frac{1}{2^n}$ here is that it represents the probability that the adversary can just guess the key $k$ that was used to sign the message-tag pairs. This is a strategy which can always be employed and we consider the MAC system secure if no other strategy can do non-negligibly better.
+```
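The game from the breakdown can be sketched in code. This is only an illustration under stated assumptions: `toy_sign` is a hypothetical stand-in for $\textit{Sign}$ built on Rust's `DefaultHasher` (which is not a secure PRF), keys, messages and tags are plain `u64` values, and the names `cma_game` and `oracle` are made up for this sketch. The adversary wins only if its forgery verifies and its message was never submitted to the signing oracle.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for Sign(k, m). NOT cryptographically secure.
fn toy_sign(key: u64, message: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    message.hash(&mut hasher);
    hasher.finish()
}

/// Plays one round of the chosen-message game with a fixed key.
/// The adversary may call the signing oracle as often as it likes and
/// must return a candidate pair (message, tag).
/// Returns true when the adversary wins, i.e. the pair verifies and
/// the message was never queried.
fn cma_game<A>(key: u64, adversary: A) -> bool
where
    A: FnOnce(&mut dyn FnMut(u64) -> u64) -> (u64, u64),
{
    let mut queried: HashSet<u64> = HashSet::new();
    let mut oracle = |m: u64| {
        queried.insert(m); // remember every message the adversary asked about
        toy_sign(key, m)
    };
    let oracle_ref: &mut dyn FnMut(u64) -> u64 = &mut oracle;
    let (message, tag) = adversary(oracle_ref);

    // A forgery only counts if the message is new and the tag is valid.
    !queried.contains(&message) && toy_sign(key, message) == tag
}
```

For instance, an adversary that simply returns a pair it received from the oracle always loses, because its message is not new.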
 
-for some negligible $\epsilon$.
+Sometimes, a stronger notion of security is also used in order to take into account the scenario where the adversary, given a valid message-tag pair $(m, \tau)$, might find a different valid tag $\tau' \ne \tau$ for the same message $m$.
+
+```admonish danger title="Definition: Strong Unforgeability"
+A CMA-secure MAC system has *strong unforgeability* if for every efficient adversary $\textit{Eve}$ and any valid message-tag pair $(m, \tau)$ signed with a key $k$, the probability that $\textit{Eve}$ can find a second tag $\tau' \ne \tau$ such that $\textit{Verify}(k,m, \tau') = \textit{Verify}(k,m, \tau) = 1$ is at most $\frac{1}{|\mathcal{K}|} + \epsilon(n)$ for some negligible $\epsilon$, i.e.
+
+$$\Pr[\textit{Verify}(k, m, \textit{Eve}(m, \tau)) = 1] \le \frac{1}{2^n} + \epsilon(n)$$
 ```
 
 ```admonish tip title="Definition Breakdown"
-The adversary $\textit{Eve}$ is free to choose the messages $m_1,m_2,...,m_q$ and is then presented with their tags $t_1, t_2, ..., t_q$ which are signed with the secret key $k$, i.e. $\tau_i \leftarrow \textit{Sign}(k, m_i)$. The attacker then produces a new candidate pair $(m, \tau)$, called an *existential forgery*, with the goal that this pair fools $\textit{Verify}$ when checked with the secret key $k$. The MAC system is secure if the existential forgery can fool $\textit{Verify}$ with only an extremely small probability.
+Once again, $\frac{1}{2^n}$ is the probability that $\textit{Eve}$ can just guess the key which was used to sign the initial message-tag pair. Strong unforgeability entails that there is no strategy which can do non-negligibly better than this.
 ```
 
-The good thing about this notion of security is that it captures the case where the adversary is trying to find a second valid tag $\tau'$ for the same message $m$ with an already known tag $\tau$. However, this definition provides no protection against *replay attacks*.
+This stronger security notion is essential for some applications, but it can be safely ignored for others, which is why it is a separate definition.
+
+```admonish note
+Strong unforgeability builds on top of CMA-security. No MAC system can have strong unforgeability without being CMA-secure.
+```
 
 ### Replay Attacks
 A replay attack describes the scenario where the adversary eavesdropping on the communication channel has captured a bunch of valid message-tag pairs and later sends, or *replays*, them again. Since the pairs were generated by an authentic party and are merely being resent again by a malicious actor, they will pass verification at the receiving end with no problem.
@@ -60,6 +74,69 @@ A replay attack describes the scenario where the adversary eavesdropping on the
 Imagine that Alice really does want to transfer 100€ to Bob's account, so she sends an authentic request with a valid tag to the bank. However, if Bob copies this request on its way to the bank, Bob can later pretend to be Alice by sending the exact same message with the same valid tag and the bank will think this is a legitimate request and will transfer another 100€ to Bob's account.
 ```
 
-Message authentication codes on their own provide no protection mechanisms against such attacks which is why additional measures must be implemented.
+Message authentication codes on their own provide *no* protection against such attacks, which is why additional measures must be implemented.
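One common additional measure is to authenticate a counter together with the message and to reject any counter value that has already been used. The sketch below assumes this counter-based approach, which these notes do not prescribe. `toy_tag` is a hypothetical stand-in for a real MAC built on `DefaultHasher` (not secure), and `Receiver` and `accept` are illustrative names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for Sign(k, counter || message). NOT secure.
fn toy_tag(key: u64, counter: u64, message: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    counter.hash(&mut hasher);
    message.hash(&mut hasher);
    hasher.finish()
}

/// The receiving side (e.g. the bank) remembers the next counter it expects.
struct Receiver {
    key: u64,
    next_counter: u64,
}

impl Receiver {
    /// Accepts a request only if its counter is fresh and its tag is valid.
    fn accept(&mut self, counter: u64, message: &str, tag: u64) -> bool {
        if counter < self.next_counter {
            return false; // counter already used: this is a replay
        }
        if toy_tag(self.key, counter, message) != tag {
            return false; // tag does not match: not authentic
        }
        self.next_counter = counter + 1;
        true
    }
}
```

With this in place, the first copy of Alice's request is accepted, while a verbatim resend carries a stale counter and is rejected even though its tag is still valid.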
 
 # Implementing MACs
+Before implementing a MAC system, it is useful to talk about the internals of its $\textit{Sign}$ algorithm. The signing function can be either deterministic or non-deterministic.
+
+If $\textit{Sign}$ is deterministic, given the same message $m$ and using the same key $k$, $\textit{Sign}(k, m) = \tau$ will always output the same tag $\tau$. This is quite useful because it means that one does not have to get particularly imaginative with the verification algorithm. The $\textit{Verify}$ function will take the received message $m_r$ and generate a tag $\tau_g = \textit{Sign}(k, m_r)$  by signing the received message with the secret key. If the generated tag $\tau_g$ matches the tag $\tau_r$ received with the message, then the message is accepted.
+
+![](Resources/Images/MACs/Deterministic%20MAC.svg)
+
+On the other hand, if the signing algorithm is non-deterministic, that means that it uses internal randomness in the signing process and so $\textit{Sign}(k, m)$ will *not* necessarily produce the same tag $\tau$ when passed the same key and message as inputs. This means that the canonical verification algorithm for deterministic MACs no longer works and we have to get more creative with $\textit{Verify}$.
+
+## Implementing MACs
+[Pseudorandom function generators (PRFGs)](../Pseudorandom%20Generators%20(PRGs)/Pseudorandom%20Function%20Generators%20(PRFGs).md) are an excellent tool for creating deterministic MAC signing algorithms. 
+
+### Fixed-Length MACs
+This is the most basic type of MAC system which uses pseudorandom function generators. A fixed-length MAC uses keys and messages of the same length $n$ and also produces tags of length $n$. Such MACs are very limited because they require long keys for long messages and produce equally long tags, which is a problem because bandwidth is limited. Nevertheless, fixed-length MACs can be used to implement more sophisticated and useful systems.
+
+The signing algorithm of a fixed-length MAC $\textit{Sign}(\textit{key}: \textbf{str}[n], \textit{message}: \textbf{str}[n]) \to \textbf{str}[n]$ can be any pseudorandom function generator $\textit{PRFG}(\textit{seed}: \textbf{str}[n], \textit{idb}: \textbf{str}[n]) \to \textbf{str}[n]$ where the secret key $k$ is used as the seed and the message $m$ is the input data block, i.e.
+
+$$\textit{Sign}(k, m) \coloneqq \textit{PRFG}(k, m)$$
+
+Since the signing algorithm is just a PRFG, this is a deterministic MAC system and so we can just use the trivial verification algorithm for $\textit{Verify}$, i.e.
+
+```rust
+fn Verify(key: str[n], message: str[n], tag: str[n]) -> bool {
+	// Recompute the tag for the received message and compare it with the received tag.
+	let generated_tag = Sign(key, message);
+	return generated_tag == tag;
+}
+```
+
+Indeed, this construction turns out to be a secure MAC system so long as the PRFG used for signing is secure.
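The construction can be made concrete with a small sketch. It assumes $n = 64$, so that keys, messages and tags fit in a `u64`, and uses a hypothetical `toy_prfg` built on Rust's `DefaultHasher` as a placeholder for the PRFG. `DefaultHasher` is not a secure pseudorandom function, so this only illustrates the shape of $\textit{Sign}$ and $\textit{Verify}$.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for PRFG(seed, idb) with n = 64 bits.
/// DefaultHasher is NOT a secure PRF; it only makes the sketch runnable.
fn toy_prfg(seed: u64, idb: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    idb.hash(&mut hasher);
    hasher.finish()
}

/// Sign(k, m) := PRFG(k, m)
fn sign(key: u64, message: u64) -> u64 {
    toy_prfg(key, message)
}

/// Canonical verification for a deterministic MAC: recompute and compare.
fn verify(key: u64, message: u64, tag: u64) -> bool {
    sign(key, message) == tag
}
```

Real implementations usually compare tags in constant time to avoid leaking information through timing; the plain `==` is kept here for brevity.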
+
+~~~admonish check collapsible=true title="Proof: Security of Fixed-Length MACs"
+Suppose, towards contradiction, that there is an efficient adversary $\mathcal{A}$ which can query the pseudorandom function $\textit{Sign}_k$, obtained from $\textit{PRFG}$ with a seed $k$, with $q = \textit{poly}(n)$ messages and can thus get the message-tag pairs $(m_1, \tau_1), (m_2, \tau_2), ..., (m_q, \tau_q)$. The adversary $\mathcal{A}$ then produces a valid existential forgery $(m, \tau)$ with probability non-negligibly greater than $\frac{1}{|\mathcal{K}|}$, i.e.
+
+$$\Pr[\textit{Sign}(k, m) = \tau] \gt \frac{1}{2^n} + \xi(n)$$
+
+for some non-negligible $\xi(n)$. We can use this adversary to construct a distinguisher $D$ which can tell apart a PRF from a random function with non-negligible advantage. Indeed, suppose that $D$, and through it $\mathcal{A}$, is given oracle access to some function $\mathcal{O}$ which is either $\textit{Sign}_k$ or a truly random function, but neither knows which it is. Every signing query made by $\mathcal{A}$ is answered by $\mathcal{O}$.
+
+The distinguisher $D$ is the following.
+
+```rust
+fn D() -> bit {
+	let existential_forgery = A(); // A performs q queries and returns an existential forgery
+	
+	if existential_forgery.tag == O(existential_forgery.message) {
+		return 1;
+	}
+	else {
+		return 0;
+	}
+}
+```
+
+If the oracle function $\mathcal{O}$ is indeed $\textit{Sign}_k$, then the probability that the tag $\tau$ of the existential forgery equals $\mathcal{O}(m) \equiv \textit{Sign}_k(m)$, where $m$ is the message of the existential forgery, is greater than $\frac{1}{2^n} + \xi(n)$ and so is the probability that $D$ outputs $1$.
+
+On the other hand, if the oracle function $\mathcal{O}$ is some truly random function $H$, then the probability that the tag $\tau$ of the existential forgery equals $\mathcal{O}(m) \equiv H(m)$, where $m$ is the message of the existential forgery, is just $\frac{1}{2^n}$. Since $m$ is a new message which was never queried, the value $H(m)$ is uniformly distributed and independent of everything $\mathcal{A}$ has seen, so the powers of $\mathcal{A}$ are useless against it.
+
+Therefore,
+
+$$\begin{align} \left|\Pr[D(\textit{Sign}_k) = 1] - \Pr_{H \leftarrow_R (\{0,1\}^n \to \{0,1\}^n)}[D(H) = 1] \right| &\gt \frac{1}{2^n} + \xi(n) - \frac{1}{2^n} \\ &= \xi(n)\end{align}$$
+
+Since $\xi(n)$ is non-negligible, this contradicts the fact that $\textit{Sign}_k$ is a pseudorandom function.
+~~~
+
+Despite being very limited themselves, fixed-length MACs can be used to construct much better MAC systems.
diff --git a/...ate-Key Cryptography/One-Time Passwords/HMAC-Based One-Time Passwords (HOTP).md b/...ate-Key Cryptography/One-Time Passwords/HMAC-Based One-Time Passwords (HOTP).md
@@ -0,0 +1,2 @@
+# Introduction
+TODO
diff --git a/...ate-Key Cryptography/One-Time Passwords/Time-Based One-Time Passwords (TOTP).md b/...ate-Key Cryptography/One-Time Passwords/Time-Based One-Time Passwords (TOTP).md
@@ -0,0 +1,2 @@
+# Introduction
+Time-based one-time password (TOTP) systems provide a concrete solution for preventing base index repetition. TODO
diff --git a/Notes/Cryptography/Private-Key Cryptography/One-Time Passwords/index.md b/Notes/Cryptography/Private-Key Cryptography/One-Time Passwords/index.md
@@ -0,0 +1,47 @@
+# Introduction
+Two-factor authentication is ubiquitous in contemporary authentication systems. One of the methods used for 2FA is the so-called *authenticator app*. Whenever the server needs to validate that it really is you who is trying to log in, you just open the app and it magically produces a code which you can enter and the server magically accepts it! Furthermore, a new code appears after a given period of time, usually 30-60 seconds.
+
+But how does the authenticator app know what code to give and how does the server know when the code is correct? 
+
+# One-Time Passwords
+The code generated by the authenticator app is called a *one-time password*. Whenever you set up 2FA on your account for the first time, you will be asked to either scan a QR code with the application or manually enter an alphanumeric string into the authenticator application, called a *seed*, which is then stored on both the server and in your authenticator app. This seed should *never* be shared with anyone else.
+
+From then on, one-time passwords are generated using a [pseudorandom function generator (PRFG)](../../Pseudorandom%20Generators%20(PRGs)/Pseudorandom%20Function%20Generators%20(PRFGs).md). One example procedure for a one-time password authentication uses a publicly known one-bit PRFG $G(seed: \textbf{str}[S], index: \textbf{int}[0..2^S]) \to \textbf{bit}$. Whenever you log in, the server sends a random base index $i_0$, which is an integer between $0$ and $2^S - 1$ inclusive, and a security parameter $l$. Your authenticator app then uses the secret seed $s$ and the PRFG $G$ to generate $l$ bits, starting from the base index the server provided. The one-time password is then simply the concatenation of the bits $G(s, i_0)G(s,i_0 + 1)\cdots G(s,i_0 + l - 1)$. This resulting binary string can be converted into a decimal number so that it is easy for a human, i.e. you, to write it in the prompt on the log-in page.
+
+When the server receives your code, it generates its own code by using the secret seed $s$, the same base index $i_0$ and the same security parameter $l$. It then compares its own code with the code you sent and if they match, you are authenticated. Since both used the exact same base index and security parameter, the only way for your code to match the server's is if you also used the same secret seed $s$, thus proving your authenticity. 
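The procedure above can be sketched as follows. The sketch assumes a 64-bit seed and base index, at most 64 output bits, and wrap-around when the index overflows, none of which is fixed by the description. `toy_g` is a hypothetical one-bit stand-in for $G$ built on `DefaultHasher`, which is not a secure PRFG.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical one-bit stand-in for G(seed, index). NOT secure.
fn toy_g(seed: u64, index: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    index.hash(&mut hasher);
    hasher.finish() & 1 // keep a single bit of the hash
}

/// Concatenates G(s, i0), G(s, i0 + 1), ..., G(s, i0 + l - 1) into one integer.
fn one_time_password(seed: u64, base_index: u64, l: u32) -> u64 {
    assert!(l <= 64, "this sketch packs the code into a u64");
    let mut code = 0u64;
    for offset in 0..u64::from(l) {
        // Assumption: indices wrap around on overflow.
        let bit = toy_g(seed, base_index.wrapping_add(offset));
        code = (code << 1) | bit;
    }
    code
}

/// The server recomputes the code from its own copy of the seed and compares.
fn server_accepts(seed: u64, base_index: u64, l: u32, submitted: u64) -> bool {
    one_time_password(seed, base_index, l) == submitted
}
```

Both sides call `one_time_password` with the same seed, base index and security parameter, so an honest user's code always matches the server's, while a code with even one flipped bit is rejected.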
+
+```admonish note
+In practice, one-time password systems use PRFGs which output more than a single bit.
+```
+
+## Security of One-Time Passwords
+What does it mean for a one-time password system to be secure? Well, the server either rejects or accepts your log in depending on the code you sent it. An adversary won't have access to the secret seed, so the most basic strategy, which is always possible to do, is to attempt to guess the code. The probability of the adversary just guessing the code is $\frac{1}{2^l}$, since there are a total of $2^l$ possible codes. This motivates the following definition of security for one-time passwords.
+
+```admonish danger title="Definition: Security of One-Time Passwords"
+A one-time password system with a seed $s$ of length $S$, base index $i_0 \in \{0,1,..., 2^S - 1\}$ and security parameter $l$ is *secure* if for every efficient adversary $\textit{Eve}(i_0: \textbf{int}[0..2^S], l: \textbf{int}) \to \textbf{str}[l]$ who knows the base index and the security parameter, the probability that $\textit{Eve}$ will be authenticated by the server without knowledge of the secret seed is at most $\frac{1}{2^l} + \epsilon(S)$ for some negligible $\epsilon$, i.e.
+
+$$\Pr[\textit{Server}(\textit{Eve}(i_0, l)) = \text{ authenticated }] \le \frac{1}{2^l} + \epsilon(S)$$
+```
+
+```admonish tip title="Definition Breakdown"
+A one-time password system is secure if there is no adversary that, given the base index $i_0$ and security parameter $l$, can guess what code the server will generate with probability non-negligibly better than $\frac{1}{2^l}$.
+```
+
+From this definition we see that the security of a one-time password depends heavily on the security parameter $l$. If security is to be achieved, the security parameter must be at most as long as the seed, i.e. $l \le S$. Otherwise, an adversary can attempt to simply guess the seed with probability $\frac{1}{2^S}$. Since the seed would be shorter than the security parameter, there would be fewer possible seeds than possible codes and $\frac{1}{2^S}$ would be non-negligibly greater than $\frac{1}{2^l}$. However, making the security parameter short, i.e. $l \lt S$, is also unreasonable since it would increase the overall likelihood that an adversary guesses the code. Ergo, the Goldilocks value for the security parameter is the length of the seed, i.e. $l = S$.
+
+Indeed, using this definition, we can prove that the aforementioned one-time password system is secure so long as the PRFG it uses is.
+
+```admonish check collapsible=true title="Proof: Security of Example One-Time Password"
+TODO
+```
+
+### Replay Attacks
+It is paramount that the same base index is never used twice in order to thwart replay attacks. If an adversary eavesdrops on the connection between you and the server, they can store the base index and the code you send to the server in every two-factor authentication session.
+
+The adversary can later try to authenticate and if the server sends them a base index which they previously recorded from you, then they also know the correct code for this index and will successfully authenticate. 
+
+```admonish warning
+The same base index should never be reused.
+```
+
+A random base index is just a fairly easy way to achieve this non-repetition of indices because even if the index is just 128 bits in length, the probability that a freshly chosen index equals a given previously used one is $\frac{1}{2^{128}}$, which is ridiculously low.
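As a rough worked bound, assume the server picks each base index uniformly and independently among all 128-bit values, and write $q$ (a symbol introduced here) for the number of authentication sessions. A union bound over all pairs of sessions then gives

$$\Pr[\text{some base index repeats}] \le \binom{q}{2} \cdot \frac{1}{2^{128}} = \frac{q(q-1)}{2^{129}}$$

For example, even after $q = 2^{32}$ sessions this is less than $\frac{2^{64}}{2^{129}} = 2^{-65}$, so repetition remains extremely unlikely.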
diff --git a/...yptography/Private-Key Cryptography/Resources/Images/MACs/Deterministic MAC.svg b/...yptography/Private-Key Cryptography/Resources/Images/MACs/Deterministic MAC.svg
diff --git a/Notes/Cryptography/Private-Key Cryptography/Stream Ciphers/index.md b/Notes/Cryptography/Private-Key Cryptography/Stream Ciphers/index.md
@@ -30,7 +30,7 @@ The purpose of the initialisation vector is to allow for key reuse. So long as t
 # Security
 A stream cipher is [semantically-secure](../Security%20Notions/Semantic%20Security.md) so long as it uses a [secure PRG](../../Pseudorandom%20Generators%20(PRGs)/index.md#admonition-definition-secure-pseudorandom-generator-prg).
 
-```admonish check title="Proof: Semantic Security of Stream Ciphers"
+```admonish check collapsible=true title="Proof: Semantic Security of Stream Ciphers"
 We are given a stream cipher $(\textit{Enc},\textit{Dec})$ which uses a secure pseudorandom generator $\textit{Gen}(seed: \textbf{str}[S]) \to \textbf{str}[R]$ under the hood and we need to prove that the cipher is semantically secure.
 
 Essentially, it all boils down to the security of the one-time pad. If instead of using a generator the message $m_b$ was XOR-ed with a truly random string $r \leftarrow_R \{0,1\}^l$, then we get a one-time pad which is perfectly secret (and by extension also semantically secure), i.e.
@@ -41,7 +41,7 @@ Suppose, towards contradiction, that there was an adversary $\mathcal{A}$ which
 
 $$\Pr_{k\leftarrow_R \mathcal{K}, b \leftarrow_R \{0,1\}}[\mathcal{A}(\textit{Enc}(m_b)) = m_b] \gt \frac{1}{2} + \xi(n)$$
 
-for some non-negligible $\Xi(n)$. This can be rewritten as
+for some non-negligible $\xi(n)$. This can be rewritten as
 
 $$\Pr_{k\leftarrow_R \mathcal{K}, b \leftarrow_R \{0,1\}}[\mathcal{A}(\textit{Gen}(s) \oplus m_b) = m_b] \gt \frac{1}{2} + \xi(n)$$
 

diff --git a/...aphy/Pseudorandom Generators (PRGs)/Pseudorandom Function Generators (PRFGs).md b/...aphy/Pseudorandom Generators (PRGs)/Pseudorandom Function Generators (PRFGs).md
@@ -60,4 +60,7 @@ fn PRFG(seed: str[S], idb: str[S]) -> str[l_out] {
 	let f = get_function_from_seed(seed);
 	return f(idb);
 }
-```
+```
+
+## PRFGs from PRGs
+But how can we construct a PRFG algorithm? Well, as it turns out, [pseudorandom generators](index.md) can be used to construct such algorithms. In particular, a PRG $G(seed: \textbf{str}[S]) \to \textbf{str}[2S]$, which takes a seed of length $S$ and outputs a pseudorandom string of double that length, can be used to construct a pseudorandom function generator $PRFG(seed: \textbf{str}[S], idb: \textbf{str}[S]) \to \textbf{str}[S]$. TODO