Introduction#
RSA is my favorite cryptosystem. 🙂 It’s simple and powerful.
In this series I will try to go through every attacks (that I’m aware of) against RSA which are useful for solving CTF tasks. I’m not going to give you scripts that will do all the work for you but rather explain how the attacks work. The aim of this series is to understand the attacks you use and which one is most appropriate depending on the task. I will try to be beginner friendly and repeat myself in the beginning but afterwards I will assume that the reader has learnt the concepts.
Before entering into the details on how to break RSA based challenges, let’s see how textbook RSA works. I’m not talking about RSA used with padding as it is in real world cryptography.
How simple maths can keep your data private#
RSA is based on simple modular arithmetics. It doesn’t require a lot of maths knowledge to understand how it works. As it’s an asymmetric cipher, you have two keys, a public key containing the couple ($n$, $e$) and a private key containing a bunch of information but mainly the couple ($n$, $d$).
Here comes the most important part, this must be fully understood in order to understand the attacks that will be described. The modulus $n$ is chosen from two primes:
$n=p*q$ with $p$ and $q$ being prime integers (chosen carefully)
To choose a valid public exponent $e$ you first need to compute $\varphi(n) = (p-1)(q-1)$, also called Euler’s totient function.
Then $e$ must verify those 2 conditions:
$1<e<\varphi(n)$ and $gcd(e, \varphi(n))=1$, that means that $e$ and $\varphi(n)$ are coprime. Usually, we choose $e = 65537$.
Now you have a valid public key, we just need to compute the private exponent $d$ as the modular inverse of $e$ modulo $\varphi(n)$. In short, find a value for $d$ such that $d*e \equiv 1 \mod \varphi(n)$
The encryption process#
Let’s say you want to encrypt this message with your public key : “My credit card number is 1337”.
Public key :
n = 30994968412821274638126108542140224647370292100079091608343041083209715023181825537637957453183815788151099869840363450721
e = 65537
You’ll first need to transform it into a number. Usually it’s done like this :
"My credit card number is 1337".encode('hex')
# '4d79206372656469742063617264206e756d6265722069732031333337'
m = 0x4d79206372656469742063617264206e756d6265722069732031333337
m
# 2088672004503895363248317162088008321096572194316716175821104101929783
You then compute $c = m^e \mod n$ where $c$ is the resulting ciphertext. Only you (who have the private key) will be able to decrypt it. At least in theory, because that’s what this series is all about, attacking RSA. 🙂
c = pow(m,e,n)
c
3740808283126743789473658216888004237756151970385422112230702175214670415045578511813428786937523016996521109011952458274
The decryption process#
To decrypt a ciphertext you need to know the private key corresponding to the public key used to encrypt it. Private key :
n = 30994968412821274638126108542140224647370292100079091608343041083209715023181825537637957453183815788151099869840363450721
d = 10949944362147351445695313961215384000802056441294706923101734114824865877971959648683318864984560110549528540371119079473
Once you have everything, you do the same operation as for the encryption except you use $d$ as the exponent instead of $e$ :
$m = c^d \mod n$ where $m$ is the resulting plaintext.
m_ = pow(c,d,n)
m_
# 2088672004503895363248317162088008321096572194316716175821104101929783
assert m == m_
hex(m)
# '0x4d79206372656469742063617264206e756d6265722069732031333337'
"4d79206372656469742063617264206e756d6265722069732031333337".decode('hex')
# 'My credit card number is 1337'
Key structure#
I told you at the beginning that key are usually encoded in PEM format. Here is what they really look like :
-----BEGIN PUBLIC KEY-----
MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBAIw/U51Fghh6WumZQjg9l3a6AjFZ+xm2
x2+9ja+8n8Yg95Hbxsp9vCpwlIol1A5wMo6p/hNlxzAE3/cY08eKzDMCAwEAAQ==
-----END PUBLIC KEY-----
-----BEGIN RSA PRIVATE KEY-----
MIIBOQIBAAJBAIw/U51Fghh6WumZQjg9l3a6AjFZ+xm2x2+9ja+8n8Yg95Hbxsp9
vCpwlIol1A5wMo6p/hNlxzAE3/cY08eKzDMCAwEAAQJAJearQxJYwSK31O9dDPPg
Le7AzvOBP4a8yP7R/o8cIp+3XdCXzuUreFzTWTXIg76tohg8cQb77HT/jVo2rLXa
AQIhAOrtFkJ0So2NZIp4xBPLqFozaSJNti8Yx8w1IOWoS2szAiEAmNQCPrBaB6p4
heIDYgaTYpJa4gbw3tLe82AAKzFLGwECIE/ZA37Uzd4s16ZlA6gCyZbW8H3zUd/S
GV6kFClauT+XAiBZuddbkNQ6vfYmvIw56Bxt+flLzMFsQSfOgaV3tmgfAQIgKW7C
LI1+rBn3TvmyLMZ7+3TEtVeTVRgabLWyOUjmv7w=
-----END RSA PRIVATE KEY-----
I’m not going to detail the format used. PEM is just ASN.1 data put between headers. ASN.1 is a formatted binary representation of data and it uses base64 for the final encoding.
You can use an online decoder, OpenSSL or Python to extract the data from the keys.
from Crypto.PublicKey import RSA
f = open('public.pem','r')
key = RSA.importKey(f.read())
print(key.n)
print(key.e)
Let’s verify that the public key only contains $n$ and $e$.
This above is the ASN.1 structure of the public key. There is some metadata indicating the encryption type and at the end there are two integers: the 512 bit $n$ and the classic $e = 65537$.
I told you that a private key contains more than just $d$ and $n$. This is the ASN.1 syntax for PKCS1:
An RSA private key should be represented with the ASN.1 type
RSAPrivateKey:
RSAPrivateKey ::= SEQUENCE {
version Version,
modulus INTEGER, -- n
publicExponent INTEGER, -- e
privateExponent INTEGER, -- d
prime1 INTEGER, -- p
prime2 INTEGER, -- q
exponent1 INTEGER, -- d mod (p-1)
exponent2 INTEGER, -- d mod (q-1)
coefficient INTEGER, -- (inverse of q) mod p
otherPrimeInfos OtherPrimeInfos OPTIONAL
}
You’re free to try it out yourself just like we did with the public key. 🙂
Factorizing the public key#
The strength of RSA relies on the fact that you need to factor $n$ to obtain $d$ and there is no known algorithm that can do that efficiently for large numbers. Why do you have to factor $n$ ? Because $d*e \equiv 1 \mod \varphi(n)$ and the only way to get $\varphi(n)$ is by knowing $p$ and $q$.
You can perform a brute force attack to determine a possible $p$. This can be done when $n$ is very small. The recommendation is to use an $n$ which is at least 2048 bits long.
Even if you need to break a bigger $n$, sometimes it’s a known one and it’s factors are already in a database !
Example :
n = 9567648541342714273618397561214215397959
it’s a known factor number :
p = 70636931
q = 135448247905089679980836052478189
The first thing to try is to see if $n$ is known, it’s a quick check that might save you a lot of time. 😉
Keep in mind that this is very basic and usually you wont be able to factor the modulus.
If you can’t factor the modulus, what can you do ? Simply abuse the flaws in the implementation or the choice of the values ! There is so many ways one can screw up its implementation of RSA. 😁
Common modulus#
One such way is using the same modulus with different public exponents.
As an external attacker#
In this scenario, imagine that several persons are communicating over a public network. Because they didn’t want to generate a different modulus for each person, they decided that every person will have the same modulus. Each user $i$ will have a unique public key ($N$, $e_i$) and unique private key ($N$, $d_i$). You managed to intercept an important message $M$ sent to users A and B. But because it’s encrypted, you have in your possession $C_A$ and $C_B$. How can you recover it ?
Compute $gcd(e_A,e_B)$, it should be equal to 1. Using Bézout’s identity we know that there exist $u$ and $v$ so that $e_A*u + e_B*v = 1$. You can compute $u$ and $v$ using the extended Euclidean algorithm.
$C_A = M^{e_A} \mod N$ and $C_B = M^{e_B} \mod N$
$C_A^u = M^{e_A^u} = M^{e_A*u} \mod N$ do the same to $C_B$
You can now multiply both to make the Bézout’s identity appear.
$M^{e_A*u} * M^{e_B*v} = M^{e_A*u+e_B*v} = M^1 = M \mod N$
Example :
n = 19085995833312192524007220630153244389942263922006889142154298425751808612835625879164268530070480609
Here are the public keys of person 1 and 2 and the ciphertexts you intercepted:
e1 = 31
e2 = 71
c1 = 6754157603566559210605055806173167464578011342930319568190139207096747909338872956835503565519657656L
c2 = 15442865769085690326152463737212582797117727243803209188030346754687972404658825954014788039636105165L
Apply Bézout’s identity and find $u = 55$ and $v = -24$. Because $v < 0$ you’ll need to compute the inverse of $c_2$ modulo $n$ which is :
c2_inv = 12909978039651622455828981512398791612880793088232603583312672024505111979731377532780209633970663146
Now you just compute $M = c_1^{55}(\frac{1}{c_2})^{24}$ which gives you the decrypted message :
M
# 101519529085530394070280463104338208011199968387105
hex(M)
# '0x45766520697320737079696e67206f6e2075732021'
"45766520697320737079696e67206f6e2075732021".decode('hex')
# 'Eve is spying on us !'
As an internal attacker#
Previous case required you to intercept two identical messages. In this case it’s assumed that you are part of the group and in possession of a public and private key with the same modulus as the other persons. You will be able to decrypt every message you intercept on the network. How ?
You know from $e*d \equiv 1 \mod \varphi(n)$ that $e*d - 1$ is a multiple of $\varphi(n)$. In other words $(e*d-1) = k*\varphi(n)$.
You can approximate $\varphi(n)$ to simply $n$ and compute $k$ by rounding up the result:
$k = \frac{e*d-1}{n}$.
Then $\varphi(n) = \frac{e*d-1}{k}$.
If the result is not an integer, increment $k$ until it is. That’s because of the approximation.
Now that you know $\varphi(n)$ and the public exponent of your victims, you can compute the corresponding private keys $d_{victim} = e_{victim}^{-1} \mod \varphi(n)$ and decrypt the messages.
Example:
Ignore the fact that this $n$ has known factors (or use that to verify the results).
n = 1249110767794010895540410194153
The public key of the CEO is :
e_CEO = 3
Your key pair is :
e = 65537
d = 205119704640110252892051812353
k = ((e*d)-1)/n
k
# 10761
Try to compute $\varphi(n)$ :
phi = ((e*d)-1)/k
phi
# 1249226845367429202098912705713
Verify that you got the right value for $\varphi(n)$ :
phi*k == ((e*d)-1)
# False
It’s not, so you must increment $k$ and retry :
k = k+1
phi = ((e*d)-1)/k
phi*k == ((e*d)-1)
# True
phi
# 1249110767793988630717933434880
Now you can compute the private key of your CEO ! 😁
d_CEO = 832740511862659087145288956587
You can even completely factor $n$ if you want but I won’t describe how, because at this point we already won. 😁
Decipher oracle#
Sometimes you will encounter special services offering to decrypt any ciphertext you give them. Any, except for one specific ciphertext (usually the flag). Such services are called “Oracles” because they give you additional information you wouldn’t be able to get otherwise. There are other attacks that abuse oracles like padding oracle and LSB oracle.
With that said, take a look at how you can craft a ciphertext to trick the oracle to give you the flag.
You send a ciphertext $c$ to the server and receive a plaintext $m$ in return. You already know that the server computes $m = c^d \mod n$ with $c = \text{flag}^e \mod n$.
Because the server check that you don’t ask for the decryption of the flag, you can’t give him $c$ right away, you need to modify it in a way to trick him into thinking it’s something else. Your modification must be carefully chosen so you can revert the process once you get the response of the server.
For instance, you can’t just add one and expect to subtract 1 from the output : ${(c+1)}^d \neq \text{flag}+1$
The trick is the multiply the ciphertext with another ciphertext $c_2$ from which we know the plaintext. $c_2 = 2^e \mod n$. You will understand why I chose 2 as the plaintext once you see the result of the server.
Now the new ciphertext that you will send to the server will be $C = c*c_2 = {\text{flag}^e}*{2^e} = (2*\text{flag})^e$
The server will give you back $m = ((2*\text{flag})^e)^d \mod n = 2*\text{flag}$
Now you can divide $m$ by 2 and get the flag. 🙂
You must choose a small value because the computations are made modulo $n$, so if the result gets to big you wont be able to know the real value.
Conclusion#
This first part was oriented around the basics of RSA so everyone could follow and maybe learn something. The two attacks described here are among the easiest to exploit as they don’t require tools or a lot of coding. It was a lot of recaps for more advanced readers, so I hope you didn’t get bored. 🙂
In the next parts I will focus on more advanced attacks.