Attacking RSA for fun and CTF points

Attacking-RSA - This article is part of a series.

Part 1: This Article

Part 2: Attacking RSA for fun and CTF points - part 2

Part 3: Attacking RSA for fun and CTF points - part 3

Part 4: Attacking RSA for fun and CTF points - part 4

Introduction
#

RSA is my favorite cryptosystem. 🙂 It’s simple and powerful.

In this series I will try to go through every attacks (that I’m aware of) against RSA which are useful for solving CTF tasks. I’m not going to give you scripts that will do all the work for you but rather explain how the attacks work. The aim of this series is to understand the attacks you use and which one is most appropriate depending on the task. I will try to be beginner friendly and repeat myself in the beginning but afterwards I will assume that the reader has learnt the concepts.

Before entering into the details on how to break RSA based challenges, let’s see how textbook RSA works. I’m not talking about RSA used with padding as it is in real world cryptography.

How simple maths can keep your data private
#

RSA is based on simple modular arithmetics. It doesn’t require a lot of maths knowledge to understand how it works. As it’s an asymmetric cipher, you have two keys, a public key containing the couple ($n$, $e$) and a private key containing a bunch of information but mainly the couple ($n$, $d$).

Here comes the most important part, this must be fully understood in order to understand the attacks that will be described. The modulus $n$ is chosen from two primes:

$n=p*q$ with $p$ and $q$ being prime integers (chosen carefully)

To choose a valid public exponent $e$ you first need to compute $\varphi(n) = (p-1)(q-1)$, also called Euler’s totient function.

Note: Now, in real world RSA, you don’t use Euler’s totient function but rather Carmichael’s totient function.

Then $e$ must verify those 2 conditions:

$1<e<\varphi(n)$ and $gcd(e, \varphi(n))=1$, that means that $e$ and $\varphi(n)$ are coprime. Usually, we choose $e = 65537$.

Now you have a valid public key, we just need to compute the private exponent $d$ as the modular inverse of $e$ modulo $\varphi(n)$. In short, find a value for $d$ such that $d*e \equiv 1 \mod \varphi(n)$

Note: Keys are usually encoded in PEM or DER format but it’s not relevant for this example.

The encryption process
#

Let’s say you want to encrypt this message with your public key : “My credit card number is 1337”.

Public key :

n = 30994968412821274638126108542140224647370292100079091608343041083209715023181825537637957453183815788151099869840363450721
e = 65537

You’ll first need to transform it into a number. Usually it’s done like this :

"My credit card number is 1337".encode('hex')
# '4d79206372656469742063617264206e756d6265722069732031333337'
m = 0x4d79206372656469742063617264206e756d6265722069732031333337
m
# 2088672004503895363248317162088008321096572194316716175821104101929783

You then compute $c = m^e \mod n$ where $c$ is the resulting ciphertext. Only you (who have the private key) will be able to decrypt it. At least in theory, because that’s what this series is all about, attacking RSA. 🙂

c = pow(m,e,n)
c
3740808283126743789473658216888004237756151970385422112230702175214670415045578511813428786937523016996521109011952458274

The decryption process
#

To decrypt a ciphertext you need to know the private key corresponding to the public key used to encrypt it. Private key :

n = 30994968412821274638126108542140224647370292100079091608343041083209715023181825537637957453183815788151099869840363450721
d = 10949944362147351445695313961215384000802056441294706923101734114824865877971959648683318864984560110549528540371119079473

Once you have everything, you do the same operation as for the encryption except you use $d$ as the exponent instead of $e$ :

$m = c^d \mod n$ where $m$ is the resulting plaintext.

m_ = pow(c,d,n)
m_
# 2088672004503895363248317162088008321096572194316716175821104101929783
assert m == m_
hex(m)
# '0x4d79206372656469742063617264206e756d6265722069732031333337'
"4d79206372656469742063617264206e756d6265722069732031333337".decode('hex')
# 'My credit card number is 1337'

Key structure
#

I told you at the beginning that key are usually encoded in PEM format. Here is what they really look like :

-----BEGIN PUBLIC KEY-----
MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBAIw/U51Fghh6WumZQjg9l3a6AjFZ+xm2
x2+9ja+8n8Yg95Hbxsp9vCpwlIol1A5wMo6p/hNlxzAE3/cY08eKzDMCAwEAAQ==
-----END PUBLIC KEY-----

I’m not going to detail the format used. PEM is just ASN.1 data put between headers. ASN.1 is a formatted binary representation of data and it uses base64 for the final encoding.

You can use an online decoder, OpenSSL or Python to extract the data from the keys.

from Crypto.PublicKey import RSA

f = open('public.pem','r')
key = RSA.importKey(f.read())
print(key.n)
print(key.e)

Let’s verify that the public key only contains $n$ and $e$.

This above is the ASN.1 structure of the public key. There is some metadata indicating the encryption type and at the end there are two integers: the 512 bit $n$ and the classic $e = 65537$.

I told you that a private key contains more than just $d$ and $n$. This is the ASN.1 syntax for PKCS1:

An RSA private key should be represented with the ASN.1 type
RSAPrivateKey:

  RSAPrivateKey ::= SEQUENCE {
      version           Version,
      modulus           INTEGER,  -- n
      publicExponent    INTEGER,  -- e
      privateExponent   INTEGER,  -- d
      prime1            INTEGER,  -- p
      prime2            INTEGER,  -- q
      exponent1         INTEGER,  -- d mod (p-1)
      exponent2         INTEGER,  -- d mod (q-1)
      coefficient       INTEGER,  -- (inverse of q) mod p
      otherPrimeInfos   OtherPrimeInfos OPTIONAL
  }

You’re free to try it out yourself just like we did with the public key. 🙂

Factorizing the public key
#

The strength of RSA relies on the fact that you need to factor $n$ to obtain $d$ and there is no known algorithm that can do that efficiently for large numbers. Why do you have to factor $n$ ? Because $d*e \equiv 1 \mod \varphi(n)$ and the only way to get $\varphi(n)$ is by knowing $p$ and $q$.

You can perform a brute force attack to determine a possible $p$. This can be done when $n$ is very small. The recommendation is to use an $n$ which is at least 2048 bits long.

Even if you need to break a bigger $n$, sometimes it’s a known one and it’s factors are already in a database !

Example :

n = 9567648541342714273618397561214215397959

it’s a known factor number :

p = 70636931
q = 135448247905089679980836052478189

The first thing to try is to see if $n$ is known, it’s a quick check that might save you a lot of time. 😉

Keep in mind that this is very basic and usually you wont be able to factor the modulus.

If you can’t factor the modulus, what can you do ? Simply abuse the flaws in the implementation or the choice of the values ! There is so many ways one can screw up its implementation of RSA. 😁

Common modulus
#

One such way is using the same modulus with different public exponents.

As an external attacker
#

In this scenario, imagine that several persons are communicating over a public network. Because they didn’t want to generate a different modulus for each person, they decided that every person will have the same modulus. Each user $i$ will have a unique public key ($N$, $e_i$) and unique private key ($N$, $d_i$). You managed to intercept an important message $M$ sent to users A and B. But because it’s encrypted, you have in your possession $C_A$ and $C_B$. How can you recover it ?

Compute $gcd(e_A,e_B)$, it should be equal to 1. Using Bézout’s identity we know that there exist $u$ and $v$ so that $e_A*u + e_B*v = 1$. You can compute $u$ and $v$ using the extended Euclidean algorithm.

$C_A = M^{e_A} \mod N$ and $C_B = M^{e_B} \mod N$

$C_A^u = M^{e_A^u} = M^{e_A*u} \mod N$ do the same to $C_B$

You can now multiply both to make the Bézout’s identity appear.

$M^{e_A*u} * M^{e_B*v} = M^{e_A*u+e_B*v} = M^1 = M \mod N$

Example :

n = 19085995833312192524007220630153244389942263922006889142154298425751808612835625879164268530070480609

Here are the public keys of person 1 and 2 and the ciphertexts you intercepted:

e1 = 31
e2 = 71
c1 = 6754157603566559210605055806173167464578011342930319568190139207096747909338872956835503565519657656L
c2 = 15442865769085690326152463737212582797117727243803209188030346754687972404658825954014788039636105165L

Apply Bézout’s identity and find $u = 55$ and $v = -24$. Because $v < 0$ you’ll need to compute the inverse of $c_2$ modulo $n$ which is :

c2_inv = 12909978039651622455828981512398791612880793088232603583312672024505111979731377532780209633970663146

Now you just compute $M = c_1^{55}(\frac{1}{c_2})^{24}$ which gives you the decrypted message :

M
# 101519529085530394070280463104338208011199968387105
hex(M)
# '0x45766520697320737079696e67206f6e2075732021'
"45766520697320737079696e67206f6e2075732021".decode('hex')
# 'Eve is spying on us !'

As an internal attacker
#

Previous case required you to intercept two identical messages. In this case it’s assumed that you are part of the group and in possession of a public and private key with the same modulus as the other persons. You will be able to decrypt every message you intercept on the network. How ?

You know from $e*d \equiv 1 \mod \varphi(n)$ that $e*d - 1$ is a multiple of $\varphi(n)$. In other words $(e*d-1) = k*\varphi(n)$.

You can approximate $\varphi(n)$ to simply $n$ and compute $k$ by rounding up the result:

$k = \frac{e*d-1}{n}$.

Then $\varphi(n) = \frac{e*d-1}{k}$.

If the result is not an integer, increment $k$ until it is. That’s because of the approximation.

Now that you know $\varphi(n)$ and the public exponent of your victims, you can compute the corresponding private keys $d_{victim} = e_{victim}^{-1} \mod \varphi(n)$ and decrypt the messages.

Example:

Ignore the fact that this $n$ has known factors (or use that to verify the results).

n = 1249110767794010895540410194153

The public key of the CEO is :

e_CEO = 3

Your key pair is :

e = 65537
d = 205119704640110252892051812353

k = ((e*d)-1)/n
k
# 10761

Try to compute $\varphi(n)$ :

phi = ((e*d)-1)/k
phi
# 1249226845367429202098912705713

Verify that you got the right value for $\varphi(n)$ :

phi*k == ((e*d)-1)
# False

It’s not, so you must increment $k$ and retry :

k = k+1
phi = ((e*d)-1)/k
phi*k == ((e*d)-1)
# True
phi
# 1249110767793988630717933434880

Now you can compute the private key of your CEO ! 😁

d_CEO = 832740511862659087145288956587

You can even completely factor $n$ if you want but I won’t describe how, because at this point we already won. 😁

Decipher oracle
#

Sometimes you will encounter special services offering to decrypt any ciphertext you give them. Any, except for one specific ciphertext (usually the flag). Such services are called “Oracles” because they give you additional information you wouldn’t be able to get otherwise. There are other attacks that abuse oracles like padding oracle and LSB oracle.

Note: As you may guess this is not something you will often encounter in reality because what’s the point of it ? Moreover, this attack only works with textbook RSA because the use of padding makes it not exploitable.

With that said, take a look at how you can craft a ciphertext to trick the oracle to give you the flag.

You send a ciphertext $c$ to the server and receive a plaintext $m$ in return. You already know that the server computes $m = c^d \mod n$ with $c = \text{flag}^e \mod n$.

Because the server check that you don’t ask for the decryption of the flag, you can’t give him $c$ right away, you need to modify it in a way to trick him into thinking it’s something else. Your modification must be carefully chosen so you can revert the process once you get the response of the server.

For instance, you can’t just add one and expect to subtract 1 from the output : ${(c+1)}^d \neq \text{flag}+1$

The trick is the multiply the ciphertext with another ciphertext $c_2$ from which we know the plaintext. $c_2 = 2^e \mod n$. You will understand why I chose 2 as the plaintext once you see the result of the server.

Now the new ciphertext that you will send to the server will be $C = c*c_2 = {\text{flag}^e}*{2^e} = (2*\text{flag})^e$

The server will give you back $m = ((2*\text{flag})^e)^d \mod n = 2*\text{flag}$

Now you can divide $m$ by 2 and get the flag. 🙂

You must choose a small value because the computations are made modulo $n$, so if the result gets to big you wont be able to know the real value.

Conclusion
#

This first part was oriented around the basics of RSA so everyone could follow and maybe learn something. The two attacks described here are among the easiest to exploit as they don’t require tools or a lot of coding. It was a lot of recaps for more advanced readers, so I hope you didn’t get bored. 🙂

In the next parts I will focus on more advanced attacks.

Attacking-RSA - This article is part of a series.

Part 1: This Article

Part 2: Attacking RSA for fun and CTF points - part 2

Part 3: Attacking RSA for fun and CTF points - part 3

Part 4: Attacking RSA for fun and CTF points - part 4

Introduction#

How simple maths can keep your data private#

The encryption process#

The decryption process#

Key structure#

Factorizing the public key#

Common modulus#

As an external attacker#

As an internal attacker#

Decipher oracle#

Conclusion#