Skip to main content

Attacking RSA for fun and CTF points - part 4

·7 mins·
Crypto Ctf Rsa
Table of Contents
Attacking-RSA - This article is part of a series.
Part 4: This Article

It’s been a long time since part 3 of this series. So to quietly resume our journey in the beautiful world of mathematics I propose you 4 rather simple topics :

  • Multi-prime RSA
  • Brute force attack on small secret CRT-Exponents
  • Fault attack on signatures
  • Twin primes

Multi-prime RSA
#

In part 1 we saw the basics of RSA and how to generate a key pair using 2 prime numbers, but did you know that it’s theoretically possible to use more than 2 primes to generate a key pair ?

By constructing a key using $k$ distinct primes, you would have :

$$n = \prod_{i=1}^k p_i$$

\[\varphi(n) = \prod_{i=1}^k (p_i-1)\]

For the rest, it’s the same as when using 2 primes.

Example

n = 13*17*89*101
e = 23
phi = 12*16*88*100
d = gmpy2.invert(e,phi)
d
# 587687
m = 31337
c = pow(m,e,n)
c
# 612133
m == pow(c, d, n)
# True

You might wonder, why would anyone want to do that ? Wouldn’t it make it less secure because the primes would be smaller for the same key size ?

While it’s correct that the factors will be smaller if you use more primes for a given key size, it’s not a major security issue as long as they don’t become too small. If the individual primes can be recovered using known factorization algorithm like ECM, then it’s an obvious bad idea. But otherwise, as long as the total size of the modulus is large enough, factorization is not a threat. The more primes you want to use for your key pair, the larger the modulus should be.

Using more than 2 primes offers performance improvements thanks to the Chinese Remainder Theorem (CRT). The speedup is about $k^2$ with $k$ the number of factors used.

Brute force attack on small secret CRT-Exponents
#

We’ve seen that the CRT can be used to speedup RSA encryption and decryption. In this case we are interested in the decryption process.

In order to decrypt RSA using the CRT, you have to compute :

\[\begin{cases}d_p \equiv e^{-1} \mod p-1\\d_q \equiv e^{-1} \mod q-1\end{cases}\]

If the value of $e$ is poorly chosen it’s possible that $d_p$ or $d_q$ is so small that you can find it with pure brute force.

How do we know if we have found the correct value for $d_p$ or $d_q$ though ? And what will we use it for ?

From the definition of $d_p$:

$d_p * e \equiv 1 \mod p-1$

$d_p * e = 1 + k*(p-1)$

Given any number $m$ coprime to $p$ (i.e. $gcd(m, p) = 1$) :

$m^{e*d_p} = m^{1 + k*(p-1)} = m*m^{(p-1)^k}$

From Fermat’s little theorem :

$m*m^{(p-1)^k} \equiv m * 1^k \equiv m \mod p$

So we have :

$m^{e*d_p} \equiv m \mod p$

Or rewritten without the modulo :

$m - m^{e*d_p} = k*p$

As an attacker, you know $e$ and $n$ and you guess $d_p$. To verify your guess, you can choose any number $m < n$ and compute :

$gcd(m - m^{e*d_p}, n) = gcd(k*p, p*q) = p$

For a correct value of $d_p$.

If your guess is incorrect you should get 1 or $n$.

As you can see, correctly guessing $d_p$ leaks a factor of $n$.

This is the same for $d_q$.

Example

n=95580702933509662811256129990158655210667121276245053843875590334281563078868202152845967187641817281520364662600110239110410372520340630639373679599982371620736610194814723749147422221945978800055101110346161945811520158431287139909125886966214800526831490560384144156085296004816333892025839072729987354233
e=1817084480271067137841898198122075168542117135135738925285694555698012943264936112861815937200507849960517390660821911331068907250788900674614345400567411
m = 7516789928765 # random number
for dp in range(1000000):
   f = gmpy2.gcd(m-pow(m, e*dp, n), n)
   if f > 1:
       print(dp, f)
       break

# 187261 8275629468590614667884614599278593237258686111405345888268221129814081809682982742676180514534238891248302334619164139839173447495925780801832743975865311
p = 8275629468590614667884614599278593237258686111405345888268221129814081809682982742676180514534238891248302334619164139839173447495925780801832743975865311
q = n // p
p*q == n
# True

Fault attack on signatures
#

While we are talking about attacks on RSA CRT, let’s focus on the signing process. This process is identical to decryption but instead of applying it on a ciphertext, you apply it on the plaintext $m$ :

\[\begin{cases}s_p = m^{d_p} + k_1p\\ s_q = m^{d_q} + k_2q\end{cases} \Rightarrow s = m^d + k_3pq\]

Partial signatures $s_p$ and $s_q$ are calculated modulo $p$ and $q$ respectively, then the CRT is used to combine both and get the final signature $s$ modulo $n$.

In the beautiful world of mathematics everything goes well every time but that’s sadly not the case in the physical world in which our computers do the maths. Faults can occur during operations.

Without faults during the calculation of the partial signatures $s_p$ and $s_q$, we can do the following observations when verifying the final signature $s$ :

$$\begin{cases}s^e \equiv m \mod p\\s^e \equiv m \mod q\end{cases}$$ $$\begin{cases}s^e - m\equiv 0 \mod p\\s^e - m \equiv 0 \mod q\end{cases}$$ $$\begin{cases}s^e - m= k_1p \\s^e - m = k_2q\end{cases} \Rightarrow s^e - m=k_3pq$$

The verified signature $s^e$ is congruent with $m$ modulo $p$ or $q$. This means that $s^e - m$ is a multiple of $p$ and also a multiple of $q$. Which makes it a multiple of $n = pq$ and thus, $gcd(c^e - m, n) = n$.

Now imagine an error occurred on one or more bits during the calculation of $s_p$. We can observe something interesting :

$$\begin{cases}s^e \not\equiv m \mod p\\s^e \equiv m \mod q\end{cases}$$ $$\begin{cases}s^e - m\not\equiv 0 \mod p\\s^e - m \equiv 0 \mod q\end{cases}$$ $$\begin{cases}s^e - m\neq k_1p \\s^e - m = k_2q\end{cases} \Rightarrow s^e - m=k_3q$$

The verified signature $s^e$ is congruent with $m$ modulo $q$ but not $p$. This means that $s^e - m$ is a multiple of $q$ but not a multiple of $p$. Thus, $gcd(c^e - m, n) = q$.

This still holds if the fault occurred during the calculation of $s_q$.

This attack can only work if the attacker knows the plaintext $m$. This also means that no random padding is applied to the message before signing.

To counter this, the oracle providing the signatures should verify that they are correct before returning them to the client.

Example

Public key :

n = 95580702933509662811256129990158655210667121276245053843875590334281563078868202152845967187641817281520364662600110239110410372520340630639373679599982371620736610194814723749147422221945978800055101110346161945811520158431287139909125886966214800526831490560384144156085296004816333892025839072729987354233
e = 65537

The faulty signing oracle :

def sign(m, fault=False):
    # calculates the chinese remainder theorem
    def chinese_remainder(rests, modulus):
        from functools import reduce
        import gmpy2

        somme = 0
        prod = reduce(lambda a, b: a * b, modulus)
        for n_i, a_i in zip(modulus, rests):
            p = prod // n_i
            somme += a_i * gmpy2.invert(p, n_i) * p
        return int(somme % prod)

    p = 8275629468590614667884614599278593237258686111405345888268221129814081809682982742676180514534238891248302334619164139839173447495925780801832743975865311
    q = 11549659551128693002436136866773711894878948768217375488402486000729420418015407109031312764052545848546107463625253637067023356033871483957407205015515303
    d = 57028735322169588856806818301191294912838800116956991324855687761743930315287292808379020323456337353850356583329296590932364569581501742220294322190489492992620765832487790259098937609311778492158901396624684965933853351733692591081803839662026408989685392901358933693441460040324163121861582078844550186253
    dp = d%(p-1)
    dq = d%(q-1)
    sp = pow(m, dp, p)
    sq = pow(m, dq, q)
    if fault:
        sq += 1
    s = chinese_remainder([sp, sq], [p, q])
    return s
# choose a random plain text m
m = 1144062044873283369712584268016323072193574476507782486309670943958849941854399472458994589746495512277165641747356772963562804680891066322320937664179951311479733128512661066412624186426754952033049367311462347474835736278522943337011258766432477000856451551683475740137481467325450641265196463953873078111
sign(m) # normal signature without faults
# 48826985473506942429348825619193527048629734669181322592483292523132023904581030994948425823318125226258131999442356309995646838788180009251813528869344879150322977876831414188780557106363189817128883449429033827496020109623428084253152324191039943271424694994370469425175289337590930279996852887250146794419
sign(m, True) # faulty signature
# 81223362319896371239841653902911360675773590325559890740901029276412364400437137170775545707395510981182542610557390791827061449907959170362042190965970358905594174189820676645727085090713100254192329083437540099792321809556592051052868235975634208506437252243437261546756934453212754787883663263102660104405
pow(sign(m), e, n) == m # Good signature should return m when verified
# True
pow(sign(m, True), e, n) == m
# False
gmpy2.gcd(pow(sign(m), e, n)-m, n) == n # Good signature doesn't leak a factor
# True
gmpy2.gcd(pow(sign(m, True), e, n)-m, n) == n
# False
gmpy2.gcd(pow(sign(m, True), e, n)-m, n) == p # Because sq is faulty, p is leaked
# True

Twin primes
#

You already know that choosing primes that are too close to each other is bad because we can efficiently factor $n$ using Fermat’s factorization algorithm. But what if we choose them even closer ?

Twin primes are primes bigger then 2 that are the closest to each other :

$q = p+2$

The minimum distance between two primes is 2, because primes bigger than 2 must be odd. The only real consecutive primes are 2 and 3.

11 and 13 are twin primes for instance.

The original Fermat’s factorization algorithm becomes unusable in the case of RSA using large twin primes.

In the case of RSA, we’ll have :

$n = p*q$

$n = p(p+2)$

$n = p^2 + 2p$

$0 = p^2 + 2p - n$

This is a simple quadratic equation.

Let’s first calculate the discriminant :

$\triangle = b^2 - 4ac = 2^2 - 4*1*(-n) = 4+4n$

$n > 0$, so $\triangle > 0$, there are always 2 solutions to this equation :

\[s_1 = \frac{-b + \sqrt{\triangle}}{2a} = \frac{-2 + \sqrt{4 + 4n}}{2} = \frac{-2 + \sqrt{4(1 + n)}}{2} = \frac{-2 + 2\sqrt{1 + n}}{2} = \sqrt{n+1} - 1\]

and

\[s_2 = \frac{-b - \sqrt{\triangle}}{2a} = \frac{-2 - \sqrt{4 + 4n}}{2} = \frac{-2 - \sqrt{4(1 + n)}}{2} = \frac{-2 - 2\sqrt{1 + n}}{2} = -\sqrt{n+1} - 1\]

If we take the absolute value of both solutions we obtain the two prime factors of $n$.

Examples

n = 11*13
gmpy2.sqrt(n+1) - 1
# mpfr('11.0')
-gmpy2.sqrt(n+1) - 1
# mpfr('-13.0')
n = 881*883
abs(gmpy2.sqrt(n+1)-1)
# mpfr('881.0')
abs(-gmpy2.sqrt(n+1)-1)
# mpfr('883.0')

Conclusion
#

I hope this was an easy coming back to the topic. It’s much simpler math than in the previous parts but after such a long break it’s good to get back to basics first 🙂

In the next part we’ll try to move on to more complex attacks.

Attacking-RSA - This article is part of a series.
Part 4: This Article