Posts

Attacking RSA for fun and CTF points – part 4

It’s been a long time for both of us since part 3 of this series. So to quietly resume our journey in the beautiful world of mathematics I propose you 4 rather simple topics :

  • Multi-prime RSA
  • Brute force attack on small secret CRT-Exponents
  • Fault attack on signatures
  • Twin primes

Multi-prime RSA

In part 1 we saw the basics of RSA and how to generate a key pair using 2 prime numbers, but did you know that it’s theoretically possible to use more than 2 primes to generate a key pair ?

By constructing a key using k primes, you would have :

    \[n = \prod_{i=1}^k p_i$\]

    \[\varphi(n) = \prod_{i=1}^k (p_i-1)\]

For the rest, it’s the same as when using 2 primes.

Example

>>> n = 13*17*89*101
>>> e = 23
>>> phi = 12*16*88*100
>>> d = gmpy2.invert(e,phi)
>>> d
587687
>>> m = 31337
>>> c = pow(m,e,n)
>>> c
612133
>>> m == pow(c, d, n)
True

You might wonder, why would anyone want to do that ? Wouldn’t it make it less secure because the primes would be smaller for the same key size ?

While it’s correct that the factors will be smaller if you use more primes for a given key size, it’s not a major security issue as long as they don’t become too small. If the individual primes can be recovered using known factorization algorithm like ECM, then it’s an obvious bad idea. But otherwise, as long as the total size of the modulus is large enough, factorization is not a threat. The more primes you want to use for your key pair, the larger the modulus should be.

Using more than 2 primes offers performance improvements thanks to the Chinese Remainder Theorem (CRT). The speedup is about k^2 with k the number of factors used.

Brute force attack on small secret CRT-Exponents

We’ve seen that the CRT can be used to speedup RSA encryption and decryption. In this case we are interested in the decryption process.

In order to decrypt RSA using the CRT, you have to compute :

    \[\begin{cases}d_p \equiv e^{-1} [p-1]\\d_q \equiv e^{-1} [q-1]\end{cases}\]

If the value of e is poorly chosen it’s possible that d_p or d_q is so small that you can find it with pure brute force.

How do we know if we have found the correct value for d_p or d_q though ? And what will we use it for ?

From the definition of d_p:

d_p * e \equiv 1 [p-1]

d_p * e = 1 + k*(p-1)

Given any number m with gcd(m, p) = 1 :

m^{e*d_p} = m^{1 + k*(p-1)} = m*m^{(p-1)^k}

From Fermat’s little theorem :

m*m^{(p-1)^k} \equiv m * 1^k \equiv m [p]

So we have :

m^{e*d_p} \equiv m [p]

Or rewritten without the modulo :

m - m^{e*d_p} = k*p

As an attacker, you know e and n and you guess d_p. To verify your guess, you can choose any number m < n and compute :

gcd(m - m^{e*d_p}, n) = ?

For a correct value of d_p, this is the same as :

gcd(k*p, p*q) = p

If your guess is incorrect you should get 1 or n.

As you can see, correctly guessing d_p leaks a factor of n.

This is the same for d_q.

Example

>>> n=95580702933509662811256129990158655210667121276245053843875590334281563078868202152845967187641817281520364662600110239110410372520340630639373679599982371620736610194814723749147422221945978800055101110346161945811520158431287139909125886966214800526831490560384144156085296004816333892025839072729987354233
>>> e=1817084480271067137841898198122075168542117135135738925285694555698012943264936112861815937200507849960517390660821911331068907250788900674614345400567411
>>> m = 7516789928765 # random number
>>> for dp in range(1000000):
...    f = gmpy2.gcd(m-pow(m, e*dp, n), n)
...    if f > 1:
...        print(dp, f)
...        break
...
187261 8275629468590614667884614599278593237258686111405345888268221129814081809682982742676180514534238891248302334619164139839173447495925780801832743975865311
>>> p = 8275629468590614667884614599278593237258686111405345888268221129814081809682982742676180514534238891248302334619164139839173447495925780801832743975865311
>>> q = n // p
>>> p*q == n
True

Fault attack on signatures

While we are talking about attacks on RSA CRT, let’s focus on the signing process. This process is identical to decryption but instead of applying it on a cipher text, you apply it on the plain text m :

    \[\begin{cases}s_p = m^{d_p} + k_1p\\ s_q = m^{d_q} + k_2q\end{cases} \Rightarrow s = m^d + k_3pq\]

Partial signatures s_p and s_q are calculated modulo p and q respectively, then the CRT is used to combine both and get the final signature s modulo n.

In the beautiful world of mathematics everything goes well every time but that’s sadly not the case in the physical world in which our computers do the maths. Faults can occur during operations.

Without faults during the calculation of the partial signatures s_p and s_q, we can do the following observations when verifying the final signature s :

    \[\begin{cases}s^e \equiv m [p]\\s^e \equiv m [q]\end{cases} \Rightarrow \begin{cases}s^e - m\equiv 0 [p]\\s^e - m \equiv 0 [q]\end{cases} \Rightarrow \begin{cases}s^e - m= k_1p \\s^e - m =  k_2q\end{cases} \Rightarrow s^e - m=k_3pq\]

The verified signature s^e is congruent with m modulo p or q. This means that s^e - m is a multiple of p and also a multiple of q. Which makes it a multiple of n = pq and thus, gcd(c^e - m, n) = n

Now imagine an error occurred on one or more bits during the calculation of s_p. We can observe something interesting :

    \[\begin{cases}s^e \not\equiv m [p]\\s^e \equiv m [q]\end{cases} \Rightarrow \begin{cases}s^e - m\not\equiv 0 [p]\\s^e - m \equiv 0 [q]\end{cases} \Rightarrow \begin{cases}s^e - m\neq k_1p \\s^e - m =  k_2q\end{cases} \Rightarrow s^e - m=k_3q\]

The verified signature s^e is congruent with m modulo q but not p. This means that s^e - m is a multiple of q but not a multiple of p. Thus, gcd(c^e - m, n) = q

This still holds if the fault occurred during the calculation of s_q

This attack can only work if the attacker knows the plain text m. This also means that no random padding is applied to the message before signing.

To counter this, the oracle providing the signatures should verify that they are correct before returning them to the client.

Example

Public key :

n = 95580702933509662811256129990158655210667121276245053843875590334281563078868202152845967187641817281520364662600110239110410372520340630639373679599982371620736610194814723749147422221945978800055101110346161945811520158431287139909125886966214800526831490560384144156085296004816333892025839072729987354233
e = 65537

The faulty signing oracle :

def sign(m, fault=False):
    # calculates the chinese remainder theorem
    def chinese_remainder(rests, modulus):
        from functools import reduce
        import gmpy2

        somme = 0
        prod = reduce(lambda a, b: a * b, modulus)
        for n_i, a_i in zip(modulus, rests):
            p = prod // n_i
            somme += a_i * gmpy2.invert(p, n_i) * p
        return int(somme % prod)
    
    p = 8275629468590614667884614599278593237258686111405345888268221129814081809682982742676180514534238891248302334619164139839173447495925780801832743975865311
    q = 11549659551128693002436136866773711894878948768217375488402486000729420418015407109031312764052545848546107463625253637067023356033871483957407205015515303
    d = 57028735322169588856806818301191294912838800116956991324855687761743930315287292808379020323456337353850356583329296590932364569581501742220294322190489492992620765832487790259098937609311778492158901396624684965933853351733692591081803839662026408989685392901358933693441460040324163121861582078844550186253
    dp = d%(p-1)
    dq = d%(q-1)
    sp = pow(m, dp, p)
    sq = pow(m, dq, q)
    if fault:
        sq += 1
    s = chinese_remainder([sp, sq], [p, q])
    return s
>>> # choose a random plain text m
>>> m = 1144062044873283369712584268016323072193574476507782486309670943958849941854399472458994589746495512277165641747356772963562804680891066322320937664179951311479733128512661066412624186426754952033049367311462347474835736278522943337011258766432477000856451551683475740137481467325450641265196463953873078111
>>> sign(m) # normal signature without faults
48826985473506942429348825619193527048629734669181322592483292523132023904581030994948425823318125226258131999442356309995646838788180009251813528869344879150322977876831414188780557106363189817128883449429033827496020109623428084253152324191039943271424694994370469425175289337590930279996852887250146794419
>>> sign(m, True) # faulty signature
81223362319896371239841653902911360675773590325559890740901029276412364400437137170775545707395510981182542610557390791827061449907959170362042190965970358905594174189820676645727085090713100254192329083437540099792321809556592051052868235975634208506437252243437261546756934453212754787883663263102660104405
>>> pow(sign(m), e, n) == m # Good signature should return m when verified
True
>>> pow(sign(m, True), e, n) == m
False
>>> gmpy2.gcd(pow(sign(m), e, n)-m, n) == n # Good signature doesn't leak a factor
True
>>> gmpy2.gcd(pow(sign(m, True), e, n)-m, n) == n
False
>>> gmpy2.gcd(pow(sign(m, True), e, n)-m, n) == p # Because sq is faulty, p is leaked
True

Twin primes

You already know that choosing primes that are too close to each other is bad because we can efficiently factor n using Fermat’s factorization algorithm. But what if we choose them even closer ?

Twin primes are primes bigger then 2 that are the closest to each other :

q = p+2

The minimum distance between two primes is 2, because primes bigger than 2 must be odd. The only real consecutive primes are 2 and 3.

11 and 13 are twin primes for instance.

The original Fermat’s factorization algorithm becomes unusable in the case of RSA using large twin primes.

In the case of RSA, we’ll have :

n = p*q

n = p(p+2)

n = p^2 + 2p

0 = p^2 + 2p - n

This is a simple quadratic equation.

Let’s first calculate the discriminant :

\triangle = b^2 - 4ac = 2^2 - 4*1*(-n) = 4+4n

n > 0, so \triangle > 0, there are always 2 solutions to this equation :

    \[s_1 = \frac{-b + \sqrt{\triangle}}{2a} = \frac{-2 + \sqrt{4 + 4n}}{2} = \frac{-2 + \sqrt{4(1 + n)}}{2} = \frac{-2 + 2\sqrt{1 + n}}{2} = \sqrt{n+1} - 1\]

and

    \[s_2 = \frac{-b - \sqrt{\triangle}}{2a} = \frac{-2 - \sqrt{4 + 4n}}{2} = \frac{-2 - \sqrt{4(1 + n)}}{2} = \frac{-2 - 2\sqrt{1 + n}}{2} = -\sqrt{n+1} - 1\]

If we take the absolute value of both solutions we obtain the two prime factors of n.

Examples

>>> n = 11*13
>>> gmpy2.sqrt(n+1) - 1
mpfr('11.0')
>>> -gmpy2.sqrt(n+1) - 1
mpfr('-13.0')
>>> n = 881*883
>>> abs(gmpy2.sqrt(n+1)-1)
mpfr('881.0')
>>> abs(-gmpy2.sqrt(n+1)-1)
mpfr('883.0')

Conclusion

I hope this was an easy coming back to the topic. It’s much simpler math than in the previous parts but after such a long break it’s good to get back to basics first 🙂

In the next part we’ll try to move on to more complex attacks.

3 thoughts on “Attacking RSA for fun and CTF points – part 4

  1. Thanks your RSA series very interesting and useful. How do you learn this solution? I mean you read a special book or something else?

    1. Thanks ! I mainly learn from CTF tasks and write-ups or by reading about RSA online (research papers, other blogs, …).

  2. Thanks for the series. Clear explanation with just the right amount of info. Hope you continue! (this series and/or other topics)

Leave a Reply

Your email address will not be published. Required fields are marked *