Skip to main content

Breaking Python's PRNG with a few values and no bruteforce

·8 mins·
Crypto
Table of Contents
StackeredSAS/python-random-playground

This repository contains code snippets and POCs associated to our article on breaking Python’s PRNG with a few values and no bruteforce.

Python
5
0

Background
#

As you may already know, it is possible to recover the Mersenne Twister seed knowing only two outputs of PHP’s mt_rand function, without any bruteforce.

Given that both Python and PHP employ the same PRNG, a natural question arises: can we apply this discovery to Python’s PRNG as well? Initially, a quick examination of Python’s implementation reveals a significantly different seeding mechanism, raising doubts about this possiblity. However, upon thorough investigation, we discovered an alternative method to leverage their findings and adapt them to Python’s PRNG.

I will only capitalize on the work presented at Stackered to illustrate how it can be applied to recover various seeds.

The random module allows seeding with different types of Python objects, representing either numbers or bytes. The minimal number of outputs required to recover the seed in each scenario is given below :

Seed TypeAdditional InfokNumber of required outputs
Int32-bit16
Int64-bit28
Int$n$-bit$\frac{n+31}{32}$$2k + 4^\text{*}$
Float28
BytesVersion 128
BytesVersion 2, $n$-chars$\frac{n+3}{4} + 16$$2(k-16) + 4^\text{*}$
NoneUsing time + PID412
NoneUsing system random624624

$^\text{*}$ At least for “small” values of k as after adding a lot of pairs more relations can be made between them, lowering the total number of required ones. For instance, for k = 624, only 624 outputs (the full state) are required.

Example seed recovery
#

All the code examples use Python 3.12 and are taken from the associated GitHub project.

Recovering a 32-bit seed
#

import random
from functions import *

SEED = 0x533D0
random.seed(SEED)
# k = 1
# K = [0x533D0]

# extract 624 outputs and compute the associated RNG state
S = [untemper(random.getrandbits(32)) for _ in range(624)]

The table indicates that we should be able to recover the seed with 6 outputs of the RNG. We will use outputs $S_0$, $S_1$, $S_2$, $S_{227}$, $S_{228}$ and $S_{229}$.

The first step is to recover part of the internal state $I$, namely $I_{228}$, $I_{229}$, and $I_{230}$ (except it’s MSB).

I_227_, I_228 = invertStep(S[0], S[227]) # MSB of I_227 and LSBs of I_228
I_228_, I_229 = invertStep(S[1], S[228]) # MSB of I_228 and LSBs of I_229
I_229_, I_230 = invertStep(S[2], S[229]) # MSB of I_229 and LSBs of I_230

# Combine MSB and LSB to get the whole value
I_228 += I_228_
I_229 += I_229_

With the knowledge of this part of the internal state, it’s now possible to recompute a single value of the seed array $K$ :

# two possibilities for I_230
seed1 = recover_Kj_from_Ii(I_230, I_229, I_228, 230)
seed2 = recover_Kj_from_Ii(I_230+0x80000000, I_229, I_228, 230)
# only the MSB differs
print(hex(seed1), hex(seed2))
# 0x533d0 0x800533d0

Because we can’t recover the MSB of $I_{230}$ with only 6 outputs, we have 2 possible seeds.

Recovering a 64-bit seed
#

import random
from functions import *

SEED = 0xDEADBEEF000533D0
random.seed(SEED)
# k = 2
# K = [0x533D0, 0xDEADBEEF]

# extract 624 outputs and compute the associated RNG state
S = [untemper(random.getrandbits(32)) for _ in range(624)]

The table indicates that we should be able to recover the seed with 8 outputs of the RNG. We will use outputs $S_0$, $S_1$, $S_2$, $S_3$, $S_{227}$, $S_{228}$, $S_{229}$ and $S_{230}$.

The first step is to recover part of the internal state $I$, namely $I_{228}$, $I_{229}$, $I_{230}$, and $I_{231}$ (except it’s MSB).

I_227_, I_228 = invertStep(S[0], S[227]) # MSB of I_227 and LSBs of I_228
I_228_, I_229 = invertStep(S[1], S[228]) # MSB of I_228 and LSBs of I_229
I_229_, I_230 = invertStep(S[2], S[229]) # MSB of I_229 and LSBs of I_230
I_230_, I_231 = invertStep(S[3], S[230]) # MSB of I_230 and LSBs of I_231

# Combine MSB and LSB to get the whole value
I_228 += I_228_
I_229 += I_229_
I_230 += I_230_

With the knowledge of this part of the internal state, it’s now possible to recompute two consecutive values of the seed array $K$ :

# K[1] + 1
seed_h = recover_Kj_from_Ii(I_230, I_229, I_228, 230) - 1
# K[0] + 0
# two possibilities for I_231
seed_l1 = recover_Kj_from_Ii(I_231, I_230, I_229, 231)
seed_l2 = recover_Kj_from_Ii(I_231+0x80000000, I_230, I_229, 231)

# reconstruct the original seed
seed1 = (seed_h << 32) + seed_l1
seed2 = (seed_h << 32) + seed_l2
# only the MSB of K[0] differs
print(hex(seed1), hex(seed2))
# 0xdeadbeef800533d0 0xdeadbeef000533d0

Because we can’t recover the MSB of $I_{231}$ with only 8 outputs, we have 2 possible seeds.

Recovering a Float seed
#

import random
from functions import *

SEED = 18726.12612
# We can't recover the original seed in this case, just the equivalent 64-bit integer
print(f"{hex(hash(SEED)) = }")
# hex(hash(SEED)) = '0x4092ccf6c004926'
random.seed(SEED)
# k = 2
# K = [0x6c004926, 0x4092ccf]

# extract 624 outputs and compute the associated RNG state
S = [untemper(random.getrandbits(32)) for _ in range(624)]

Because a float is internally converted to a 64-bit seed using python’s hash function, the process is exactly the same as before.

Recovering a bytes (V1) seed
#

import random
from functions import *

def V1(a):
    """
    Copy of Random.seed in the case of bytes and version 1.
    """
    a = a.decode('latin-1') if isinstance(a, bytes) else a
    x = ord(a[0]) << 7 if a else 0
    for c in map(ord, a):
        x = ((1000003 * x) ^ c) & 0xFFFFFFFFFFFFFFFF
    x ^= len(a)
    a = -2 if x == -1 else x
    return a

SEED = b"my seed"
# We can't recover the original seed in this case, just the equivalent 64-bit integer
print(f"{hex(V1(SEED)) = }")
random.seed(SEED, version=1)
# k = 2
# K = [0x4527321a, 0xe833f9ce]

# extract 624 outputs and compute the associated RNG state
S = [untemper(random.getrandbits(32)) for _ in range(624)]

Because a bytes (in version 1) is internally converted to a 64-bit seed using a function similar to V1, the process is exactly the same as before.

Recovering a bytes (V2) seed
#

import random
from functions import *

SEED = b"my seed"
random.seed(SEED)
# k = 18
# K = [0xc83476be, 0x9f313ec1, 0xfdb09e63, 0xf3827c68, 0x7814c985, 0xb1e9e888, 0xa924e2f8, 0x1fe1760b, 0xca754857, 0x67433568, 0x287bc567, 0xaba62218, 0xf1a54538, 0xba893a44, 0x41256723, 0xc046d021, 0x73656564, 0x6d7920]

# extract 624 outputs and compute the associated RNG state
S = [untemper(random.getrandbits(32)) for _ in range(624)]

The seed here is 7 bytes long. Let’s compute the values of the table :

n = 7
k = (n+3)//4 + 16
k
# 18
outputs = 2*(k-16)+4
outputs
# 8

We should be able to recover the seed with 8 outputs of the RNG. We will use outputs $S_3$, $S_4$, $S_5$, $S_6$, $S_{230}$, $S_{231}$, $S_{232}$ and $S_{233}$.

Warning! The indices need to be chosen carefully because the seed array $K$ does not only contain the seed, but also it’s SHA-256 hash. We need to ensure we recover the seed and not part of the hash.

The next step is to recover part of the internal state $I$, namely $I_{231}$, $I_{232}$, $I_{233}$, and $I_{234}$ (except it’s MSB).

# because the seed has only 7 chars it can be recovered using 8 carefully chosen outputs
I_230_, I_231 = invertStep(S[3], S[230])
I_231_, I_232 = invertStep(S[4], S[231])
I_232_, I_233 = invertStep(S[5], S[232])
I_233_, I_234 = invertStep(S[6], S[233])

I_231 += I_231_
I_232 += I_232_
I_233 += I_233_

With the knowledge of this part of the internal state, it’s now possible to recompute two consecutive values of the seed array $K$ :

# K[16] + 16
seed_l = recover_Kj_from_Ii(I_233, I_232, I_231, 233) - 16
# K[17] + 17
# two possibilities for I_234
seed_h1 = recover_Kj_from_Ii(I_234, I_233, I_232, 234) - 17
seed_h2 = recover_Kj_from_Ii(I_234+0x80000000, I_233, I_232, 234) - 17

seed1 = (seed_h1 << 32) + seed_l
seed2 = (seed_h2 << 32) + seed_l

# only the MSB of K[17] differs
print(bytes.fromhex(hex(seed1)[2:]))
# b'my seed'
print(bytes.fromhex(hex(seed2)[2:]))
# b'\x80my seed'

Because we can’t recover the MSB of $I_{234}$ with only 8 outputs, we have 2 possible seeds.

Recovering the default seed
#

By default the RNG is seeded using the OS’s CSPRNG by populating the seed array $K$ with 624 random 32-bit values.

import random
from functions import *

# random is not manually seeded so it uses 624 random values

# extract 624 outputs and compute the associated RNG state
S = [untemper(random.getrandbits(32)) for _ in range(624)]

The table indicates that we need 624 outputs of the RNG to be able to recover the seed. That is, we need the full state to be able to rewind back to the internal state.

I = rewindState(S)
seed_array = seedArrayFromState(I)
seed = seedArrayToInt(seed_array)

# the recovered seed is very big. Too big to be printed in decimal
# print(hex(seed))
# The recovered seed is not exactly the same, but is equivalent.
random.seed(seed)

S_ = [untemper(random.getrandbits(32)) for _ in range(624)]
assert(S_ == S)
Note: If you get an AssertionError, you probably need to reopen your Python interpreter, because the random module is still seeded from the last call to random.seed.