|
|
#1 |
|
User
|
I've been looking at how MPQs encrypt data. Most of you probably already know that if the file has an offset table [which they almost all do], you can recover the decryption key. That attack relies on knowing the first byte of the file, and doesn't work if you know, say, the 100th byte. Well, it turns out that if you know a few bytes anywhere in the file, you can still decrypt. Let's look at the decryption algorithm:
__________________Code:
//cypher (seed1, seed2, value): 1: seed2 += T[0x400 + (seed1 & 0xFF)] 2: value ^= seed1 + seed2 3: seed1 = (seed1<<21 + 0x11111111) | (seed1>>11) 4: seed2 = seed2*33 + value + 3 Pay close attention to line 3. Notice that it is the only line that changes seed1, and seed1 is the only variable on the right hand side, too. Most importantly, notice that the new value of seed1 will always have bits 1,5,11,15,19 and 23 set to true [because of that 111111 filling the space left by seed1<<21 being or'ed with something else]. The first time that line runs, seed1 loses at least 6 bits of entropy. After ~5 runs seed1 is limited to ~65 values, and cycles with period ~4 [there are a bunch of cycles]. So let's say we know 2 bytes in the middle of a file. At the first known byte we know seed1 can only take on ~65 different values, and we know encrypted_value = value xor (seed1 + seed2). This lets us recover 65 possible values for seed1,seed2. Then we apply the same thing to the next byte, and only keep the values that match, cutting us down to (almost definitely) 1 value. Now we know the cypher state and can decrypt all the following information. But wait! Half a file isn't very useful, and you're more likely to know the last 2 bytes of a file [eg. carriage return + line feed], which doesn't leave you much following data to decrypt. If we could only move backwards, we would be set. Well, good news, you can go backwards! [this is something good crypto never lets you do! ]If we want to go backwards, we need to reverse all the lines, and run them in reverse order. In line 1 you just change += to -=. Line 2 is its own inverse. Line 4 is a bit harder, it requires you to find the inverse of multiplication by 3 mod 2^32 [it's multiplication by 15748213419], but it's still reversible. The challenge is line 3. As I mentioned earlier, line 3 essentially erases some of the bits of seed1 [they become 1s]. That means it is, by definition, impossible to reverse, because the number of possible outputs is smaller than the number of possible inputs. But wait! Seed1 converges to small cycles after only ~5 iterations, and after that point we can invert the line by following the cycle backwards! That means all the lines are reversible, so we can move backwards over the file and decrypt from the end to the start. The only thing we can't decrypt are the first few bytes, because seed1 is unknown there. So our backwards decryption algorithm looks something like this: Code:
//backwards cypher (seed1, seed2, value):
4: seed2 = 15748213419*(seed2 - value - 3)
3:
x = seed1
for i = 1 to 100: //to go backwards in the cycle, you go forwards until the next value is seed1
y = (x<<21 + 0x11111111) | (x>>11)
if y == seed1:
seed1 = x
break
x = y
if i == 100: throw error("couldn't reverse!") //too close to start of file, seed1 wasn't in a cycle yet
2: value ^= seed1 + seed2
1: seed2 -= T[0x400 + (seed1 & 0xFF)]If we know any two specific bytes anywhere in the file, we can decrypt it. I'd say that's pretty broken. Last edited by Strilanc : 07-31-2008 at 12:59 AM. |
|
|
|
| Sponsored Links - Login to hide this ad! |
|
|
|
|
#2 |
|
User
Respected User
Join Date: Jul 2002
Posts: 181
![]()
|
Nice work. With that, and the assumption that decrypted files are going to be one of several file formats, you could probably write a decent automated tool for decrypting most any file. My only question -- is there any practical purpose? Aside from a few odd files that aren't used ever (in some patch executables and on the install disks... files that have unknown names) what good is this? The only real reason to put a file in a MPQ is to have one of Blizzard's games load it at some point. If the game can load it, we already have access to it. Regardless, I'm impressed. I might have to play around with this if I ever get some real free time.
|
|
|
|
|
|
#3 |
|
User
|
It's not particularly useful because almost all the files are compressed in blocks, so they have offset tables. I mainly did this as an academic thing, not a practical one.
__________________Watching what the game loads gets the majority of files, but none of the interesting unused files. |
|
|
|
|
|
#4 |
|
User
Join Date: May 2008
Posts: 239
|
Yes its useless because it only helps with single unit files which almost never come up =(
__________________Most of the time you can get an encrypted file with unknown filename because file blocks usually follow the block table, but if some tool moves the blocks away I dont think there is a way to extract such files. BTW simple analysis of the known files in the archive can give you almost all used files (except the ones used by the script in obfuscated form, like "Replaceable" + "Textures" ...) and a lot of unused files as well. |
|
|
|
|
|
#5 |
|
User
Respected User
|
well the Encryption can be used to avoid most deprotectors of working. The most famous deprotector is discontinued, so it pretty much helps map protection.
__________________Yet Encryption doesn't really do anything for an MPQ. |
|
|
|
|
|
#6 |
|
User
Join Date: May 2008
Posts: 239
|
It would stop only stupid deprotectors from working, if you really wanted to you could add/modify files in the archive without breaking files you can't decrypt, but anyway you can always get all the used files relatively easily. I hardly ever see a map that would obfuscate any filename in JASS and even those can be obtained manually if you try hard enough.
__________________But yes, I do shift around the files in maps I make because it stops a most popular MPQ editors from working. |
|
|
|
|
|
#7 |
|
User
Respected User
|
well indeed, as far as the game can read it, anyone can.
__________________ |
|
|
|
|
|
#8 |
|
User
|
I happened to look at this a bit more. I realized I could rewrite the algorithm to make it work left to right, which allowed my to easily find that every single possible key ends up in this pattern within six steps:
__________________Code:
:.Aaa.Bbb..:.Cc:.Dd:..::Ee::Ff:: becomes :.eEE.fFF..:.Aa:.Bb:..::Cc::Dd:: on is : off is . unknowns are letters, inversed by casing The modified code to update seed1 is: Code:
k = (((k rotateright 11) ^ 0xFFE00000) + 0x11000000) | 0x00111111
- or, after the first step has removed some carry possibilities in the addition -
k = k rotateright 11
k ^= 0xEEE00000
k = (k & ~0x22000000) | (~(k << 1) & 0x22000000)
k |= 0x00111111Given the pattern, it is obvious: - There is only 6 bits of entropy (instead of 32), meaning 64 possible values instead of 4 billion~ - The cycles all have length <= 6 A lot weaker than I thought. Last edited by Strilanc : 09-21-2010 at 12:02 PM. |
|
|
|
![]() |
| Thread Tools | Search this Thread |
|
|
|
Donate |