Channel: Why can't one implement bcrypt in Cuda? - Cryptography Stack Exchange

Answer by Thomas Pornin for Why can't one implement bcrypt in Cuda?


It is not impossible, only harder. This is because of RAM. In a GPU, you have a number of cores which can do 32-bit operations. Each core will run at one operation per cycle, as long as it operates on its own registers. RAM access, however, is more troublesome. Each group of cores has access to a small amount of shared RAM, and all cores can read and write the GPU main RAM, but there are access restrictions: not all cores can read from or write to RAM simultaneously (constraints are stricter for main RAM).

Now bcrypt is a variant of the Blowfish key schedule, which is defined over a table of a few kilobytes (four S-boxes of 256 32-bit words each, plus a small subkey array) which is constantly accessed and modified throughout the algorithm. Due to the size of the table, each core will have to store it in the GPU main RAM, and the cores will compete for usage of the memory bus. So bcrypt will run, but not with full parallelism. At any time, most cores will be stalled, waiting for the memory bus to become free. This comes from the type of elementary operation bcrypt consists of, not from the fact that bcrypt is derived from the key schedule of a block cipher.
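To make the memory pressure concrete, here is a minimal C sketch of the Blowfish state layout and of its round function F, which performs four table lookups at indices taken from the data being processed. The names blowfish_state, fill_demo_state and states_that_fit are hypothetical, and the table contents are placeholders (real Blowfish initializes the tables from the digits of pi, and bcrypt then rewrites them from the password and salt). The shared RAM size passed to states_that_fit is an assumption to be replaced with the figure for your own GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-instance Blowfish state: 18 subkeys plus four 256-entry S-boxes. */
struct blowfish_state {
    uint32_t p[18];
    uint32_t s[4][256];
};

/* 18*4 + 4*256*4 = 4168 bytes that every running bcrypt instance needs. */
static_assert(sizeof(struct blowfish_state) == 4168, "unexpected padding");

/* Placeholder contents for illustration only (NOT the real Blowfish tables). */
void fill_demo_state(struct blowfish_state *st)
{
    for (int i = 0; i < 18; i++)
        st->p[i] = (uint32_t)i;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 256; j++)
            st->s[i][j] = (uint32_t)(i * 256 + j);
}

/* Blowfish F function: each byte of x selects an entry in one S-box.
   The four addresses depend on the data, so on a GPU these become
   scattered memory reads rather than pure register arithmetic. */
uint32_t blowfish_f(const struct blowfish_state *st, uint32_t x)
{
    uint32_t a = x >> 24;
    uint32_t b = (x >> 16) & 0xff;
    uint32_t c = (x >> 8) & 0xff;
    uint32_t d = x & 0xff;
    return ((st->s[0][a] + st->s[1][b]) ^ st->s[2][c]) + st->s[3][d];
}

/* How many independent states fit in a given amount of fast shared RAM
   (shared_bytes is an assumed, device-specific figure). */
size_t states_that_fit(size_t shared_bytes)
{
    return shared_bytes / sizeof(struct blowfish_state);
}
```

For instance, if a group of cores had 16 KB of shared RAM (an assumed figure), states_that_fit(16384) gives 3, so only three bcrypt instances could keep their tables in fast memory, far fewer than the number of threads needed to keep the cores busy.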

For SHA-1 or SHA-256, computation consists entirely of 32-bit operations on a handful of registers, so a password cracker will run without doing any memory access at all, and full parallelism is easily achieved (I did it on my GeForce 9800 GTX+, and I got about 98% of the theoretical maximum speed with a straightforward unrolled SHA-1 implementation).
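By way of contrast, the following is a sketch in plain C of the SHA-1 compression function for a single 64-byte block; it is not the implementation mentioned above. The function names sha1_init and sha1_compress are hypothetical. The working state is five 32-bit variables plus a 16-word rolling window for the message schedule, and there are no data-dependent table lookups. Once the loop is fully unrolled, every index into w is a compile-time constant, so a compiler can keep the whole thing in registers.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Load the standard SHA-1 initial hash value. */
void sha1_init(uint32_t h[5])
{
    h[0] = 0x67452301u; h[1] = 0xEFCDAB89u; h[2] = 0x98BADCFEu;
    h[3] = 0x10325476u; h[4] = 0xC3D2E1F0u;
}

/* Process one already-padded block given as 16 big-endian words. */
void sha1_compress(uint32_t h[5], const uint32_t block[16])
{
    uint32_t w[16];  /* rolling message schedule window */
    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];

    for (int t = 0; t < 80; t++) {
        uint32_t wt, f, k;

        if (t < 16)
            wt = block[t];
        else
            wt = rotl32(w[(t - 3) & 15] ^ w[(t - 8) & 15]
                      ^ w[(t - 14) & 15] ^ w[t & 15], 1);
        w[t & 15] = wt;

        if (t < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
        else if (t < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
        else if (t < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }

        uint32_t tmp = rotl32(a, 5) + f + e + k + wt;
        e = d;
        d = c;
        c = rotl32(b, 30);
        b = a;
        a = tmp;
    }

    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}
```

Every operation here is an add, rotate, or bitwise operation on 32-bit values, and the only indexing is by the round counter, never by the data. That is the property that lets each GPU core work from its own registers without touching the memory bus.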

For details on the programming model in CUDA, have a look at the CUDA C Programming Guide. Also, the author of bcrypt now proposes scrypt (edit: actually that's not the same person; the author of scrypt is Colin Percival, while bcrypt was designed by Niels Provos and David Mazières), which is even heavier on memory accesses, precisely so that implementation is hard on GPU and FPGA.

