Fast XOR in Python



As I [have hinted at before](/2017/09/20/just-some-friendly-advice/), the [PyCrypto library](https://www.dlitz.net/software/pycrypto/) [seems to be dead](https://github.com/dlitz/pycrypto/issues/173). The [PyCryptodome](https://www.pycryptodome.org/en/latest/) library is a fork that is promising because it is maintained and works in Python 3, but they have a bit of a finger-wagging attitude which sometimes means that you have to fight the library a bit: ``` >>> from Crypto.Cipher import ARC4 >>> cipher = ARC4.new(B'funk') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python37\lib\site-packages\Crypto\Cipher\ARC4.py", line 132, in new return ARC4Cipher(key, *args, **kwargs) File "C:\Python37\lib\site-packages\Crypto\Cipher\ARC4.py", line 57, in __init__ len(key)) ValueError: Incorrect ARC4 key length (4 bytes) >>> ARC4.key_size = range(1,257) >>> ARC4.new(B'funk').decrypt( ARC4.new(B'funk').encrypt( B'Hello World' )) b'Hello World' ``` They certainly mean well, but the library is no place to impose security standards, in my opinion. In malware research for example, we often have to verbatim copy the appalling use of certain ciphers, like ARC4 with a 4-byte key. It happens all the time! I have been particularly struggling with [the removal of the XOR cipher](https://pycryptodome.readthedocs.io/en/latest/src/vs_pycrypto.html). The XOR implementation of PyCrypto was very fast, and in this article I will both benchmark how fast exactly it was and give you a drop-in replacement which degrades gracefully based on your options. <span id="more-4643"></span> ### The Drop In Replacement Because this may be what most people care for, here it is: ``` from itertools import cycle, islice try: from Crypto.Cipher import XOR as PyCrypto_XOR_Factory except ImportError: PyCrypto_XOR_Factory = None try: from Crypto.Util.strxor import strxor def _xor(key, data): return strxor(data, bytes(bytearray(islice(cycle(key), len(data))))) except ImportError: def _xor(key, data): k = cycle(bytearray(key)) return bytes(bytearray(b^next(k) for b in bytearray(data))) class XOR_Cipher: def __init__(self, key): self.key = key def encrypt(self, ciphertext): return _xor(self.key, ciphertext) def decrypt(self, plaintext): return _xor(self.key, plaintext) class XOR: @staticmethod def new(key): try: return PyCrypto_XOR_Factory.new(key) except (AttributeError, ValueError): return XOR_Cipher(key) ``` This code gracefully falls back from PyCrypto's `XOR` to an implementation based on PyCryptodome's `strxor` and finally to my pure Python implementation. My implementation performes about 10 times worse than the original `XOR`, but I received other suggestions that managed to perform even worse. In case you're wondering, this code is Python 2 compatible. ### The Tests This was tested with Python 2, installing the packages `pycryptodomex` and `pycrypto` so both could be used at a time. I was simply unable to get this to work under Python 3, but feel very free to educate me in the comments. The hideous XOR implementations `xor_5` and `xor_6` are the most naive thing a battle-hardened C programmer could come up with, and they perform miserably. The variant `xor_4` is something you see a lot, for example it is used in the [xor_string](https://pypi.org/project/xor_string/) module. However, I had a gut feeling that `xor_3` would be slightly faster, and my tests do confirm that hypothesis. Finally, `xor_2` is my take on using the PyCryptodome `strxor` routine, which turns out to be roughly two times slower than PyCrypto's original `XOR`, which is somewhat unsurprising considering the additional Python code overhead. ``` from itertools import cycle, islice, izip from operator import xor from Crypto.Cipher import XOR from Cryptodome import Random from Cryptodome.Util import strxor import random import sys import time class timer(): def __init__(self): self.start = time.time() def __enter__(self): return self def __exit__(self, exc_type, exc_val, exc_tb): rt = time.time() - self.start sys.stdout.write('{:6.2f}'.format(rt)) sys.stdout.flush() def xor_1(data, key): return XOR.new(key).decrypt(data) def xor_2(data, key): return strxor.strxor(data, bytearray(islice(cycle(key),len(data)))) def xor_3(data, key): k = cycle(bytearray(key)) return bytearray(b^next(k) for b in bytearray(data)) def xor_4(data, key): return b''.join(chr(ord(c)^ord(k)) for c,k in izip(data, cycle(key))) def xor_5(data, key): d, n, i = [], len(key), 0 for byte in data: d.append( chr(ord(byte) ^ ord(key[i % n])) ) i += 1 return b''.join(d) def xor_6(data, key): d, n, i = b'', len(key), 0 for byte in data: d += chr(ord(byte) ^ ord(key[i % n])) i += 1 return d key = Random.get_random_bytes(23) rounds = 1 << 12 xors = [ xor_1, xor_2, xor_3, xor_4, xor_5, xor_6 ] sys.stdout.write(''.join('XOR{}'.format(x+1).rjust(6) for x in range(len(xors)))) print(' BUFFERSIZE EVALUATIONS') for n in range(8, 24, 5): buffer = Random.get_random_bytes(1 << n) for function in xors: with timer(): for k in range(rounds): function(buffer, key) sys.stdout.write(' 0x{:08X} {}\n'.format(1<<n, rounds)) rounds >>= 4 ``` Note that I _really_ wanted to test the [cxor](https://bitbucket.org/barbuza/cxor/) module because it looks to be _just_ the thing, but I gave up after about 30 minutes of trying to get it to compile.

2 Replies to “Fast XOR in Python”

  1. Oh yea, these are the results on my machine: ``` XOR1 XOR2 XOR3 XOR4 XOR5 XOR6 BUFFERSIZE EVALUATIONS 0.05 0.21 0.42 0.46 0.39 0.33 0x00000100 4096 0.02 0.03 0.24 0.37 0.48 0.59 0x00002000 256 0.03 0.05 0.41 0.75 1.16 1.15 0x00040000 16 0.07 0.12 0.85 1.74 2.11 5.63 0x00800000 1 ```

Leave a Reply

Your email address will not be published. Required fields are marked *