When you write a lot of Python that is intended to mimic arithmetic that was taken from C or assembly code, one giant advantage of Python sometimes becomes a burden: The integers are arbitrary precision, and the "built-in" wrapping of fixed-width integer types is something that has to be added manually, sometimes a truly arduous process. I recently had a thought about how this can be done in a slightly less agonizing way. This is a prototype and might require some more extension in the future, but I am pretty happy with it already. The idea is to implement a decorator that will use Python's ast module to add a bitmask to each arithmetic operation that occurs in the code of a decorated function. Here's the code:
import ast
import functools
import inspect


def masked(mask: int):
    '''
    Convert arithmetic operations that occur within the decorated function body in such a way that
    the result is reduced using the given bitmask. All additions, subtractions, multiplications,
    left shifts, and taking powers is augmented by introducing a bitwise and with the given mask.
    '''
    def decorator(function):
        code = inspect.getsource(function)
        tree = ast.parse(code, mode='exec')

        class Postprocessor(ast.NodeTransformer):
            name = None 
            def visit_BinOp(self, node: ast.BinOp):
                node = self.generic_visit(node)
                if not isinstance(node.op, (ast.Add, ast.Mult, ast.Sub, ast.LShift, ast.Pow)):
                    return node
                return ast.BinOp(node, ast.BitAnd(), ast.Constant(mask))
            def visit_FunctionDef(self, node: ast.FunctionDef):
                node = self.generic_visit(node)
                if self.name is None:
                    node.name = self.name = F'__wrapped_{node.name}'
                    for k in range(len(node.decorator_list)):
                        if node.decorator_list[k].func.id == masked.__name__:
                            del node.decorator_list[:k + 1]
                            break
                return node

        pp = Postprocessor()
        fixed = ast.fix_missing_locations(pp.visit(tree))
        eval(compile(fixed, function.__code__.co_filename, 'exec'))
        return functools.wraps(function)(eval(pp.name))

    return decorator
With this decorator, you can now write:
@masked(0xFFFF)
def test(x: int) -> int:
    return x * 0xBAAD
This will leave all code exactly as it is except for the binary operations involving addition, multiplication, subtraction, left shift, and computing powers - all of these are converted to having one additional bitwise and operation on top.


This blag post covers scanning the Ghidra virtual memory with YARA. Do you want to know more?


The official documentation on ArangoDB usage in Symfony is [literally from 2013](https://www.arangodb.com/2013/03/getting-started-with-arangodb-and-symfony-part1/) and hence targeting a pretty old Symfony version. This blag post will cover adding minimal ArangoDB support by hand. The actual database protocol and interacting will be handled by [ArangoDBClient](https://github.com/arangodb/arangodb-php). Show me what you've got!


I recently played around with scraping public Telegram channels and finally want to do something with the data. In particular, I wanted to play around with Named Entity Recognition (NER) and Sentiment Analysis on the content of the Telegram posts. I think this will probably work much better when you do it for each language separately. So before we answer the question "what's this post about?" or "how angry is the person in the post?", let's answer the question "what language is the post in?" first. And we'll do it while warming up with that hole machine learning (ML) topic. Be aware: I have no idea what I'm doing. Do you want to know more?


I was writing some Python code that uses [ctypes][] for interfacing with the [Windows ToolHelp API](https://docs.microsoft.com/en-us/windows/win32/api/_toolhelp/). Specifically, I had to [define a Python equivalent][structs] for the [PROCESSENTRY32](https://docs.microsoft.com/en-us/windows/win32/api/tlhelp32/ns-tlhelp32-processentry32) struct. Now, the [ctypes][] [struct definitions][structs] are a little annoying because they do not give you type hints. You can add the type hints _on top_ of the _fields_ attribute but that looks a little silly because you _literally_ [write everything twice][wet]. In Python 3.7+ ((Starting in Python 3.7, dictionaries keep the insertion order.)), you can solve this using a metaclass, and one stigma of metaclasses is that they have very few applications, so I decided to blog about this one:
class FieldsFromTypeHints(type(ctypes.Structure)):
    def __new__(cls, name, bases, namespace):
        from typing import get_type_hints
        class AnnotationDummy:
            __annotations__ = namespace.get('__annotations__', {})
        annotations = get_type_hints(AnnotationDummy)
        namespace['_fields_'] = list(annotations.items())
        return type(ctypes.Structure).__new__(cls, name, bases, namespace)
and now you can write:
class PROCESSENTRY32(ctypes.Structure, metaclass=FieldsFromTypeHints):
    dwSize              : ctypes.c_uint32
    cntUsage            : ctypes.c_uint32
    th32ProcessID       : ctypes.c_uint32
    th32DefaultHeapID   : ctypes.POINTER(ctypes.c_ulong)
    th32ModuleID        : ctypes.c_uint32
    cntThreads          : ctypes.c_uint32
    th32ParentProcessID : ctypes.c_uint32
    pcPriClassBase      : ctypes.c_long
    dwFlags             : ctypes.c_uint32
    szExeFile           : ctypes.c_char * 260
The metaclass simply deduces the _fields_ attribute of the new class for you before the class is _even created_. It might make sense to think of metaclasses as "class creation hooks", i.e. you get to modify what you have written before the class is actually being defined. The following bit is just to be compatible with [PEP-563](https://www.python.org/dev/peps/pep-0563/):
        from typing import get_type_hints
        class AnnotationDummy:
            __annotations__ = namespace.get('__annotations__', {})
        annotations = get_type_hints(AnnotationDummy)
And then, the following is the actual magic line:
        namespace['_fields_'] = list(annotations.items())
This adds the attribute _fields_ before the class is created. [ctypes]: https://docs.python.org/3/library/ctypes.html [structs]: https://docs.python.org/3/library/ctypes.html#structures-and-unions [wet]: https://en.wikipedia.org/wiki/Don%27t_repeat_yourself Do you want to know what I was coding?


I started learning JUnit 4 and encountered the message "1 of 4 branches missed." when testing a code fragment like x && y where x and y depend on each other in a particular way. Here is my journey and a "solution". You want to know more? !


I often require a wrapping integer type in Python, by which I actually mean a subclass of int where all operations are performed modulo some constant number $N$. There are two main use cases for this: 1. Working in a finite field for some cryptographic stuff, or solving problems on [Project Euler](https://projecteuler.net/about). 2. Having Python integers behave like machine registers (8, 16, 32, 64, even 128 bits - you name it.) I decided to solve this once and for all and wrote the integer wrapper class to end all integer wrapper classes. I also managed to keep it rather compact by using *»gasp«* metaclassing. Do you want to know more?


As I [have hinted at before](/2017/09/20/just-some-friendly-advice/), the [PyCrypto library](https://www.dlitz.net/software/pycrypto/) [seems to be dead](https://github.com/dlitz/pycrypto/issues/173). The [PyCryptodome](https://www.pycryptodome.org/en/latest/) library is a fork that is promising because it is maintained and works in Python 3, but they have a bit of a finger-wagging attitude which sometimes means that you have to fight the library a bit:
>>> from Crypto.Cipher import ARC4
>>> cipher = ARC4.new(B'funk')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python37\lib\site-packages\Crypto\Cipher\ARC4.py", line 132, in new
    return ARC4Cipher(key, *args, **kwargs)
  File "C:\Python37\lib\site-packages\Crypto\Cipher\ARC4.py", line 57, in __init__
    len(key))
ValueError: Incorrect ARC4 key length (4 bytes)
>>> ARC4.key_size = range(1,257)
>>> ARC4.new(B'funk').decrypt( ARC4.new(B'funk').encrypt( B'Hello World' ))
b'Hello World'
They certainly mean well, but the library is no place to impose security standards, in my opinion. In malware research for example, we often have to verbatim copy the appalling use of certain ciphers, like ARC4 with a 4-byte key. It happens all the time! I have been particularly struggling with [the removal of the XOR cipher](https://pycryptodome.readthedocs.io/en/latest/src/vs_pycrypto.html). The XOR implementation of PyCrypto was very fast, and in this article I will both benchmark how fast exactly it was and give you a drop-in replacement which degrades gracefully based on your options. Do you want to know more?


So you develop in [Microsoft Visual Studio Community Edition](https://www.visualstudio.com/de/vs/community/) and you long for the old days when there was a way to get the [MSDN Library](https://msdn.microsoft.com/en-us/library/) as an offline help file? Fear not, you still can. Open Visual Studio, type Ctrl+Q to open the quick access bar, usually located in the upper right corner of your interface. Enter Help Viewer, it should yield one result by that name, marked as an *"individual component"*. Selecting that entry should allow you to download and install the Help Viewer. Now relaunch Visual Studio and start the Help Viewer via quick access in the same way. You will be prompted whether you want to download some *content* - and I bet you do.


Spoiler: My main point in this post is not given away by the title. But first things first: What are all those words? Would you like to know more?


Part of me wants to write about all the [horror](https://www.bleepingcomputer.com/news/security/ten-malicious-libraries-found-on-pypi-python-package-index/) and [glory](https://crates.io) there is to be seen in package management, but quite frankly it'll take too long. Instead, I will just leave you with a tiny piece of advice. Here comes. If you are on Windows and you want to install a *legitimate* Python package (like for example [PyCryptodome](https://pypi.python.org/pypi/pycryptodome), because naturally you are **fully aware** that [PyCrypto is dead](https://github.com/dlitz/pycrypto/issues/173).), which in reality is a bottomless pit, at the center of which there is a C library, straight from hell - then maybe get the [Microsoft Compiler for Python](http://www.microsoft.com/en-us/download/details.aspx?id=44266) instead of, who knows, wasting hours or even days looking for a less reasonable solution. Credit, as so very often, [goes to stackoverflow](https://stackoverflow.com/a/27327236/1578458).


Github has a history of not giving a frack what their users want ((https://github.com/dear-github/dear-github)) ((https://github.com/isaacs/github/issues)). For example, a few developer friends of mine were reluctant to click any links in their notifications-page, since after they clicked the link, the notification was marked as read and you might lose track of it, if you just close the browser tab ((on a side note, tellmewhenitcloses.com is pretty handy to avoid too many notifications in the first place)). So David Badura ((https://github.com/DavidBadura)) and I decided to fix this problem by writing a browser extension. The result can be found on Github: https://github.com/larsborn/GithubToDos. After installation (also possible in Opera btw, the best browser there is), the extension injects an "Add ToDo" button on every issue page and pull request. When clicking, it, the URL gets saved to the local storage of the browser ((Using local storage is handy for people that are not very concerned about privacy and just use the cloud synchronization feature of their browser: the content of their ToDo list will then also just be synced to all their devices.)). The list of all URLs added like this can then be access through a new button in the header toolbar of github. You can clone the project from github and add it as an "Unpacked extension" or just head to the Github ToDos on the Chrome Store and just install it from there. Pull requests are welcome, open an issue, if you find a bug, open source yadayada.


Obviously, my Bank does not provide a REST API to download the transactions happening on my accounts. After I asked for "machine parseable" data, they told me that I can download CSV files. Awesome! So I wrote a parser and lived happily ever after. Except that they change their CSV format without notice every few months and at some point they started to mix different encodings in the same file. So I lived unhapply, regularly fixing the script reading the CSV file. What did not change for around 10 years now are their banking statements. And this also holds for the PDFs you have to download, if you want to avoid getting them via snail mail (and paying for the postage of course). I decided to parse the PDFs instead and this went pretty well for may years... up to recently, when something changed (it may have been a software update of the parser I use or something on their side). Do you want to know more?