code – Page 2 – nullteilerfrei

Denied Permissions: copying text

born code, technology 2017-06-14

Obviously, my Bank does not provide a REST API to download the transactions happening on my accounts. After I asked for "machine parseable" data, they told me that I can download CSV files. Awesome! So I wrote a parser and lived happily ever after. Except that they change their CSV format without notice every few months and at some point they started to mix different encodings in the same file. So I lived unhapply, regularly fixing the script reading the CSV file. What did not change for around 10 years now are their banking statements. And this also holds for the PDFs you have to download, if you want to avoid getting them via snail mail (and paying for the postage of course). I decided to parse the PDFs instead and this went pretty well for may years... up to recently, when something changed (it may have been a software update of the parser I use or something on their side). Do you want to know more?

Code Rust in Windows

rattle code, technology 2017-02-22

I have started to [learn](http://rustbyexample.com/) [rust](http://rust-lang.org), and I am enjoying myself. This is a merge and update on the two fantastic blog posts on [how to setup Visual Studio Code for Rust](https://mobiarch.wordpress.com/2015/06/16/rust-using-visual-studio-code/) and [how to enable debugging](https://sherryummen.in/2016/09/02/debugging-rust-on-windows-using-visual-studio-code/). In my personal opinion, from among [all the available IDE solutions for rust](https://areweideyet.com/), this is the best. Do you want to know more?

Readable Git Annex Unused

rattle code, technology 2016-07-28

So unless you know and use [git annex](https://git-annex.branchable.com/), this is not going to be very useful for you. Check it out, though. It's pretty cool. Unless you are on Windows. In that case it's hell. Anyway, I wrote a script to help me figure out the output of [git annex unused](http://manpages.ubuntu.com/manpages/wily/man1/git-annex-unused.1.html). In short, it tells you what those files used to be called before you lost them. Script:

#!/usr/local/bin/python3.5
import re, sys, os
from subprocess import Popen,PIPE
FNULL = open(os.devnull, 'w')

def seeker(s):
  process = Popen(["git", "log", "--stat", "-S", s], stderr=FNULL, stdout=PIPE)
  log = process.stdout.read().decode("utf-8")
  match = re.search(r"(([ -~]*\/)*[ -~]*)\|", log)
  if not match: return ''
  else: return match.group(1).strip()

if len(sys.argv)>1:
  print( seeker(sys.argv[1]) )
else:
  clist = []
  while True:
    crawl = input().strip()
    if not crawl: break
    crawl = crawl.split()
    clist.append(crawl)
  for x in clist:
    print('%s : %s' % ( x[0], seeker(x[1]) ) )

You can either call it with one argument which should be a key, or if you call it with no argument, it expects you to paste the list of results you got from git annex unused into stdin. It then goes through the list and tells you the corresponding filename for each key. This script is phenomenally stupid in that it does quite a terrible regular expression search on the output of git log and returns the first match it finds. Sue me, it works pretty well at my end.

Binary Determinantal Complexity

rattle code, mathematics 2014-11-11

My good friend and colleague Christian Ikenmeyer and I wrote this cute preprint about polynomials and how they can be written as the determinant of a matrix with entries equal to zero, one and indeterminantes. Go ahead and read it if you know even just a little math, it's quite straightforward. The algorithm described in section 3 has been implemented and you can download the code from my website at the TU Berlin. Compilation instructions are in ptest.c, but you will need to get nauty to perform the entire computerized proof.

Switching the switching of Keys!

rattle code, technology 2014-11-03

I recently lamented about switching two keys on my new Lenovo Yoga. Big problem: In my office, I attach that notebook to a docking station and to that docking station I attach a keyboard. On that keyboard, all keys are precisely the way I want them to be. Therefore, I do not want to switch the Insert and End keys when I am docked. I ended up writing a little batch script based on this nice google code wiki entry for the registry update and this stackexchange answer to elevate the batch script:

@ECHO OFF
NET FILE 1>NUL 2>NUL
if '%ERRORLEVEL%' == '0' goto run 
powershell "saps -filepath %0 -verb runas" >nul 2>&1
goto eof
:run
REG QUERY "HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout" ^
 /v "Scancode Map" >nul 2>&1 
IF '%ERRORLEVEL%' == '0' goto remove
<nul set /p ="> adding scancode map "
REG ADD "HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout" ^
 /v "Scancode Map" /t REG_BINARY /f ^
 /d 00000000000000000300000052E04FE04FE052E000000000 >nul 2>&1 
IF '%ERRORLEVEL%' == '0' goto success
goto fail 
:remove
<nul set /p ="> removing scancode map "
REG DELETE "HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout" ^
 /v "Scancode Map" /f >nul 2>&1 
IF '%ERRORLEVEL%' == '0' goto success
goto fail
:fail 
echo failed.
pause
goto eof
:success
echo succeeded.
pause

Sadly, it always requires a reboot for the changes to take effect.

Bit Twiddling Hacks

rattle code, mathematics 2014-10-04

I recently implemented an algorithm that has to perform checks on all subsets of some large set. A subset of an $n$-sized set can be understood as a binary string of length $n$ where bit $i$ is set if and only if the $i$-th element is in the subset. During my search for code to enumerate such bitstrings, I found the greatest page in the entire internet. If anyone can explain to me how computing the next bit permutation (the last version) works, please do.

Simple WordPress Update Script

rattle code, technology 2014-08-07

I need to update this wordpress install every once in a while. There are lots of bash scripts on the internet that perform this task, and they are complicated beyond reason. This is what I use:

function cfg {  
    grep $2 $1/wp-config.php | awk 'BEGIN {FS="[, )\x27]*"}; {print $3;}'
}

echo "> backing up database."
mysqldump --user=$(cfg $1 DB_USER) \
          --password=$(cfg $1 DB_PASSWORD)  \
          --host=$(cfg $1 DB_HOST)          \
          $(cfg $1 DB_NAME) > backup.database.sql

echo "> backing up website."
tar -cjf backup.files.bz2 $1
    
echo "> retrieving latest wordpress."
wget -q https://wordpress.org/latest.zip
unzip -qq latest.zip

echo "> updating wordpress."
rm -r $1/wp-includes $1/wp-admin
cp -r wordpress/* $1/

echo "> cleaning up."
rm -r wordpress
rm latest.zip

It takes a single argument, which is the name of your wordpress root directory. It backups your database to the file backup.database.sql and backups the files to backup.files.bz2, then it simply proceeds as described in the wordpress codex for updating manual. I do not see what all the fuzz is about.

Member Variables in Python

rattle code 2014-07-27

Member variables in python are horrible. They are not visible in the layout of the class which is instantiated, but instead the __init__ function of a class creates certain member variables for the instance. I have never liked this about python, to be honest. For a recent project, I devised the following solution. Assume you would want this behaviour:

>>> class test(Base):
...     # Variables
...     number = 4
...     string = "hodor"
...     # Functions
...     def stringmult(self):
...         return self.number * self.string
...
>>> test().stringmult()
'hodorhodorhodorhodor'
>>> test(number=2).stringmult()
'hodorhodor'
>>> test(string="Na",number=8).stringmult() + " - Batman!"
'NaNaNaNaNaNaNaNa - Batman!'
>>> 
>>> test(end="Batman!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ctypes.ArgumentError
>>>

In other words, any class that inherits from Base can be constructed with keyword arguments who must match exactly the correct class variables which you specify. This one does it:

from ctypes import ArgumentError
class Base(object):
    def __init__(self, **kwargs):
        # "given" is the list of keyword arguments passed to the constructor
        # of this object, "needed" is the list of class variables which belong to  
        # the base class of the object which is being created, which do not end 
        # with two underscores and which are not a function. Trust me, we do not 
        # want to meddle with those.
        given = list(kwargs.keys())
        needed = [attr for attr in dir(self.__class__) if attr[-2:] != '__' \
             and type(self.__class__.__dict__[attr])!=type(lambda:0) ]

        # Check if keyword arguments have been provided which are not among the
        # required arguments and throw an exception if so. Remove this check for
        # a less restrictive base class. I wouldn't recommend it.
        if not set(given) <= set(needed):
            raise ArgumentError()

        # First, initialize the attribute dictionary of the object being created
        # with a list of default values, indicated by the values of the class 
        # variables. Then, update the attribute dictionary again with the values
        # provided to this constructor.
        self.__dict__.update({k: self.__class__.__dict__[k] for k in needed})
        self.__dict__.update(kwargs)

I personally like this approach a lot and hereby dare you to tell me even a single reason not to do this, in the comments.

Why someone thought that sudoku might not be boring, while actually you should learn how to properly implement backtracking

rattle code, technology 2014-07-03

It causes me unspeakable agony to see that my post about why sudoku is boring is one of the most frequented posts in this blog, mostly because most of my readers clearly disagree with the title. I recently received an email titled "why sudoku is not all that boring" by an old friend, and he taunted me that the sudoku

S = [
  0, 0, 0, 0, 6, 0, 0, 8, 0,
  0, 2, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 0, 0, 0, 0, 0,
  0, 7, 0, 0, 0, 0, 1, 0, 2,
  5, 0, 0, 0, 3, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 4, 0, 0,
  0, 0, 4, 2, 0, 1, 0, 0, 0,
  3, 0, 0, 7, 0, 0, 6, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 5, 0 ]

would take my 5-minute hack of a backtracking algorithm

real    51m3.656s
user    50m32.260s
sys     0m2.084s

to solve. So, it seems like some sudokus are really hard, even for a computer, right? Wrong wrong wrong wrong wrong. Read how to implement backtracking properly.

Python to Batch!

rattle code 2014-01-13

Nikolai asked me how to open a Python interpreter prompt from the console with a certain module already imported. For the record, the Python command line switches tell us that python -i -m module is the way to start the prompt and load module.py. That made me wonder whether I can stuff it all into one batch file, and I came up with the following test.bat:

REM = '''
@COPY %0.bat %0.py
@python %0.py
@DEL %0.py 
@goto :eof ::'''
del REM
for k in range(70):
    print(k)

That script will ignore the first line because it is a comment, then copy itself to test.py, then launch python with this argument. Afterwards, test.py is deleted and the script terminates without looking at any of the following lines. Note that :: is yet another way to comment in Batch. Python, however, will see a script where the variable REM is defined as a multi-line string and deleted right after that. After this little stub, you can put any python code you want. Well. I thought it was funky.

eMail obfuscation from several directions

rattle code, technology 2013-11-08

It would be beneficial to have your eMail-Address on your homepage when you want to give people the opportunity to contact you. In my case, I actually want to give people that opportunity ((That does not go without saying: In some cases it is very important to discourage people from contacting you directly. For instance, it could be that you are usually not the right person to ask, but you get flooded with requests that you have to manually redirect elsewhere: It would be better if people got discouraged from contacting you while finding it very easy to contact the person that can actually help them.)). However, we all know that there are villains out there who harvest emails from websites in order to sell them to spammers. Hence, you would want to obfuscate your eMail-Address in such a way that it cannot be read from the HTML source code of the website easily. It goes without saying that naive obfuscation techniques like mymail(at)domain(dot)com are beaten more easily by machines than by people, and the more creative you get with your substitutions, the easier the machines can do it and the less easy your customers can. Images are the end of that line of thought, the point where a user has no chance at all other than to copy your eMail address letter by letter from an image which could still theoretically be OCR'd by a decent program. Therefore, Javascript methods to encode and on-the-fly decode the eMail address ((Famous enodings are ROT13, Base64, etcetera.)) seemed like the best choice to me for a long while. It is harder to run a Javascript interpreter than simple regular expressions for the email harvester, but to a user there is no difference: The browser already comes equipped with Javascript. At this point, some people get cranky that everything should work also without Javascript, and I can understand that. However, the main reason for me to abandon the Javascript method was simple necessity: I have to publish my eMail on the homepage of a university, which is managed from within a complex CMS, and the system would not allow me to use Javascript. This makes a lot of sense, you don't want every user of a large webportal to be able to stick whatever Javascript they like into the page, that stuff can make everything go to hell. This blog post is one among many advertising the capability of unicode to display letters from right to left. However, seriously, how difficult do you think it is for an email harvester to check for reversed email addresses as well? I would suppose it's more or less just a matter of checking for the reversed regular expression, too. Easy prey, if you ask me. But I had an idea! Do you want to know what that idea was?

Building Perl stand-alone applications with a GUI using Mojolicious and PAR-Packer

tobi code, technology 2013-05-17

You might have come across the same problem I have faced pretty often: You want to write a small snippet of code for a friend who's not into programming to solve some task. You want to use the scripting language of your choice (yeah, [Perl](http://www.perl.org/)). But for many people, especially Windows users, explaining them how to [install perl](http://www.perl.org/get.html), install some modules from [CPAN](http://cpan.org), and finally how to use the script from the command line is tedious and often takes more time than writing it in the first place. And sometimes it even takes more time than solving the task by hand which is quite frustrating. So I always wanted to build stand-alone applications with a GUI for those cases. But building GUIs is usually a huge pain in the ass, so I always avoided it; until I got the idea to build web applications with [Mojolicious](http://mojolicio.us/) as GUI. Building stand alone executables without the need of installing perl, modules, or libraries can be solved with [PAR-Packer](http://search.cpan.org/dist/PAR-Packer/). So far, that was just a thought. A few days ago I got a small task: My brother wanted an application to automatically correct one kind of systemic error in large data sets. So I wanted to put that idea to the test. It worked out quite well! Do you want to know more?

My Quest for (CPU) Power

rattle code, technology 2013-05-11

This glab post is essentially about running a certain shell command remotely on multiple systems within a network that has been set up for public-key authentication. It's a standard task for experienced system administrators I am sure, but for me it was a little adventure in bash scripting — and I wanted to share it with you. Do you want to know more?