So unless you know and use [git annex](, this is not going to be very useful for you. Check it out, though. It's pretty cool. Unless you are on Windows. In that case it's hell. Anyway, I wrote a script to help me figure out the output of [git annex unused]( In short, it tells you what those files used to be called before you lost them. Script:
import re, sys, os
from subprocess import Popen,PIPE
FNULL = open(os.devnull, 'w')

def seeker(s):
  process = Popen(["git", "log", "--stat", "-S", s], stderr=FNULL, stdout=PIPE)
  log ="utf-8")
  match ="(([ -~]*\/)*[ -~]*)\|", log)
  if not match: return ''
  else: return

if len(sys.argv)>1:
  print( seeker(sys.argv[1]) )
  clist = []
  while True:
    crawl = input().strip()
    if not crawl: break
    crawl = crawl.split()
  for x in clist:
    print('%s : %s' % ( x[0], seeker(x[1]) ) )
You can either call it with one argument which should be a key, or if you call it with no argument, it expects you to paste the list of results you got from git annex unused into stdin. It then goes through the list and tells you the corresponding filename for each key. This script is phenomenally stupid in that it does quite a terrible regular expression search on the output of git log and returns the first match it finds. Sue me, it works pretty well at my end.

My good friend and colleague Christian Ikenmeyer and I wrote this cute preprint about polynomials and how they can be written as the determinant of a matrix with entries equal to zero, one and indeterminantes. Go ahead and read it if you know even just a little math, it's quite straightforward. The algorithm described in section 3 has been implemented and you can download the code from my website at the TU Berlin. Compilation instructions are in ptest.c, but you will need to get nauty to perform the entire computerized proof.

I recently lamented about switching two keys on my new Lenovo Yoga. Big problem: In my office, I attach that notebook to a docking station and to that docking station I attach a keyboard. On that keyboard, all keys are precisely the way I want them to be. Therefore, I do not want to switch the Insert and End keys when I am docked. I ended up writing a little batch script based on this nice google code wiki entry for the registry update and this stackexchange answer to elevate the batch script:
if '%ERRORLEVEL%' == '0' goto run 
powershell "saps -filepath %0 -verb runas" >nul 2>&1
goto eof
REG QUERY "HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout" ^
 /v "Scancode Map" >nul 2>&1 
IF '%ERRORLEVEL%' == '0' goto remove
<nul set /p ="> adding scancode map "
REG ADD "HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout" ^
 /v "Scancode Map" /t REG_BINARY /f ^
 /d 00000000000000000300000052E04FE04FE052E000000000 >nul 2>&1 
IF '%ERRORLEVEL%' == '0' goto success
goto fail 
<nul set /p ="> removing scancode map "
REG DELETE "HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout" ^
 /v "Scancode Map" /f >nul 2>&1 
IF '%ERRORLEVEL%' == '0' goto success
goto fail
echo failed.
goto eof
echo succeeded.
Sadly, it always requires a reboot for the changes to take effect.

I recently implemented an algorithm that has to perform checks on all subsets of some large set. A subset of an $n$-sized set can be understood as a binary string of length $n$ where bit $i$ is set if and only if the $i$-th element is in the subset. During my search for code to enumerate such bitstrings, I found the greatest page in the entire internet. If anyone can explain to me how computing the next bit permutation (the last version) works, please do.

I need to update this wordpress install every once in a while. There are lots of bash scripts on the internet that perform this task, and they are complicated beyond reason. This is what I use:
function cfg {  
    grep $2 $1/wp-config.php | awk 'BEGIN {FS="[, )\x27]*"}; {print $3;}'

echo "> backing up database."
mysqldump --user=$(cfg $1 DB_USER) \
          --password=$(cfg $1 DB_PASSWORD)  \
          --host=$(cfg $1 DB_HOST)          \
          $(cfg $1 DB_NAME) > backup.database.sql

echo "> backing up website."
tar -cjf backup.files.bz2 $1
echo "> retrieving latest wordpress."
wget -q
unzip -qq

echo "> updating wordpress."
rm -r $1/wp-includes $1/wp-admin
cp -r wordpress/* $1/

echo "> cleaning up."
rm -r wordpress
It takes a single argument, which is the name of your wordpress root directory. It backups your database to the file backup.database.sql and backups the files to backup.files.bz2, then it simply proceeds as described in the wordpress codex for updating manual. I do not see what all the fuzz is about.

Member variables in python are horrible. They are not visible in the layout of the class which is instantiated, but instead the __init__ function of a class creates certain member variables for the instance. I have never liked this about python, to be honest. For a recent project, I devised the following solution. Assume you would want this behaviour:
>>> class test(Base):
...     # Variables
...     number = 4
...     string = "hodor"
...     # Functions
...     def stringmult(self):
...         return self.number * self.string
>>> test().stringmult()
>>> test(number=2).stringmult()
>>> test(string="Na",number=8).stringmult() + " - Batman!"
'NaNaNaNaNaNaNaNa - Batman!'
>>> test(end="Batman!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
In other words, any class that inherits from Base can be constructed with keyword arguments who must match exactly the correct class variables which you specify. This one does it:
from ctypes import ArgumentError
class Base(object):
    def __init__(self, **kwargs):
        # "given" is the list of keyword arguments passed to the constructor
        # of this object, "needed" is the list of class variables which belong to  
        # the base class of the object which is being created, which do not end 
        # with two underscores and which are not a function. Trust me, we do not 
        # want to meddle with those.
        given = list(kwargs.keys())
        needed = [attr for attr in dir(self.__class__) if attr[-2:] != '__' \
             and type(self.__class__.__dict__[attr])!=type(lambda:0) ]

        # Check if keyword arguments have been provided which are not among the
        # required arguments and throw an exception if so. Remove this check for
        # a less restrictive base class. I wouldn't recommend it.
        if not set(given) <= set(needed):
            raise ArgumentError()

        # First, initialize the attribute dictionary of the object being created
        # with a list of default values, indicated by the values of the class 
        # variables. Then, update the attribute dictionary again with the values
        # provided to this constructor.
        self.__dict__.update({k: self.__class__.__dict__[k] for k in needed})
I personally like this approach a lot and hereby dare you to tell me even a single reason not to do this, in the comments.

It causes me unspeakable agony to see that my post about why sudoku is boring is one of the most frequented posts in this blog, mostly because most of my readers clearly disagree with the title. I recently received an email titled "why sudoku is not all that boring" by an old friend, and he taunted me that the sudoku
S = [
  0, 0, 0, 0, 6, 0, 0, 8, 0,
  0, 2, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 0, 0, 0, 0, 0,
  0, 7, 0, 0, 0, 0, 1, 0, 2,
  5, 0, 0, 0, 3, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 4, 0, 0,
  0, 0, 4, 2, 0, 1, 0, 0, 0,
  3, 0, 0, 7, 0, 0, 6, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 5, 0 ]
would take my 5-minute hack of a backtracking algorithm
real    51m3.656s
user    50m32.260s
sys     0m2.084s
to solve. So, it seems like some sudokus are really hard, even for a computer, right? Wrong wrong wrong wrong wrong. Read how to implement backtracking properly.

Nikolai asked me how to open a Python interpreter prompt from the console with a certain module already imported. For the record, the Python command line switches tell us that python -i -m module is the way to start the prompt and load That made me wonder whether I can stuff it all into one batch file, and I came up with the following test.bat:
REM = '''
@COPY %0.bat
@goto :eof ::'''
del REM
for k in range(70):
That script will ignore the first line because it is a comment, then copy itself to, then launch python with this argument. Afterwards, is deleted and the script terminates without looking at any of the following lines. Note that :: is yet another way to comment in Batch. Python, however, will see a script where the variable REM is defined as a multi-line string and deleted right after that. After this little stub, you can put any python code you want. Well. I thought it was funky.

It would be beneficial to have your eMail-Address on your homepage when you want to give people the opportunity to contact you. In my case, I actually want to give people that opportunity ((That does not go without saying: In some cases it is very important to discourage people from contacting you directly. For instance, it could be that you are usually not the right person to ask, but you get flooded with requests that you have to manually redirect elsewhere: It would be better if people got discouraged from contacting you while finding it very easy to contact the person that can actually help them.)). However, we all know that there are villains out there who harvest emails from websites in order to sell them to spammers. Hence, you would want to obfuscate your eMail-Address in such a way that it cannot be read from the HTML source code of the website easily. It goes without saying that naive obfuscation techniques like mymail(at)domain(dot)com are beaten more easily by machines than by people, and the more creative you get with your substitutions, the easier the machines can do it and the less easy your customers can. Images are the end of that line of thought, the point where a user has no chance at all other than to copy your eMail address letter by letter from an image which could still theoretically be OCR'd by a decent program. Therefore, Javascript methods to encode and on-the-fly decode the eMail address ((Famous enodings are ROT13, Base64, etcetera.)) seemed like the best choice to me for a long while. It is harder to run a Javascript interpreter than simple regular expressions for the email harvester, but to a user there is no difference: The browser already comes equipped with Javascript. At this point, some people get cranky that everything should work also without Javascript, and I can understand that. However, the main reason for me to abandon the Javascript method was simple necessity: I have to publish my eMail on the homepage of a university, which is managed from within a complex CMS, and the system would not allow me to use Javascript. This makes a lot of sense, you don't want every user of a large webportal to be able to stick whatever Javascript they like into the page, that stuff can make everything go to hell. This blog post is one among many advertising the capability of unicode to display letters from right to left. However, seriously, how difficult do you think it is for an email harvester to check for reversed email addresses as well? I would suppose it's more or less just a matter of checking for the reversed regular expression, too. Easy prey, if you ask me. But I had an idea! Do you want to know what that idea was?

You might have come across the same problem I have faced pretty often: You want to write a small snippet of code for a friend who's not into programming to solve some task. You want to use the scripting language of your choice (yeah, [Perl]( But for many people, especially Windows users, explaining them how to [install perl](, install some modules from [CPAN](, and finally how to use the script from the command line is tedious and often takes more time than writing it in the first place. And sometimes it even takes more time than solving the task by hand which is quite frustrating. So I always wanted to build stand-alone applications with a GUI for those cases. But building GUIs is usually a huge pain in the ass, so I always avoided it; until I got the idea to build web applications with [Mojolicious]( as GUI. Building stand alone executables without the need of installing perl, modules, or libraries can be solved with [PAR-Packer]( So far, that was just a thought. A few days ago I got a small task: My brother wanted an application to automatically correct one kind of systemic error in large data sets. So I wanted to put that idea to the test. It worked out quite well! Do you want to know more?

This glab post is essentially about running a certain shell command remotely on multiple systems within a network that has been set up for public-key authentication. It's a standard task for experienced system administrators I am sure, but for me it was a little adventure in bash scripting — and I wanted to share it with you. Do you want to know more?

I was riding the backseat of a car, a pal of mine with a large Sudoku book on the seat beside me. I glared over at him and remarked that I find Sudokus utterly boring and would feel that my time is wasted on any of them. He looked up at me, clearly demanding an explanation for that statement. I continued to explain that a computer program could solve a Sudoku with such ease that there is no need for humans to do it. He replied that something similar could be said about chess, but still it's an interesting game. And it was then, that I realized why Sudoku is so horribly boring, and chess is not. It was the fact that I could code a Sudoku solver and solve the Sudoku he was puzzling about, and I would be able to do it faster than it would take him to solve the entire thing by hand. This does not apply to chess, evidently. Of course, I confidently explained this to him. »Prove it.«, he said. So I did. Do you want to know more?

Harm Derksen brachte uns schon im Jahre 1998 die weltschnellste Methode zur Berechnung irreduzibler Charaktere der symmetrischen Gruppe. Aber in seinem preprint, Computing with Characters of the Symmetric Group, bleiben einige Fragen offen, die ich hiermit zu klären versuche.
» The code to end all codes «

Die Welt ist einen halben Herzschlag alt. Der unsterbliche Prophet meditiert seit Anbeginn der Laufzeit auf dem তাজমহল und studiert die heiligen 12 Zeilen C-Code, die jede SAT Instanz in $\mathcal{O}(n^{9699690})$ lösen. Wenn der Puls eine Ganzzahl wird, so wird er erwachen und die Wahrheit verkünden, und Lob wird gepriesen, und Licht wird ewig scheinen auf die Kinder des Schaltkreises. Denn er ist der Taktgeber, und an der Spitze des Signals wird es sein, wenn er uns erscheint. Buch Arithmæl Circulæ, Vers 21:7 Wollen Sie mehr wissen?