It causes me unspeakable agony to see that my post about why sudoku is boring is one of the most frequented posts in this blog, mostly because most of my readers clearly disagree with the title. I recently received an email titled "why sudoku is not all that boring" by an old friend, and he taunted me that the sudoku
S = [
  0, 0, 0, 0, 6, 0, 0, 8, 0,
  0, 2, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 0, 0, 0, 0, 0, 0,
  0, 7, 0, 0, 0, 0, 1, 0, 2,
  5, 0, 0, 0, 3, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 4, 0, 0,
  0, 0, 4, 2, 0, 1, 0, 0, 0,
  3, 0, 0, 7, 0, 0, 6, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 5, 0 ]
would take my 5-minute hack of a backtracking algorithm
real    51m3.656s
user    50m32.260s
sys     0m2.084s
to solve. So, it seems like some sudokus are really hard, even for a computer, right? Wrong wrong wrong wrong wrong. Read how to implement backtracking properly.


Nikolai asked me how to open a Python interpreter prompt from the console with a certain module already imported. For the record, the Python command line switches tell us that python -i -m module is the way to start the prompt and load module.py. That made me wonder whether I can stuff it all into one batch file, and I came up with the following test.bat:
REM = '''
@COPY %0.bat %0.py
@python %0.py
@DEL %0.py 
@goto :eof ::'''
del REM
for k in range(70):
    print(k)
That script will ignore the first line because it is a comment, then copy itself to test.py, then launch python with this argument. Afterwards, test.py is deleted and the script terminates without looking at any of the following lines. Note that :: is yet another way to comment in Batch. Python, however, will see a script where the variable REM is defined as a multi-line string and deleted right after that. After this little stub, you can put any python code you want. Well. I thought it was funky.


It would be beneficial to have your eMail-Address on your homepage when you want to give people the opportunity to contact you. In my case, I actually want to give people that opportunity[1]. However, we all know that there are villains out there who harvest emails from websites in order to sell them to spammers. Hence, you would want to obfuscate your eMail-Address in such a way that it cannot be read from the HTML source code of the website easily. It goes without saying that naive obfuscation techniques like mymail(at)domain(dot)com are beaten more easily by machines than by people, and the more creative you get with your substitutions, the easier the machines can do it and the less easy your customers can. Images are the end of that line of thought, the point where a user has no chance at all other than to copy your eMail address letter by letter from an image which could still theoretically be OCR'd by a decent program. Therefore, Javascript methods to encode and on-the-fly decode the eMail address[2] seemed like the best choice to me for a long while. It is harder to run a Javascript interpreter than simple regular expressions for the email harvester, but to a user there is no difference: The browser already comes equipped with Javascript. At this point, some people get cranky that everything should work also without Javascript, and I can understand that. However, the main reason for me to abandon the Javascript method was simple necessity: I have to publish my eMail on the homepage of a university, which is managed from within a complex CMS, and the system would not allow me to use Javascript. This makes a lot of sense, you don't want every user of a large webportal to be able to stick whatever Javascript they like into the page, that stuff can make everything go to hell. This blog post is one among many advertising the capability of unicode to display letters from right to left. However, seriously, how difficult do you think it is for an email harvester to check for reversed email addresses as well? I would suppose it's more or less just a matter of checking for the reversed regular expression, too. Easy prey, if you ask me. But I had an idea! Do you want to know what that idea was?
  1. That does not go without saying: In some cases it is very important to discourage people from contacting you directly. For instance, it could be that you are usually not the right person to ask, but you get flooded with requests that you have to manually redirect elsewhere: It would be better if people got discouraged from contacting you while finding it very easy to contact the person that can actually help them. [back]
  2. Famous enodings are ROT13, Base64, etcetera. [back]


You might have come across the same problem I have faced pretty often: You want to write a small snippet of code for a friend who's not into programming to solve some task. You want to use the scripting language of your choice (yeah, [Perl](http://www.perl.org/)). But for many people, especially Windows users, explaining them how to [install perl](http://www.perl.org/get.html), install some modules from [CPAN](http://cpan.org), and finally how to use the script from the command line is tedious and often takes more time than writing it in the first place. And sometimes it even takes more time than solving the task by hand which is quite frustrating. So I always wanted to build stand-alone applications with a GUI for those cases. But building GUIs is usually a huge pain in the ass, so I always avoided it; until I got the idea to build web applications with [Mojolicious](http://mojolicio.us/) as GUI. Building stand alone executables without the need of installing perl, modules, or libraries can be solved with [PAR-Packer](http://search.cpan.org/dist/PAR-Packer/). So far, that was just a thought. A few days ago I got a small task: My brother wanted an application to automatically correct one kind of systemic error in large data sets. So I wanted to put that idea to the test. It worked out quite well! Do you want to know more?


This glab post is essentially about running a certain shell command remotely on multiple systems within a network that has been set up for public-key authentication. It's a standard task for experienced system administrators I am sure, but for me it was a little adventure in bash scripting — and I wanted to share it with you. Do you want to know more?


I was riding the backseat of a car, a pal of mine with a large Sudoku book on the seat beside me. I glared over at him and remarked that I find Sudokus utterly boring and would feel that my time is wasted on any of them. He looked up at me, clearly demanding an explanation for that statement. I continued to explain that a computer program could solve a Sudoku with such ease that there is no need for humans to do it. He replied that something similar could be said about chess, but still it's an interesting game. And it was then, that I realized why Sudoku is so horribly boring, and chess is not. It was the fact that I could code a Sudoku solver and solve the Sudoku he was puzzling about, and I would be able to do it faster than it would take him to solve the entire thing by hand. This does not apply to chess, evidently. Of course, I confidently explained this to him. »Prove it.«, he said. So I did. Do you want to know more?


Harm Derksen brachte uns schon im Jahre 1998 die weltschnellste Methode zur Berechnung irreduzibler Charaktere der symmetrischen Gruppe. Aber in seinem preprint, Computing with Characters of the Symmetric Group, bleiben einige Fragen offen, die ich hiermit zu klären versuche.
» The code to end all codes «

Die Welt ist einen halben Herzschlag alt. Der unsterbliche Prophet meditiert seit Anbeginn der Laufzeit auf dem তাজমহল und studiert die heiligen 12 Zeilen C-Code, die jede SAT Instanz in $\mathcal{O}(n^{9699690})$ lösen. Wenn der Puls eine Ganzzahl wird, so wird er erwachen und die Wahrheit verkünden, und Lob wird gepriesen, und Licht wird ewig scheinen auf die Kinder des Schaltkreises. Denn er ist der Taktgeber, und an der Spitze des Signals wird es sein, wenn er uns erscheint. Buch Arithmæl Circulæ, Vers 21:7 Wollen Sie mehr wissen?



Let's say you have a Windows PC on a 64 bit machine. Obviously, you have Cygwin because anything else would just be silly. Then you want to develop C-Applications that you can compile and run on large Linux-based servers, so you turn to an an IDE that can work with Cygwin. Now here comes the twist: For whatever stubborn reason, you want to be able to compile your applications native 64 bit, and you want to do it in Eclipse, and you want to be able to debug them. Do you want to know how?


Here's my code for quickly uploading files to virustotal and retrieving the reports.
import postfile
import sys

import json
from StringIO import StringIO

import urllib
import urllib2

import time

import webbrowser

apikey = 'YOUR API KEY ' + \
         '  GOES HERE  '
resources = []
for i in range(1, len(sys.argv)):
    file = sys.argv[i]
    print 'Preparing Scan of %s ...' % file

    host = 'www.virustotal.com'
    selector = 'https://www.virustotal.com/vtapi/v2/file/scan'
    fields = [('apikey', apikey)]
    file_to_send = open(file, 'rb').read()
    files = [('file', file, file_to_send)]

    print 'Uploading file...'
    ret = postfile.post_multipart(host, selector, fields, files)
    try:
        data = json.loads(ret)
    except ValueError:
        print 'Cannot decode server response: '
        print ret
        exit()
    print 'Upload done.'

    # for k in data: print '%s: %s' % (k, data[k])

    resources.append((file, data['resource']))

print 'Retreiving reports...'
i = 1
permalinks = []
for resource in resources:
    response_code = 0
    while response_code == 0:
        url = 'https://www.virustotal.com/vtapi/v2/file/report'
        parameters = {
            'resource': resource[1],
            'apikey': apikey
        }
        data = urllib.urlencode(parameters)
        req = urllib2.Request(url, data)
        response = urllib2.urlopen(req)
        ret = response.read()
        data = json.loads(ret)
        response_code = data['response_code']
        #print json.dumps(data, sort_keys=True, indent=4)
        if response_code == 0: time.sleep(5)
    #print json.dumps(data, sort_keys=True, indent=4)
    permalinks.append(data['permalink'])
    print '%2i: %s' % (i, resource[0]), 
    print ': %i / %i' % (data['positives'], data['total'])
    i += 1

wb = webbrowser.get()
selection = 0
while selection >= 0 and selection < len(permalinks):
    selection = int(raw_input('Open: '))-1
    if selection >= 0 and selection < len(permalinks):
        wb.open(permalinks[selection])
P.S.: This is all part of a great plan I'm following at the moment. Edit (2013-09-05): Since the VirusTotal API is now out there for a while, a lot of awesome python libraries have emerged: * https://github.com/Erethon/vta.py * https://github.com/Gawen/virustotal * https://github.com/botherder/virustotal * https://github.com/Xen0ph0n/VirusTotal_API_Tool


Under Cygwin, you can install the 64 bit mingw version of GCC, but you don't get the gnu multiprecision library for free with it, you'll much rather have to compile it from source. I ran into a bit of trouble here: It will not suffice to tell the configuration script about the new compiler, there are now mingw-64 versions of all relevant binaries that should be used instead. Basically, you go like
tar -xjf gmp-5.0.4.tar.bz2
cd gmp-5.0.4
./configure                          \
  AR=x86_64-w64-mingw32-ar           \
  AS=x86_64-w64-mingw32-as           \
  DLLTOOL=x86_64-w64-mingw32-dlltool \
  DLLWRAP=x86_64-w64-mingw32-dllwrap \
  CC=x86_64-w64-mingw32-gcc-4.5.3    \
  NM=x86_64-w64-mingw32-nm           \
  LD=x86_64-w64-mingw32-ld           \
  OBJDUMP=x86_64-w64-mingw32-objdump \
  RANLIB=x86_64-w64-mingw32-ranlib   \
  STRIP=x86_64-w64-mingw32-strip
I am not sure if all of these are needed, but it won't hurt either. After that, you should
make && make check
the whole thing. Worked perfectly for me, so now I can link with libgmp.a in .libs and native 64 bit bignum action ensues!


Just a checklist for programming the microncontroller Ralf gave me yesterday[1] * you need a functional eclipse installation * Install WinAVR and be sure that it is in the PATH by executing  avr-gcc from a command prompt * Install the AVR-Interface into eclipse via HelpInstall New SoftwareAdd. Then paste the following link in the URL Box:
http://avr-eclipse.sourceforge.net/updatesite/
* When I create a new project, I need to adjust the following settings in the project properties: * AVRAVRDude Programmer click on New, Name: USBASP, Programmer Hardware (-c) USBasp, http://www.fischl.de/usbasp/ * AVRAVRDude, Advance be sure the check Disable device signature check * AVR → Target Hardware MCU Type select ATmega168 and set the MCU Clock Frequency to 12000000 (that's $12\cdot10^6$) * C/C++-Build → Settings unter Tool Settings check Generate HEX file for Flash memory * C/C++-Build → Settings unter Tool Settings, AVR CompilerOptimization set the Optimization Level to Size Optimizations (-Os).
  1. Also check out this useful article [back]


After a couple of tests, it turns out that the very simple
#include <limits.h>

inline void memzap(void *dest, unsigned long count) {
    asm( "cld"
#   if ULONG_MAX == 0xffffffff
    "\n" "andl $3, %%ecx"
    "\n" "rep  stosb"
    "\n" "movl %%ebx, %%ecx"
    "\n" "shrl $2, %%ecx"
    "\n" "rep  stosl"
#   else
    "\n" "andq $7, %%rcx"
    "\n" "rep  stosb"
    "\n" "movq %%rbx, %%rcx"
    "\n" "shrq $3, %%rcx"
    "\n" "rep  stosq"
#   endif
      : "=c" (count), "=D" (dest), "=b" (count)
      :  "c" (count),  "D" (dest),  "b" (count), "a" (0)
    );
}
is the fastest way to zero out a large block of memory, which is not very surprising. It is about 4 to 5 times faster than memset and about as fast as new [], if I can trust @tobi on that matter. I tried using MMX registers, but anything that involves actually looping over the memory region will be about as fast as memset. The only way to get a bit of speed is using the rep opcode. Tiny Edit: The above code is much more safe to compile on both 64 and 32 bit computers.


For reasons I have not yet been able to figure out, @tobi is making me implement a couple of very rudimentary routines in x86 GCC inline assembler because he wants them faster than possible for mere mortal C. The first was a routine to calculate $\lfloor\log_2(n)\rfloor$ for $n\in\mathbb{N}$ and the second one was to zero out a large block of memory. For instance,
unsigned inline log2int(unsigned x) {
    unsigned l;
    asm("bsrl %1, %0" : "=r" (l) : "r" (x));
    return ( 1 << l == x ) ? l : l + 1;
}
is about 50 times faster than the C-native Version
unsigned inline log2int(unsigned x) {
   unsigned l = 0;
   while(x > (1<<l)) l++;
   return l;
}
even after optimization. For some reason, I found it tricky to google up the official intel x86 opcode reference[1][2], so I am linking these here.
  1. Opcode Reference Part 1 [back]
  2. Opcode Reference Part 2 [back]