For reasons I have not yet been able to figure out, @tobi is making me implement a couple of very rudimentary routines in x86 GCC inline assembler because he wants them faster than is possible in mere mortal C. The first was a routine to calculate $\lceil\log_2(n)\rceil$ for $n\in\mathbb{N}$, $n\geq 1$, and the second one was to zero out a large block of memory. For instance,
				
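/* GCC inline-assembler version (x86) */
static inline unsigned log2int(unsigned x) {
    unsigned l;
    /* bsrl writes the index of the highest set bit of x into l;
       the result is undefined for x == 0 */
    asm("bsrl %1, %0" : "=r" (l) : "r" (x) : "cc");
    /* round up unless x is an exact power of two */
    return ((1u << l) == x) ? l : l + 1;
}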
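/* Plain C version */
static inline unsigned log2int(unsigned x) {
    unsigned l = 0;
    /* stop at l == 32 so the shift never reaches the width of unsigned */
    while (l < 32 && x > (1u << l)) l++;
    return l;
}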
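To convince myself that the bit-scan trick and the loop really agree, here is a minimal sketch of a cross-check. It is not part of what @tobi asked for: the names `log2_ceil_loop` and `log2_ceil_clz` are made up for this example, it assumes a 32-bit `unsigned`, and it swaps the hand-written `bsrl` for the GCC/Clang builtin `__builtin_clz`, which compilers typically lower to a bit-scan instruction on x86. Both functions return the rounded-up base-2 logarithm, which is what the two routines above compute.

```c
#include <assert.h>

/* Hypothetical name: reference loop, smallest l with 2^l >= x.
   Assumes unsigned is 32 bits wide. */
static unsigned log2_ceil_loop(unsigned x) {
    unsigned l = 0;
    /* the l < 32 guard keeps the shift count below the type width */
    while (l < 32 && x > (1u << l)) l++;
    return l;
}

/* Hypothetical name: same result via the compiler builtin.
   __builtin_clz is undefined for 0, just like bsr, so reject it. */
static unsigned log2_ceil_clz(unsigned x) {
    assert(x != 0);
    /* index of the highest set bit, i.e. what bsrl would return */
    unsigned l = 31u - (unsigned)__builtin_clz(x);
    /* round up unless x is an exact power of two */
    return ((1u << l) == x) ? l : l + 1;
}
```

With these definitions, `log2_ceil_clz(8)` is 3 and `log2_ceil_clz(9)` is 4, and comparing the two functions over a range of inputs is a cheap way to catch an off-by-one before reaching for the stopwatch.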