mood: Speed up int_sqrt().
Following a recent discussion on lkml, the choice of the initial
value for the square root is sub-optimal. The change introduced
in this commit was proposed by Peter Zijlstra who also measured a
significant speed improvement for both the hot and the cold cache case.
The speed improvements for the hot-cache case were confirmed on a
32 bit system by running a simple test program which calculates the
square root of
10000000 random numbers. With the new initial value,
the running time went down by 23%. This matters because when a new
mood is loaded, int_sqrt() is called four times per admissible file.
The new initial value is computed in terms of the position of the
most significant bit set in the given argument to int_sqrt(). While
ffs(3) (find first set bit) is in POSIX.1‐2008, there is no fls(3)
(find last set bit), so we have to introduce our own implementation.
We chose an open-coded version because this turned out to be faster
than reversing the bits and calling ffs(3).