[Gimp-developer] [PATCH 1/4] Tile caching performance patches
Christopher Montgomery
xiphmont at gmail.com
Tue Jun 2 19:44:32 PDT 2009
In fact, with -O2, gcc generates more complex assembly for % than for
&, although it still avoids an integer division. Assembly generated for
the version using &:
tile_data_pointer:
.LFB29:
movzwl 8(%rdi), %eax
andl $63, %edx
andl $63, %esi
imull %eax, %edx
movzbl 7(%rdi), %eax
addl %esi, %edx
imull %eax, %edx
movslq %edx,%rax
addq 24(%rdi), %rax
ret
Assembly generated for the version using %:
tile_data_pointer:
.LFB29:
movl %edx, %eax
sarl $31, %eax
shrl $26, %eax
addl %eax, %edx
andl $63, %edx
subl %eax, %edx
movzwl 8(%rdi), %eax
imull %eax, %edx
movl %esi, %eax
sarl $31, %eax
shrl $26, %eax
addl %eax, %esi
andl $63, %esi
subl %eax, %esi
movzbl 7(%rdi), %eax
addl %esi, %edx
imull %eax, %edx
movslq %edx,%rax
addq 24(%rdi), %rax
ret
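
The extra sarl/shrl/addl/subl instructions in the % version look like the
usual fixup for signed remainder: C99 requires x % 64 to take the sign of
x, while x & 63 always lands in 0..63. Here is a minimal sketch of that
difference. TILE_SIZE and the function names are made up for illustration
and are not taken from the patch, and the value 64 is only inferred from
the $63 mask in the listings above.

```c
#include <assert.h>

#define TILE_SIZE 64  /* assumed power of two, inferred from the $63 mask */

/* Signed remainder: C99 truncates toward zero, so the result takes the
 * sign of x (for example, -1 % 64 == -1). */
static int offset_mod(int x)
{
  return x % TILE_SIZE;
}

/* Bit mask: on two's-complement targets the result is always in 0..63
 * (for example, -1 & 63 == 63). */
static int offset_and(int x)
{
  return x & (TILE_SIZE - 1);
}

/* With an unsigned operand the two forms agree for every input, so the
 * compiler is free to reduce the % to a single AND. */
static unsigned offset_mod_unsigned(unsigned x)
{
  return x % TILE_SIZE;
}

/* The signed forms agree whenever x is non-negative. */
static void check_nonnegative_agree(void)
{
  int x;

  for (x = 0; x < 4 * TILE_SIZE; x++)
    assert(offset_mod(x) == offset_and(x));
}
```

So the two versions are only interchangeable if x and y are never
negative, or if the operands are unsigned to begin with.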
On Tue, Jun 2, 2009 at 5:09 PM, Sven Neumann <sven at gimp.org> wrote:
> Hi,
>
> On Tue, 2009-06-02 at 16:56 -0400, Christopher Montgomery wrote:
>
>> > As far as I know pretty much any compiler out there should be able to
>> > replace a modulo by a power-of-2 constant by the bit-wise AND operation
>> > without us explicitly doing so (see also
>> > http://en.wikipedia.org/wiki/Modulo_operation#Performance_issues). So
>> > for the benefit of readable code I suggest that we keep the code as it
>> > is.
>>
>> Interesting. I got a noticeable and repeatable performance benefit.
>> Which is not to say I haven't somehow mismeasured it. I agree the
>> modulo is more readable.
>>
>> ...perhaps the difference is the difference of (x) or (y) possibly
>> being negative and additional conformance-related assembly getting
>> generated? I suppose there's no reason to speculate, I'll go read the
>> assembly gcc generates and that will answer everything, at least for
>> me.
>
> I might very well be wrong here. If there's indeed a difference in the
> generated assembly and a noticeable performance benefit, then let's use
> the optimized macro. But perhaps we can add a short comment there
> explaining that ((y) & (TILE_HEIGHT-1)) is equivalent to ((y) %
> TILE_HEIGHT). Not everyone reading this code will be aware of this
> immediately.
>
>
> Sven
>
>
>
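
For the comment, something along these lines might do. This is only a
sketch: TILE_Y_OFFSET and check_tile_y_offset are hypothetical names, not
the macro from the patch, and TILE_HEIGHT is assumed to be 64.

```c
#include <assert.h>

#define TILE_HEIGHT 64  /* assumed value; must be a power of two */

/* (y) & (TILE_HEIGHT - 1) is equivalent to (y) % TILE_HEIGHT for
 * non-negative y, because TILE_HEIGHT is a power of two. The mask form
 * is used because gcc emits shorter code for it when y is a signed int.
 */
#define TILE_Y_OFFSET(y) ((y) & (TILE_HEIGHT - 1))

/* Compile-time guard: the array size is negative, and the build fails,
 * if TILE_HEIGHT is ever changed to something that is not a power of
 * two. */
typedef char tile_height_is_pow2[(TILE_HEIGHT & (TILE_HEIGHT - 1)) == 0 ? 1 : -1];

/* Runtime check of the equivalence claimed in the comment above. */
static void check_tile_y_offset(void)
{
  int y;

  for (y = 0; y < 4 * TILE_HEIGHT; y++)
    assert(TILE_Y_OFFSET(y) == y % TILE_HEIGHT);
}
```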