[Gegl-developer] Cache strategy
jcupitt at gmail.com
Tue Jun 16 03:11:19 PDT 2009
>> Another thing worth mentioning is that caches on every node don't
>> scale well to concurrent evaluation of the graph, since the evaluators
>> would need to synchronize usage of the caches all the time, preventing
>> nice scaling of performance as you use more CPU cores/CPUs.
>
> In most instances, this would only incur synchronization of a few
> tiles where the chunks/work regions overlap. Unless you are stupid
> and compute with chunk-size~=tile-size the impact of this should be
> mostly negligible.
You would still need a lock on the cache, wouldn't you? For example, if
the cache is held as a GHashTable of tiles, even if individual tiles
are disjoint and not shared, you'll still need to lock the hash table
before you can search it. A couple of locks on every tile on every
node will hurt SMP scaling badly.
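To make the point concrete, here is a minimal sketch of a shared tile cache, written in Python for brevity rather than in C with a GHashTable. The class `SharedTileCache`, the function `render_strip` and the counter `lock_acquisitions` are hypothetical names made up for illustration; this is not GEGL code. Each worker only touches its own row of tiles, yet every lookup still has to take the table lock.

```python
import threading


class SharedTileCache:
    """Hypothetical shared cache: one dict of tiles guarded by one lock."""

    def __init__(self):
        self._tiles = {}            # (x, y) -> tile data
        self._lock = threading.Lock()
        self.lock_acquisitions = 0  # counted only to make the cost visible

    def get(self, key, compute):
        # The table itself is shared state, so every lookup takes the lock,
        # even when no other thread ever asks for this particular tile.
        with self._lock:
            self.lock_acquisitions += 1
            tile = self._tiles.get(key)
            if tile is None:
                tile = compute(key)
                self._tiles[key] = tile
            return tile


def render_strip(cache, row, width):
    # Each worker touches only the tiles in its own row (disjoint regions).
    for x in range(width):
        cache.get((x, row), lambda k: ("tile", k))


cache = SharedTileCache()
workers = [threading.Thread(target=render_strip, args=(cache, row, 8))
           for row in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# 4 threads x 8 tiles: no tile is shared, yet every access locked the table.
print(cache.lock_acquisitions)  # prints 32
```

A finer-grained scheme (per-bucket locks, say) would reduce contention but not remove the lock/unlock on every access.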
For what it's worth, vips handles this by keeping caches
thread-private. If threads are working in disjoint areas of the image,
there's no benefit to a shared cache anyway. As you say, there will be
a degree of recomputation for some operations (e.g. convolution), but
that's a small cost compared to lock/unlock.
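A rough sketch of the thread-private idea, again in Python and under made-up names (`get_tile`, `convolve_strip`, an 8x4 tile grid); it is not how vips is implemented, only an illustration of the trade-off. Two threads each process a strip of rows with a 3x3-style neighbourhood, so both need the rows along the strip boundary and compute them independently. No lock is taken on any cache lookup.

```python
import threading

HEIGHT, WIDTH = 8, 4            # image size in tiles (illustrative values)
_private = threading.local()    # each thread sees its own attributes
computed = []                   # demo counter; list.append is safe across
                                # threads in CPython


def get_tile(key):
    # Look up the calling thread's own cache; no lock is needed because
    # no other thread can reach this dict.
    cache = getattr(_private, "tiles", None)
    if cache is None:
        cache = _private.tiles = {}
    if key not in cache:
        cache[key] = ("tile", key)
        computed.append(key)
    return cache[key]


def convolve_strip(first_row, last_row):
    # A 3x3 neighbourhood needs the row above and the row below each row.
    for y in range(first_row, last_row + 1):
        for x in range(WIDTH):
            for ny in (y - 1, y, y + 1):
                if 0 <= ny < HEIGHT:
                    get_tile((x, ny))


threads = [threading.Thread(target=convolve_strip, args=(0, 3)),
           threading.Thread(target=convolve_strip, args=(4, 7))]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Rows 3 and 4 are needed by both strips, so their 8 tiles are computed twice.
print(len(computed), len(set(computed)))  # prints 40 32
```

Here the recomputation is 8 tiles out of 32; the overlap grows with the neighbourhood size and shrinks as the strips get taller.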
vips has two types of cache: a very small (just 1 or 2 tiles)
thread-private cache on every image, and a large and complex shared
cache operation that can be explicitly added to the graph. The GUI
adds a cache operator automatically just after every operation that
can output to the display, but you can use it elsewhere if you wish.
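The two kinds of cache might be sketched like this. The classes `Op` and `CacheOp` and the three-node pipeline are hypothetical and are not the vips API; the sketch only assumes what is described above: every node keeps a tiny per-thread cache of its last two tiles, and a larger locked cache is a separate node that is inserted explicitly. The demo is single-threaded to keep the counters simple.

```python
import threading
from collections import OrderedDict


class Op:
    """Hypothetical graph node with a tiny thread-private cache (2 tiles)."""

    def __init__(self, fn, upstream=None):
        self.fn, self.upstream = fn, upstream
        self.evaluations = 0              # demo counter, not thread-safe
        self._private = threading.local()

    def get_tile(self, key):
        recent = getattr(self._private, "recent", None)
        if recent is None:
            recent = self._private.recent = OrderedDict()
        if key in recent:
            return recent[key]
        source = self.upstream.get_tile(key) if self.upstream else key
        tile = self.fn(source)
        self.evaluations += 1
        recent[key] = tile
        if len(recent) > 2:
            recent.popitem(last=False)    # drop the oldest tile
        return tile


class CacheOp:
    """Hypothetical explicit cache node: large, shared, and locked."""

    def __init__(self, upstream):
        self.upstream = upstream
        self._tiles = {}
        self._lock = threading.Lock()

    def get_tile(self, key):
        with self._lock:
            if key not in self._tiles:
                self._tiles[key] = self.upstream.get_tile(key)
            return self._tiles[key]


expensive = Op(lambda k: sum(k))          # stands in for a costly operation
cached = CacheOp(expensive)               # cache added just after it
viewer = Op(lambda v: v * 2, upstream=cached)

keys = [(0, 0), (1, 0), (2, 0)]
for _ in range(2):                        # e.g. a repaint of the same tiles
    for k in keys:
        viewer.get_tile(k)

# The 2-tile private cache thrashes on 3 tiles, but the explicit cache
# keeps the expensive node from being re-evaluated on the second pass.
print(expensive.evaluations, viewer.evaluations)  # prints 3 6
```

Only the explicit cache node pays for locking, and only where someone chose to put it.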
John