Optimised Traversal - Confession 47

2014.12.31 16:48:27

header Usually when we talk about optimisation of data structures the concern for optimisation lies within the operations of adding, updating, deleting, and/or retrieving. I've never really heard of any algorithms or data structures that try to optimise traversal. After all, if you have n elements, how can traversal be anything but O(n)?

Well, in the case of Parasol I need something that speeds up traversal. Let me explain why: in order to support a history, the paint operations you perform each create a new drawing layer of their own. This is absolutely necessary because there is no efficient means of calculating a delta between two steps and I can't just apply the operation directly onto the canvas as the operation is done, as potential effects such as brush composition mode have to apply to the entire stroke, not separately to each ellipse that constitutes a stroke.


In the above the comparison is made using 0.5 opacity rather than a composition mode, but you get the idea. So, each stroke needs their own layer to draw on while it is being made. In that case we would end up with drawing potentially hundreds of layers each step to draw all the strokes. “That's stupid!” I hear you scream, “Just commit the stroke layer to that of the canvas and done! No performance problem anymore.” Well, yes, I was getting to that. That would be the simple solution, but now consider the problem of a history. In order to undo our last action, we would have to clear the canvas and redraw all our n-1 strokes on top again. Undoing would start to take very long very quickly, and there's no way around it because we can't simply compute the delta of a stroke (or any other change) and do a “revert” to get the previous state without having to copy the canvas that underlies the stroke for each stroke and keep that around indefinitely, or at least for as long as your history is. This doesn't seem like the most efficient way to go about it to me.

So, aside from having no caching at all or caching the deltas, the only way left would be to speed up the traversal of the strokes so that you only have to paint a part of them. This can be done in our scenario since we have the ability to group a range of strokes together into a single layer that can then be drawn instead of all the strokes on their own (this is however complicated by only being able to group together strokes of the same composition mode). Another reason why this is easily possible is because any history is implemented as a stack, so we don't have to worry about removing or adding elements at arbitrary positions, only ever at the head. Knowing this we can quickly devise a structure that will give a O(log(n)) traversal, push, and pop. If each stroke is pushed into a group on the stack, we can start to group groups of the same magnitude together into greater groups. If you draw this up, you'll quickly see that we're arriving at a logarithmic amount of actual items in the stack. Each push and pop we might have to re-balance this grouping tree, but that should also only take maximum one traversal of the base stack, meaning logarithmic steps again.

The initial idea for this didn't come from me and was actually already around during the time I was working on the first Parasol version. I asked around on #lisp and after some brainstorming someone came up with this sweet idea. I only now got around to starting to implement the necessary data structure. I really hope that this strategy is worth it, but I can't say for sure yet, as there are a couple of things that worry me: First, when grouping strokes together we have to assume that their layers are positioned relatively close together. If they happen to be far apart often, the memory performance will suffer greatly as we will have to allocate one giant layer to contain both with a lot of unused space in-between. Furthermore, this kind of grouping strategy has a couple of factors (max group size, head buffer size, total max size, whether to preemptively explode groups upon pop, etc) that could heavily influence how well it performs under different circumstances. I have no idea what's going to be the right choice to make here, so I'll have to do lots of tedious testing.

Due to the memory constraints it might actually even turn out that keeping the deltas of each stroke might turn out better in the average case. Currently I really don't know and I'm merely going by what intuition tells me. I'd be interested in knowing what kinds of strategies existing painting applications employ for this scenario, but I could not find any resources at all related to this and I'm not willing to dive into the huge sources of OSS painting applications to find out on my own.

I probably won't have anything more to talk about for another month now, as I'll have to dive into studying and put all my programming work behind me for quite a while. Who knows, maybe I'll have a better idea than this grouping stack until then, but I'm kind of doubtful of that.

Written by shinmera