Dmitry's notes from CppConfPiter'19
Open, Needs TriagePublic

Description

Shared pointers are not good for hierarchies

Several(!) unrelated people in their talks mentioned that using shared pointers to represent hierarchies is considered a bad practice now. We do it for KisNode, and we have exactly the same problems they mentioned: memory leaks, cyclic dependencies and construction problems. We had two patches that were targeted to refactor our raw-pointer problems, but both failed because of the construction problems here and there: [1], [2].

[1] - https://phabricator.kde.org/D3279
[2] - https://phabricator.kde.org/D3307

Instead of storing a hierarchy in shared pointers like that:

struct KisNode {
    KisNodeWSP m_parent;
    QVector<KisNodeSP> m_children;
    
    KisNodeSP parent() const;
    KisNodeSP firstChild() const;
    KisNodeSP lastChild() const;
    KisNodeSP nextSibling() const;
    KisNodeSP prevSibling() const;   
};

It is recommended to store hierarchy in a separate class, named "forest":

struct KisNode {
    // only business logic inside, no hierarchy
};

// preferredly, no pointers should be used, only values and references
using KisNodeSmth = KisNode;

// as a fallback option the forest may also operate with shared pointer
// using KisNodeSmth = KisNodeSP;

struct KisNodeForest {
    void addNode(...);
    void removeNode(...);
    void moveNode(...);
    
    // these methods should not be used, unless it is really needed
    KisNodeSmth parent() const;
    KisNodeSmth firstChild() const;
    KisNodeSmth lastChild() const;
    KisNodeSmth nextSibling() const;
    KisNodeSmth prevSibling() const;

    // instead, more high-level itegarion algorithms should be used
    // we already have functions like that in KisLayerUtils
    
    template<class Func>
    void iterateSubtree(KisNodeSmth root, Func func);
    
    template<class Func>
    void iterateChildren(KisNodeSmth parent, Func func);
    
    template<class Func>
    void iterateSubtreeReverse(KisNodeSmth root, Func func);
    
    template<class Func>
    void iterateChildrenReverse(KisNodeSmth parent, Func func);
    
    template<class Func>
    KisNodeSmth findNode(KisNodeSmth parent, Func func);
};

The benefits of this approach:

  1. The logic of parent-child relations is split into a separate class (we tried to do that with splitting KisNode and KisBaseNode, but failed)
  2. The problem of node construction/cloning is solved (at least in case when KisNodeSmth is not a shared pointer).
  3. Different tree/iteration strategies can be implemented easily. Right now, we have separate KisProjectionLeaf and KisDummiesFacade classes to implement diffferent iteration/decoupling. If we have a "forest" with lazy-copying capabilities, tese problems would be solved easier.

Conclusion

The fact that the forest if a better approach doesn't mean that we should drop everithing now and start refactoring Krita to a "modern c++ practices". Though, I think, we should take these considerations into account when designing new code and refactoring old one. E.g. our resources branch fully relies on shared pointers now. Perhaps we could benefit from using the shared-pointer-free registry approach?

GUI widgets updates/signals hell (e.g. in brush editor)

The problem of our brush editor is that update/change signals don't have any granularity. If one changes, say, brush opacity, GUI issues "preset changed" signals and the entire Krita GUI should be updated. This atom-bomb-update can easily cause cycling updates, partial updates and UI freezes. That is bad, and that is why we havev the brush editor broken at least twice per year...

I tried to address this problem several times, see e.g. KisDerivedResourceConverter. It kind of works, but is not very general and causes some boilerplate code. It looks like there is a general solution to this:

Library Lager (MIT license): https://github.com/arximboldi/lager
Video 1: https://www.youtube.com/watch?v=_oBx_NbLghY
Video 2: https://www.youtube.com/watch?v=y_m0ce1rzRI

This library's purpose is to do proper UI updates with decent granularity and to avoid signals hell like we have in the brush editor. The library is compatible with Qt(!). It also has some capabilities for doing iterative refactoring. That it, you don't have to refactor everything at once, you can do it incrementally in a bottom-up approach.

Conclusion

I think we should try the library in some new code. Or just next time, when the brush editor breaks :)

KisUpdateScheduler multithreading problem

To resolve the problem we should google for: "work stealing scheduler". Work-stealing approach supports lock-free implementation. It can resolve our 6+ cores problems.

Known implementations (none of them can be used directly, though):

Both libraries cannot be used directly. Staccato seem to not support thread sleeping during inactivity. And Boost.Fiber as a whole is just too weird for us. It demands code to use it's own mutexes, which is not acceptable. Though I have a feeling that at least a part of work-stealing code from Boost.Fiber can be reused.

AVX512 and std-simd library

Just some notes:

  1. AVX512 is only available in std-simd library, not in Vc
  2. Current implementation in std-simd doesn't support gather/scatter functions
  3. The paper proposed to C++23 doesn't include them either
  4. The C++ committee wants std-simd to become a standard in C++23
  5. The C++ committee also wants to get gather/scatter into the standard in time for C++23
  6. Mathias seem to be a bit busy atm and has no time to implement gather/scatter :)

GPU memory management concerns

We had an old question about "how to store image in GPU memory if it is too big?". It looks like CUDA now has internal swapping system, it is called "cuda unified memory". I guess openCL should also have something like that.

Link: https://devblogs.nvidia.com/unified-memory-cuda-beginners/

dkazakov created this task.Nov 4 2019, 12:17 PM
ognarb added a subscriber: ognarb.Nov 4 2019, 12:26 PM
dkazakov updated the task description. (Show Details)Nov 4 2019, 12:42 PM
tusooaw added a subscriber: tusooaw.Nov 8 2019, 4:51 PM
tusooaw added a comment.EditedNov 8 2019, 8:34 PM

This is very interesting.

I am wondering whether this would offer benefits to Flake, as the current shape hierarchy is still implemented with the more primitive raw-pointers.

As https://invent.kde.org/kde/krita/merge_requests/115 shows, Flake has a lot of thread-safety issues currently.
The introduction of strokes to Flake gives us a lot of memory issues due to pointers that are deleted when we want to access them.
Locking does help but it does so in a rather ugly way (we already see three locks in Flake).

But on the other hand, Flake is completely Copy-on-Write, maybe with the exception of containers (currently we copy-construct them by re-adding child shapes).

Probably we want to re-consider the role of KoShapeManager. Currently it forwards the changes of shapes to the canvas. Note there is a circular dependency: the KoShapeManager stores its shapes, and KoShapes need to know their shape managers. I wonder whether the structure you proposed would solve this problem.

Other things to consider:

(1) Edit shapes only in image thread. GUI thread may access it, in some ways, at the same time.

(2) Probably want to clone the shape (forest?) every time the GUI thread want to read it.
This takes O(1) time thanks to QSharedDataPointer. In this way, the reader and writer never access the same object.

(3) How to monitor changes? It might be possible to make individual shapes stored in the forest using constants (!) (KisDescendent<const KoShape>, you are back). When some function wants to modify a shape, simply create a clone for it and when the function returns, replace what is stored in KoShapeForest by the modified shape (and it is const again).

GUI widgets can subscribe to changes in shapes via KoShapeForest. But this means they should change their subscription to another shape, since the original one has already been replaced.

(4) Dealing with containers, such as shape groups. In some cases we want them to act just like a normal shape. Maybe just put a placeholder object there? Still something to think about.

Note there is a circular dependency: the KoShapeManager stores its shapes, and KoShapes need to know their shape managers. I wonder whether the structure you proposed would solve this problem.

If we decide to redesign shapes, we would probably need something like Lager library provides. Because one of the problems is the delivery and granularity of the updates.

Other things to consider:
(1) Edit shapes only in image thread. GUI thread may access it, in some ways, at the same time.

Limiting shapes editing to GUI only sounds really wrong. It is a common problem in the entire Krita. The only difference to shapes is that tiles engine give a guarantee on "read access is always safe", even when some other thread is writing into this position atm. It doesn't solve the original problem though. It only avoids data-losses for the user.

(2) Probably want to clone the shape (forest?) every time the GUI thread want to read it.
This takes O(1) time thanks to QSharedDataPointer. In this way, the reader and writer never access the same object.

Well, I'd say, it should be a reverse. Like read-copy-update approach. The working thread should copy all the shapes, works on them and then swap them back in an atomic approach.

(3) How to monitor changes? It might be possible to make individual shapes stored in the forest using constants (!) (KisDescendent<const KoShape>, you are back). When some function wants to modify a shape, simply create a clone for it and when the function returns, replace what is stored in KoShapeForest by the modified shape (and it is const again).

Well, it might be a working approach.

GUI widgets can subscribe to changes in shapes via KoShapeForest. But this means they should change their subscription to another shape, since the original one has already been replaced.

Yes, it may be a problem for the current code.

  1. Solution: we have a special proxy for the selection in the canvas, called KoSelectedShapesProxy. The same strategy may be used for the shapes themselves.
  2. Problem: there might be some functionality, like in patch 1fdb15d0dae77917b20c61033345fa6124499b35, which is linked to the state of the shape.

(4) Dealing with containers, such as shape groups. In some cases we want them to act just like a normal shape. Maybe just put a placeholder object there? Still something to think about.

Yes, this question is rather tricky. I think groups should still be a distinct object. Only hierarchy information should be stored in the forest. Basically, a group may take this info from there.

How to locate shapes/items in the Forest seems to be a problem since we currently do not have a way to identify them without pointers.

However what we really want to consider might be:

  • Should tools really need to locate individual items? or
  • What APIs should the forest expose to the tools?

Talking generally (not limited to shapes), tools want to

[1] Add an item (constructed) to the hierarchy in some specific position;
[2] Add an item (constructed) to the hierarchy in some specific position, with changes to current items in the hierarchy (Adding a shape to the layer might change the z-index of current shapes);
[3] Modify selected items, without changing other items in the hierarchy;
[4] Modify selected items, with changes to other items in the hierarchy (Moving things up/down; modifying group shapes; );
[5] Remove selected items without changing other items in the hierarchy;
[6] Remove selected items with changes to/deletion of other items in the hierarchy (Remove a group will delete all its children);
[7] Add items (according to cursor position) to the selection;
[8] Remove items (according to cursor position) from the selection;

What I mean is that it would probably be the best for tools to not indicate what items they want to modify, but to do it using some property, i.e. to only allow tools to operate on the selection. But this does not seem to be always possible.

Maybe it is possible to auto-generate ids for items when they are added into the hierarchy via e.g. KisNameServer, and store it in the forest as:

template<class T, class S>
struct KisForestNode
{
    using ItemType = S;
    using IdType = T;

    ItemType item;
    IdType parent;
    QList<IdType> children;
};

template<class T, class S, class R = KisForestNode<T, S>>
struct KisForest
{
    using IdType = T;
    using ItemType = S;
    using NodeType = R;

    QMap<IdType, NodeType> items;
};

@dkazakov These are interesting points and I'm interested in hearing more of your thoughts on implementing these ideas in future updates.

Regarding lager or value-oriented programming, you might also find this lecture by André Bergner pretty interesting and relevant. It's more focused on Audio/DSP programming than graphics programming, but arguably there are many overlaps between that and graphics processing. I was thinking this lecture would apply pretty well to something like Krita's filter systems to provide more responsive behavior where certain processes of a filter can be cached an only updated when they need to be updated (i.e. when a variable in the equation has actually changed per pixel). I'm not sure how well it could be applied as mentioned in the panel, but I thought I would share this video since it seems somewhat relevant to the discussion at hand.

As for the forest idea, wouldn't it also be possible to use something like a unique pointer so that nodes can be checked out of the forest as a weak pointer? I think that the biggest problem is that data-ownership is currently a bit unclear which is why cloning and duplication can lead to memory problems (having said that, I'm not as experienced with the Krita code base yet.) I at least think that would be a better fallback than using KisNodeSP.

tusooaw added a comment.EditedNov 16 2019, 10:59 PM

As for the forest idea, wouldn't it also be possible to use something like a unique pointer so that nodes can be checked out of the forest as a weak pointer? I think that the biggest problem is that data-ownership is currently a bit unclear which is why cloning and duplication can lead to memory problems (having said that, I'm not as experienced with the Krita code base yet.) I at least think that would be a better fallback than using KisNodeSP.

The problem with this is it might not be a value type -- You are able to change it, everywhere in the program, so it is hard to make it thread-safe.

I would propose the use of SharedPtr<const T> as suggested by Sean Parent (https://www.youtube.com/watch?v=QGcVXgEVMJg). A polymorphic version would be KisConstDescendent<T> (https://invent.kde.org/tusooaw/krita/commit/4b143e85de501f59dac537ccba5eb0c246a4442f)

Hi, @tusooaw!

You are quite right. Most of the tools don't need any pointer/id for the shapes. They can live with quite high-level finctions, like "get shape under cursor" or "get a vector of selected shapes".

There are quite rare cases, when a tool/widget needs to store a pointer to a shape. The most obvious example is "KoSelection" itself :) It needs to store a list of selected shapes and this list should not be invalidated on changes. Another example is KeepShapesSelectedCommand (or its equivalent in command-free implementation). Basically, on undo we should reselect shapes that were selected before. So, obviously, we should store their ids somehow.

Your code snippet looks perfectly correct :) Though the devil is in the detail, of course. I'm not sure how IdType should be implemented. This Id should be unique throughout entire lifetime of a document and it should be persistent, that is, the shape cannot change its Id after creation.

I guess it would be a good idea to check how QModelIndex and QPersistentModelIndex are implemented. Perhaps we could just copy their idea? I have a feeling as if QModelIndex just stores a pointer to the item, and QPersistentModelIndex uses some safer design :)

Hi, @eoinoneill!

Firstly, thanks for the video, I'll check it :)

As for the forest idea, wouldn't it also be possible to use something like a unique pointer so that nodes can be checked out of the forest as a weak pointer?

The main idea of "reactive programming" and "functional programming" is that we don't use any pointers at all :) We can use them internally, like @tusooaw replied above, but all the "user-level" code works with "immutable" (?) copies of the objects. I'm not sure how applicable it is to Krita, but the general idea is: "don't use pointers at all" :)

eoinoneill added a comment.EditedNov 20 2019, 9:31 AM

The main idea of "reactive programming" and "functional programming" is that we don't use any pointers at all :) We can use them internally, like @tusooaw replied above, but all the "user-level" code works with "immutable" (?) copies of the objects. I'm not sure how applicable it is to Krita, but the general idea is: "don't use pointers at all" :)

Ah hah, I see.

It would certainly be an interesting project to try to move more of the krita tools to a more asynchronous friendly workflow. The closest I've messed with this type of stuff is when programming in Rust which has the idea of strong ownership in order to make working with threads easier -- though I haven't tried full reactive programming in that regards. I'll read into this more myself because it's pretty interesting.

Another area where this seems to apply well is input processing. For instance, I could imagine it working pretty well as a replacement for Krita's current "action" concept (since events could then be merged together to make new actions, so to speak, if I understand the process correctly.) This could mean certain, quick button combinations within a buffer could create unique actions through a filtration/merging processing. Just an interesting thought based on what I read over the weekend.

Cheers!

I guess it would be a good idea to check how QModelIndex and QPersistentModelIndex are implemented. Perhaps we could just copy their idea? I have a feeling as if QModelIndex just stores a pointer to the item, and QPersistentModelIndex uses some safer design :)

https://code.woboq.org/qt5/qtbase/src/corelib/itemmodels/qabstractitemmodel.cpp.html#676

QPersistentModelIndex stores a QModelIndex. QAbstractItemModel stores a table of QPersistentModelIndex'es. When the derived class calls e.g. endInsertRows(), the model iterates through the table to change the corresponding indexes in QPersistentModelIndex.

I feel this is still not suitable for our use. First, QPersistentModelIndex cannot be used in different models, but we might be interested in this use case. Second, QPersistentModelIndex is not a value type (can, will, and, sometimes, must be changed from the outside) so it is inherently thread-unsafe.

What I did is use an integral type to represent id. A little bit like auto-increment unique id fields. Theoretically we have a upper bound for how many objects we allow (throughout the history of document), but this is the same case with z-indices, so I guess this should be acceptable.

Implementation of KisGenericForest:
https://invent.kde.org/tusooaw/krita/commit/cc71c9b5fb5af09bd7e1ccf2c54f615a655b7f77

dkazakov added a comment.EditedJan 29 2020, 7:32 PM

Ideas for the prospective multithreaded shapes implementation

Option 1: Write is always RCU (read-copy-update)

  1. GUI thread accesses the layer's shapes directly without any locks. GUI thread can only read shapes
  2. Worker thread can read and write shapes, but writes happen in RCU manner:
    1. worker thread makes a shallow copy of the shapes hierarchy
    2. worker thread makes modifications in the copy
    3. worker thread writes back the copy atomically

Problems:

  1. The final write should be atomic. That is, for the GUI thread the shapes can be observed either in the initial state or in the final state. No intermediate states should be observable (including during the write-back itself). 1.1) theoretically, it could be implemented by adding a lock to a shared d-pointer, but it doesn't guard from hierarchy action. 1.2) we could declare that locking the topmost shape would block hierarchy, but to get this topmost shape we should iterate through this hierarchy, which is unsafe
  2. After the swap operation GUI thread might still have pointers to the old shapes that we replaced. How to fix that?
  3. After swap operation, the shapes should emit some signals, so that layer's shape manager and other receivers get their notifications. How to postpone/duplicate these signals?

Option 2: Reads from GUI thread always get full copies

  • GUI thread works only via KoSelection. Every call from KoSelection returns a shallow copy of all the selected all the shapes of the hierarchy. We cannot clone only selected shapes, because hierarchy actions will not be available.
  • Worker thread works directly on the shapes of the layer without any copies (or using RCU)

Problems:

  1. We should still somehow make clone operations atomic. Though we can achieve that quite easily by syncing to strokes framework.
  2. Right now shallow copy takes about 1ms for decently complex document. But take into account that KoPathShape has no copy-on-write atm.
  3. The methods of KoSelection will have to return some entity that keeps lifetime of all the contained shapes. Quite a lot of code will have to be refactored?