Lock fixing in Krita's tile manager.
ClosedPublic

Authored by lieroz on Jul 14 2018, 5:26 PM.

Details

Reviewers
dkazakov
rempt
Group Reviewers
Krita
Summary

There were 3 main hotspots:

  1. m_mutex in KisTiledExtentManager, was fixed by changing manager container access logic
  2. m_listLock in KisTileDataStore, was fixed by changing blocking list for lock free map
  3. mutex guard in boost pool, was fixed by introducing simple cache

Diff Detail

Repository
R37 Krita
Lint
Lint Skipped
Unit
Unit Tests Skipped
lieroz created this revision.Jul 14 2018, 5:26 PM
Restricted Application added a reviewer: Krita. · View Herald TranscriptJul 14 2018, 5:26 PM
lieroz requested review of this revision.Jul 14 2018, 5:26 PM
lieroz updated this revision to Diff 37859.Jul 16 2018, 10:08 AM
dkazakov requested changes to this revision.Jul 16 2018, 10:38 AM

Hi, @lieroz!

I have tested your patch: painting benchmark passes here, painting seems to work as well.

Codewise the patch looks okay, except of the few easily fixable problems (see inlined comments). Please fix them and we will be able to start public testing of it.

There is also a small theoretical/design problem of the current implementation of KisTiledExtentManager, though I'm not sure we should fix it or not. Ideally, we should, but it might affect performance, so it needs testing. The problem is the following:

  1. Imagine the current extent is (0,0 64x64)
  2. Some thread calls notifyTileAdded(64,64 64x64), so that the final extent should become (0,0 128x128)
  3. In the meantime, some other thread is constantly calling to extent(), and it might get the following values:

extent() -> (0,0 64x64) initial state, correct
extent() -> (0,0 128x64)
intermediate state, incorrect
extent() -> (0,0 64x128) intermediate state, incorrect
extent() -> (0,0 128x128)
the final value, correct

In the ideal world, such behavior when we reveal internal intermediate state into the outer world are considered as incorrect behavior. In our case, such behavior is just weird, but it shouldn't case too much trouble. So it can be "fixed" by just adding a comment explaining it.

Conclusion: either fix this weird behavior or just add a comment about it.

libs/image/tiles3/KisTiledExtentManager.cpp
72

[RACE CONDITION] This line is not synchronized with line 57. Between lines 57 and 72 Data::remove() might have already been called and m_buffer[currentIndex] might have already become null again. The two calls must be linearized into two distinct points, right now they are not.

A possible solution, double locking over m_extentGuard: first read, then write. But that it just one of the options.

108

[RACE CONDITION] The same race condition as in add().

152

I think m_opacity and m_offset should be adjusted separately. Why growing into both directions, when the user might not need it?

229

This method is quite inefficient because it has to take (2 * N + 4) locks, where N is indexes.size(). Please move this method down into Data object to avoid that many locks.

libs/image/tiles3/KisTiledExtentManager.h
61

Will it be correct to call this variable m_migrationLock? If it is correct, it would be a bit easier to understand the code :)

libs/image/tiles3/kis_tile_data_interface.h
256

Forgotten comment, please remove :)

libs/image/tiles3/kis_tile_data_store.cc
140–143

Could you make some kind of unittest for the clock iterator testing the case when the element pointed by the clock iterator gets removed and the next element doesn't exist?

As far as I remember, in this case iterator should just restart from the very beginning of the list and make at least one pass. With the new code, I'm not sure it works correctly or not...

This revision now requires changes to proceed.Jul 16 2018, 10:38 AM
lieroz updated this revision to Diff 37966.Jul 17 2018, 4:01 PM
lieroz marked 7 inline comments as done.Jul 17 2018, 4:05 PM
lieroz added inline comments.
libs/image/tiles3/kis_tile_data_store.cc
140–143

Yeah, here was error, i tested it manually and resolved it.

lieroz marked 2 inline comments as done.Jul 17 2018, 4:05 PM
dkazakov requested changes to this revision.Jul 18 2018, 6:54 AM

Hi, @lieroz!

Concurrency-wise the patch looks correct now. The observable intermediate state problem is still here, but it should not block the patch.

The only thing that still blocks the patch is m_offset = m_capacity / 2 dependency. Please split them. We don't need to store (and iterate every time) -128 cells when image size is 10000 pixels. The negative part in such cases is only about 500-1000 pixels, which is 10-12 cells.

I would also still prefer to have a unittest for the clock iterator problem, but it is only a wish, it doesn't block the patch.

So, please split the offset part of the array, and then we can merge the patch into master.

This revision now requires changes to proceed.Jul 18 2018, 6:54 AM
lieroz updated this revision to Diff 38037.Jul 18 2018, 7:01 PM

Fixed clock iterator test and logic.
Implemented new extent manager migration strategy.

dkazakov added inline comments.Jul 19 2018, 7:13 AM
libs/image/tiles3/KisTiledExtentManager.cpp
11

Please make some default value like 4 or 8 :)

178

Why don't you just multiply the negative part of the range? It should grow like positive part (by multiplication), but separately from it.

libs/image/tiles3/kis_tile_data.cc
38

Is this check really needed? I thought it is done at the higher level in the store, don't we check it twice now?

dkazakov accepted this revision.Jul 19 2018, 7:24 AM

Looks correct now. Minor issues can be fixed later

This revision is now accepted and ready to land.Jul 19 2018, 7:24 AM
lieroz marked 2 inline comments as done.Jul 19 2018, 7:26 AM
lieroz added inline comments.
libs/image/tiles3/KisTiledExtentManager.cpp
178

There are little negative cells, so this way it doesn't occupy much memory.

libs/image/tiles3/kis_tile_data.cc
38

It is needed just for test, cause swapper unregisters need data in map.

lieroz marked 4 inline comments as done.Jul 19 2018, 7:49 AM