Krita dependencies management issues
Open, Needs Triage, Public

Description

Problem definition

We recently adopted builds of FFmpeg, libheif, libwebp and other heavy libraries in Krita, which raised problems we have tried to ignore for ages.

  1. Our deps builds are not "granular": we build them in one run as a single blob, and that is a problem now. The full build takes 54 minutes on a 16-core server (binary-factory). Building these deps locally to set up a development environment is even more problematic (since few people have a 16-physical-core CPU).
  2. The whole deps setup is really fragile. Our CMake scripts set up different environments for the build and install commands (hello, Meson and its PYTHONPATH!). This means that when you try to build a single dep manually, you can very easily break the whole build tree.
  3. The whole setup is really hard to maintain. I spend about 10% of my time just fighting deps build issues.
  4. Ideally, we need multiple prebuilt versions of Qt available from CI: normal, debug and asan. That is impossible to achieve with the current setup because of the absence of granularity: doing 3×2 54-minute builds on CI would simply paralyze KDE's CI system.

Requirements for the deps management system

Here is a draft of the requirements for the management system:

  1. [granular builds] The dependencies should be built in a granular way. That is, when we change something in Qt, we shouldn't rebuild FFmpeg.
  2. [prebuilt deps] The prebuilt packages for the deps should be available for the developers. At least for Linux and Windows.
  3. [granular downloads] Ideally, the deps pack for developers should be downloadable in a granular way as well, but a one-blob solution will also work (since the whole pack of compressed deps is about 200MiB).
  4. [work on deps] The developer should be able to rebuild one dep (e.g. Qt or FFmpeg) easily without manual composition of the configure/build commands. And it shouldn't break the dev-environment! ;)
  5. [reuse build tree] Ideally, if developers build all the deps locally, they should be able to reuse the build trees for their local changes in a dependency. Obviously, we need to fix bugs in Qt quite often.
  6. [multiple builds of qt] We need a way to quickly switch the version of Qt between normal, debug and asan.
  7. [ci-toolkit for building packages] If we change one dependency (say, Qt), CI should automatically rebuild the dependent packages. Non-relevant packages should not be rebuilt.
  8. [separate deps sets] Allow having different sets of deps for @master and @stable.

Available options and evaluation

We have multiple options to solve these issues:

CMake-based scripts

Our current CMake system is "almost perfect": it fits our workflow quite well, except that it is not granular and is quite hard to maintain.

Craft

KDE has a home-grown package management system called Craft. I tried to use it but found that there is extremely little documentation about it. And the only manual on installing it required me to run PowerShell (which I'm not very familiar with) with Administrator privileges. I gave up at this step.

vcpkg package management system

I briefly looked into it. It seems like:

  • it is quite MSVC-centric
  • (it seems like) there is no readily available solution for storing the binary packages; we would need to invent/host our own.

Conan package management system

I have managed to give Conan a small test, and here is what I found:

  1. It is quite easy to build packages in a granular way with Conan (obviously)
  2. We may have multiple binary-compatible versions of the same package; we just declare custom binary-compatibility rules in package_id()
  3. The main idea of Conan is:
    • every package is first installed into its own root folder
    • then Conan generates a custom CMake/Meson configuration file which adds paths to these packages. There are three (relevant) generators available:
      • "CMakeDeps": generates an XXX-config.cmake file for each package, which can be consumed by CMake to find the package. Note that this method basically overrides the packages' original XXX-config.cmake files, which might be a problem in some cases.
      • "CMakeToolchain": generates conan_toolchain.cmake and CMakePresets.json files. These files just adjust CMAKE_PREFIX_PATH (and other variables), so that CMake could find original XXX-config.cmake files of the packages themselves.
      • "MesonToolchain": generates an equivalent toolchain file for the Meson build system
    • [note] these generators are usually enough to build and link the application, but they are usually not enough to run the application. The reasons are:
      • Qt has a plugin system. These plugins are usually looked up at paths relative to QtCore.dll or krita.exe (TODO: I'm not sure which locations are actually searched)
      • KDE has plugins, which should also be discoverable
      • Qt and KDE have translations, which should be discoverable
      • Krita installs its own plugins in its own location
      • PyQt and Krita install .sip files into a common location
      • libmlt has a plugin system. The plugins are searched relative to libmlt.dll. Currently, Krita's plugin is built alongside the MLT itself, but we might want to build it separately one day
    • To solve the plugin-discovery problem, we can merge all the deps into a single tree using the "deploy" functionality of Conan. Basically, Conan can merge all the packages into one tree. See deps-deploy/flat_deploy.py in the linked repo.
      • the problem with the flat_deploy.py approach is that Conan stops managing this merged folder. That is, if you change any dep (through Conan), you have to regenerate the whole merged folder, which might be a problem in some cases (see below).
  4. [PROBLEM 1] The first problem I noticed is that Conan moves/relocates package build trees when the build is done. This basically breaks the [reuse build tree] requirement. Theoretically, we could move the folders back into place or hack Conan (it's plain Python) to keep the build trees intact, but I'm not sure that would work.
  5. To fulfil [work on deps] requirement, Conan has the concept of "editable packages". What it does is:
    • it builds the package in a custom user location without installing or deploying it (only make all, without make install)
    • then it generates XXX-config.cmake or conan_toolchain.cmake file that links the consumers to the build directory (not install/deploy directory) of the package
    • [PROBLEM 2] Linking to the build folder is a really clumsy approach. Firstly, the plugins will not work. Secondly, there might be weird issues with Qt generating headers at the install stage (though I guess Qt actually generates headers at the configure stage, using that dreadful Perl script).
    • [PROBLEM 3] If we decide to use the "deploy" strategy, an "editable" package will not fit easily into the workflow, because we would have to regenerate the whole deploy folder after every change to the editable package (TODO: I'm actually not sure if the deploy step is applicable to editable packages).
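The flat-merge idea behind Conan's "deploy" functionality can be sketched without Conan at all. This is a hypothetical, stdlib-only illustration of what a script like deps-deploy/flat_deploy.py does through Conan's deploy hooks (the function name and layout are made up for illustration):

```python
import shutil
from pathlib import Path

def flat_deploy(package_roots, deploy_dir):
    """Merge several per-package install prefixes into one flat tree.

    This mimics what a Conan deployer does: each package lives in its
    own root folder, and the merged tree is what the application runs
    against (so Qt/KDE plugins become discoverable at fixed relative
    paths).
    """
    deploy = Path(deploy_dir)
    for root in map(Path, package_roots):
        for src in root.rglob("*"):
            if src.is_file():
                dst = deploy / src.relative_to(root)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)  # later packages overwrite earlier ones
    return deploy
```

The caveat from above applies here too: once merged, the tree is no longer managed per-package, so any dep change means regenerating the whole folder.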

That is, Conan largely solves the problem of building and deploying the deps in a granular way, but it has some issues with hacking on those deps. I have a feeling these issues can be overcome somehow, though I'm not sure how at the moment.

Conclusion

It seems like Conan fulfils most of the requirements, except for the "work on deps" ones, which is quite bad. But I have a feeling these problems can be resolved somehow; it just needs more investigation work...

Repository with my experiments: https://invent.kde.org/dkazakov/krita-conan-deps
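As an aside, the binary-compatibility idea behind package_id() (point 2 above) can be illustrated without Conan's API. The sketch below is not Conan code, just a stdlib demonstration of the principle: erasing a setting from the id computation declares that binaries differing only in that setting are interchangeable.

```python
import hashlib

def package_id(settings, erase=("build_type",)):
    """Compute a binary package id from build settings.

    Erasing a setting declares that binaries built with different
    values of it are compatible -- e.g. one binary package can serve
    both Debug and Release consumers. Conan's real package_id() method
    works on the same principle, only on its own info objects.
    """
    relevant = {k: v for k, v in sorted(settings.items()) if k not in erase}
    blob = ";".join(f"{k}={v}" for k, v in relevant.items())
    return hashlib.sha1(blob.encode()).hexdigest()

debug = package_id({"os": "Windows", "compiler": "gcc", "build_type": "Debug"})
release = package_id({"os": "Windows", "compiler": "gcc", "build_type": "Release"})
assert debug == release  # build_type erased: one binary serves both
```

For the normal/debug/asan Qt case, the rule would go the other way: keep (or add) the settings that distinguish the three builds, so each gets its own package id.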

| Requirements | CMake External Project | Conan | Something else? |
|---|---|---|---|
| [granular builds] | no | yes | ? |
| [prebuilt deps] | yes | yes | ? |
| [granular downloads] | no | yes | ? |
| [work on deps] | yes | complicated, can be improved? | ? |
| [reuse build tree] | yes | needs hacking conan? | ? |
| [multiple builds of qt] | no | yes | ? |
| [ci-toolkit for building packages] | TODO | TODO (--build cascade or more high-level) | ? |
dkazakov updated the task description. Apr 17 2023, 10:17 AM
dkazakov updated the task description. Apr 17 2023, 10:28 AM
dkazakov updated the task description. Apr 17 2023, 10:40 AM

Hi @dkazakov!

I see you've done quite extensive digging on the subject! That being said, there are a few nitpicks and elephants in the room after a cursory reading.

Craft
I tried to use it but found that there is extremely little documentation about it. And the only manual on installing it required me to run PowerShell (which I'm not very familiar with) with Administrator privileges. I gave up at this step.

Excellent catch regarding the documentation; that's why I wrote it off immediately when Ben suggested it. (Also, needing Administrator access is never a good idea.)

vcpkg package management system

  • it is quite MSVC-centric

I'd agree that it's Windows-centric, though support for MinGW is first class. However, in the MinGW case their ports have a tendency to prefer accessing the MSYS2 shell environment rather than relying on the CMake port. This was one of the reasons why I had suggested using MSYS installations and tarballing them at a point in time for stability, rather than using standalone MinGW toolchains.

(it seems like) there is no readily available solution for storing the binary packages, we need to invent/host our own.

Your assessment is correct; like Meson's wraps and Krita's 3rdparty, vcpkg is intended for building dependencies from source.

Conan package management system

[PROBLEM 2] Linking to the build folder is really a clumsy approach. Firstly, the plugins will not work. Secondly, there might be some weird issues with Qt generating headers on the install stage (though I guess Qt actually generates headers on the configure stage using that dreadful perl script).

If you recall my experience getting an update for OpenSSL, the Conan formulae are, from a surveying point of view, black boxes. Yes, they are in fact Python scripts, but there is little to no guidance on how to actually set them up. (Looking up how OpenSSL is built for Conan is an exercise for a whole day.)

This kind of scripting is a bonus, since we can tune the recipes however we need; but they need documentation, and we would also need to agree on a Python version baseline.

Other alternatives

ASWF's Docker system

The ASWF dependency management system for the VFX platform images relies on compositing Docker images. It moves the 3rdparty complexity into a set of Dockerfiles, which may be biting off more than we can chew.

However, a prospective user could just unpack the resulting image and get the artifacts out. This is actually what Homebrew, the macOS package manager, does after they moved to GitHub Actions.

Pacman

The other alternative that comes to my mind is setting up a proper Arch/MSYS-like build toolchain. This would be a 1:1 mapping to how we manage our packages, and would also enforce rebuilding only the modified ones. However, I have no idea how Arch's and MSYS2's CIs handle this step.

dkazakov added a comment. Edited Apr 17 2023, 1:56 PM

Hi, @lsegovia!

The ASWF dependency management system for the VFX platform images relies on compositing Docker images

Yes, such an approach could work, and it is actually what we use for the Docker development environment. The problem is that it doesn't solve the granularity issue: all the deps for our Docker image are a single binary blob.

Pacman

Yes, that could be a solution, though pacman explicitly depends on bash, which would require us to depend on the MSYS2 environment. We have mixed opinions on whether we should use MSYS2 or not :)

If you recall my experience getting an update for OpenSSL, the Conan formulae are, from a surveying point of view, black boxes

Well, Conan has a special interface for mapping the contents of the build folder into variables, as if it were an install folder (see the self.cpp object in the layout() method: https://docs.conan.io/2/reference/conanfile/methods/layout.html). The problem is that we would have to construct this mapping manually for each (non-trivial) package we want to use in "editable" mode. It also doesn't solve the runtime setup problem.
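For reference, such a mapping looks roughly like this in a recipe (a sketch based on the linked layout() documentation; the folder names and library name are assumptions, not taken from an actual Krita recipe):

```python
# Fragment of a hypothetical conanfile.py; `self` is the Conan recipe object.
def layout(self):
    self.folders.source = "src"
    self.folders.build = "build"
    # Normal (installed) layout, used by regular consumers:
    self.cpp.package.libs = ["mylib"]
    self.cpp.package.includedirs = ["include"]
    # How the build/source trees look to consumers in "editable" mode:
    self.cpp.build.libdirs = ["."]             # libs end up in the build root
    self.cpp.source.includedirs = ["include"]  # headers live in the source tree
```

For something like Qt, every relevant lib/include/plugin directory would need such an entry, which is exactly the manual work mentioned above.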

I haven't built the deps in quite a while, but I'll try to respond from what I remember; hopefully it isn't too outdated:

Our deps builds are not "granular": we build them in one run as a single blob, and that is a problem now. The full build takes 54 minutes on a 16-core server (binary-factory). Building these deps locally to set up a development environment is even more problematic (since few people have a 16-physical-core CPU).

This may be fixable within the CMake External Project setup. Instead of using a shared install prefix for all deps, we can use a separate prefix for each dep (or group of deps) so they get installed separately, allowing each to be archived on its own. To let the deps find each other, we can pass multiple prefixes (semicolon-separated) in CMAKE_PREFIX_PATH. (Not sure how well this works with Meson.) Qt may be a bit of a problem because of how it finds dependencies -- we may have to specify each prefix manually (e.g. ZLIB_PREFIX=xxx) or actually merge all the deps it depends on into a single prefix. We would also need to have the bin/ dirs of all prefixes on the PATH so the build-time tools can run.

Then, in our 3rdparty tree, perhaps we can provide options to override individual ExternalProjects with prebuilt packages.
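The per-dep-prefix idea can be sketched as a small helper that composes the settings a dep's configure step would need. This is a hypothetical illustration; the prefix root and dep names are made up:

```python
import os
from pathlib import Path

def compose_env(prefix_root, deps):
    """Build CMake/PATH settings from one install prefix per dependency.

    prefix_root/zlib, prefix_root/qt, ... each hold a separate install
    tree. CMake accepts a semicolon-separated CMAKE_PREFIX_PATH, and
    each prefix's bin/ dir must be on PATH so build-time tools can run.
    """
    prefixes = [str(Path(prefix_root) / d) for d in deps]
    return {
        # passed as: cmake -DCMAKE_PREFIX_PATH="..."
        "CMAKE_PREFIX_PATH": ";".join(prefixes),
        # prepended to the PATH environment variable before building
        "PATH_PREPEND": os.pathsep.join(str(Path(p) / "bin") for p in prefixes),
    }

env = compose_env("/opt/krita-deps", ["zlib", "qt", "ffmpeg"])
```

The same composition could of course live directly in the 3rdparty CMake scripts; the point is only that the prefixes stay separate until configure time.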

[granular builds] The dependencies should be built in a granular way. That is, when we change something in Qt, we shouldn't rebuild FFmpeg.

This actually should work with the CMake External Project setup. As long as the ExternalProject definitions don't DEPEND on one another, just (re)building ext_qt should leave ext_ffmpeg alone. What makes it not work now?

[multiple builds of qt] We need a way to quickly switch the version of Qt between normal, debug and asan.

I recall that Qt intentionally adds a d suffix to all DLLs built in the debug configuration, precisely to prevent swapping debug and release binaries. Is that only for MSVC? (On MSVC you cannot mix the debug and release CRT/C++ runtime because they have incompatible ABIs.)

Just my two cents: working on macOS means using the entire 3rdparty project, and we have been dealing with some of these issues. My approach might be a bit naive, since most of my experience is on macOS.

For easy building we have been working with a script that calls the CMake 3rdparty project. This approach lets us isolate and replicate the same environment for each build and rebuild, and allowed us to build the entire tree or build/rebuild a single package. However, installing the deps became monolithic when preparing universal fat binaries: my first approach was to build the whole deps tree once per arch and then create the universal-binary versions. That was easier, but it meant that any dep change required rebuilding the entire tree twice (around 2h 45min on the M1). This was not a big issue when the deps were more or less stable, but currently that is not the case. I'm finishing a patch so that each dependency is compiled as a fat binary on its own, so that cmake --build . ext_<pkg> can work as before.

[granular builds] The dependencies should be built in a granular way. That is, when we change something in Qt, we shouldn't rebuild FFmpeg.
[prebuilt deps] The prebuilt packages for the deps should be available for the developers. At least for Linux and Windows.
[granular downloads] Ideally, the deps pack for developers should be downloaded in a granular way as well, but one-blob solution will also work (since the whole pack of compressed deps is about 200MiB).

For the case of granularity: why not package the deps as distros do, using DESTDIR? I encountered an issue when trying to compile some projects (Meson, Qt) for multiple archs, and the only way I could achieve it was to compile them twice, install them to ${DESTDIR}-<arch>, merge the files into a fat-bin ${DESTDIR}-<uni>, and finally tar that up and install by untarring. A nice by-product is that I ended up with a prebuilt tar of each dependency that used this approach. We could incorporate a similar approach for CI builds: install to DESTDIR, tar the result, and save it somewhere on the server. We would then need to add a custom ExternalProject or maybe (better?) FetchContent to our CMake scripts to download and untar the file into the prefix if available, or build it if not. The idea is that CI would not rebuild an already-built package and would have each package available individually. Even if the only thing we changed was adding DESTDIR -> tar -> install and saving the tar somewhere on the server, we could already fetch each compiled dep separately after a full run and avoid further compilation.
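The DESTDIR -> tar -> install flow above can be sketched with the stdlib alone. This is a hypothetical illustration of the per-dep packaging step, not Krita's actual tooling:

```python
import tarfile
from pathlib import Path

def pack_dep(destdir, tar_path):
    """Pack a DESTDIR staging tree into a relocatable tarball.

    `make install DESTDIR=...` stages the dep's files under destdir;
    we archive them relative to that root so the tarball can later be
    extracted straight into any prefix (or uploaded to CI storage).
    """
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(destdir, arcname=".")
    return tar_path

def install_dep(tar_path, prefix):
    """Install a previously packed dep by extracting it into the prefix."""
    Path(prefix).mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(prefix)
```

CI would run pack_dep after each dep's install step and publish the tarball; a developer (or a later CI job) would run install_dep instead of rebuilding.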

[work on deps] The developer should be able to rebuild one dep (e.g. Qt or FFmpeg) easily without manual composition of the configure/build commands. And it shouldn't break the dev-environment! ;)
[reuse build tree] Ideally, if the developer builds all the deps locally, he/she should be able to reuse the build trees for his local changes in a dependency. Obviously, we need to fix bugs in Qt quite often.

As for building our own versions of the deps manually, we could exploit the fact that ExternalProject_Add's URL and GIT_REPOSITORY arguments do support local paths to folders. While not the most elegant solution, we could expose the location as a variable (for each ext_) and switch that package to use a local repo instead of the downloaded file. This assumes we are using a script to set up the environment, to avoid messing up the env. The CMake configure step could then need just one extra variable (e.g. -DEXT_QT_LOCATION to replace GIT_REPOSITORY in ext_qt, or -DEXT_FREETYPE_LOCATION for the URL field in ext_freetype) to build the project from a local repo. This approach is not unrealistic, since we don't need to prepare it for every ext_; it can be added as needed, since we patch some deps more than others.

The previous approaches do not solve the issue where deps installed with the distro package manager collide with the 3rdparty project, and aside from removing all DEPENDS from each ext_ and using another .cmake file to declare target dependencies that can optionally be turned off, I have no other idea.

[multiple builds of qt] We need a way to quickly switch the version of Qt between normal, debug and asan.

We would still need to build Qt three times, but if each build is installed to a different DESTDIR, it can be tarred and locally installed by untarring. (A fat-bin RelWithDebInfo Qt tarball is 134M.)

Hi, @alvinhochun and @vanyossi!

From @alvinhochun:

This may be fixable for the CMake External Project setup. Instead of using a shared install prefix for all deps, we can use separate prefixes for each dep (or groups of dep) so they get installed separately

From @vanyossi:

For the case of granularity: Why not package them as distros do using DESTDIR?

The problem is that you are describing exactly what Conan and other package managers do! :) They use DESTDIR to install the packages into a custom prefix and then pack them. On top of that, package managers also track dependencies between the packages, so they can rebuild only a portion of the packages when a single package changes. Yes, we can do all that manually, but I'm not yet sure it is a good idea. We could end up reinventing the wheel :)

this was easier but meant that any dep change needs to rebuild the entire tree twice (around 2hr 45min on the _M1)

That is exactly why I wanted to look into this problem. Lately I spend too much work time just rebuilding the deps. This work can certainly be automated and delegated to computers.

dkazakov updated the task description. Apr 18 2023, 7:55 AM

UPDATE:

After yesterday's discussion on IRC I have added one more requirement to the system:

  1. [ci-toolkit for building packages] If we change one dependency (say, Qt), CI should automatically rebuild the dependent packages. Non-relevant packages should not be rebuilt.

It differs from the first requirement in that "there should be an existing binding for GitLab that allows rebuilding packages in such a way". As far as I can tell, Conan has a partial solution to this problem with conan install --build ext_qt --build cascade. But raw CMake doesn't have anything for that (at least I couldn't find anything related).

dkazakov updated the task description. Apr 18 2023, 8:01 AM
dkazakov added a comment. Edited May 10 2023, 7:21 AM

Notes on the KDE's CI system

  1. KDE CI does NOT use Craft
  2. KDE CI instead uses a set of custom scripts that fetch binary dependencies:
  3. To make a dependency buildable and packageable, one should define a .kde-ci.yml file for it (in its folder) and call the seed-package-registry.py script from .gitlab-ci.yml (https://invent.kde.org/sysadmin/ci-utilities/-/blob/master/seed-package-registry.py)
  4. The docker container used for building has a set of cache folders mounted automatically; you can see the list in this script: https://invent.kde.org/sysadmin/ci-utilities/raw/master/gitlab-templates/linux.yml
  5. It is technically possible to call all these scripts locally to bootstrap a local environment

Summary

As far as I can tell, this approach combined with the CMake approach we currently use can satisfy all the requirements except "multiple builds of qt". Though I have a feeling it should be possible to implement that too.

dkazakov updated the task description. May 10 2023, 7:59 AM