Problem of resource deduplication and aliasing in Krita
Usecase 1: Aliasing of imported resources
- A resource creator releases bundle_v1.bundle, which includes the brush preset brush.kpp, which in turn links to the gradient shine.ggr.
- The user imports the bundle, but their folder storage already contains their own shine.ggr gradient with different content. Right now Krita will fail on that.
Usecase 2: Aliasing of versioned resources
- Now the resource creator releases bundle_v2.bundle, which includes second versions of both resources: a new brush.kpp brush that links to a new version of the shine.ggr gradient.
- The user imports the second bundle (alongside the first one). The imported versions should override the older versions of the resources. This should more or less work with the current design (see "Related problem of renaming resources on import" below).
- If some brushes in the user's folder storage linked to the older version of shine.ggr from the previous version of the bundle, they should still link to the older version. The current implementation handles that to some degree, because the version of the resource is encoded into its name.
Problem definition
- Some resources can link to other resources. E.g. a brush preset can link to a brush tip, a pattern and a gradient.
- We have no standard approach for identifying the resources we link to. In most places we link by 'filename', but in others we also use 'md5' and 'name'.
- When a resource has an embedded resource, we can load that into an internal storage, but it will still have to be linked by filename, md5 or name.
- The same resource can have multiple versions, which can be linked to by other resources at the same time.
So, the main problem is that we have no unique identifier that would globally identify a specific resource (and a version of a specific resource). We cannot uniquely identify the resource even inside a single installation of Krita using 'resourceId', because multiple versions of the same resource will have the same 'resourceId'.
As a side note, we cannot easily use md5 as a unique identifier of the resource, because every saveToDevice() operation on the resource will change its md5 (due to differences in libpng versions and settings).
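The md5 instability can be illustrated outside of Krita. The sketch below is not Krita code and does not use libpng. It assumes only that a PNG-like save step compresses its payload with zlib, and it uses Python's `zlib` module with two different compression levels as a stand-in for "different libpng version and/or settings". The `pixels` payload is made up for the example.

```python
import hashlib
import zlib

# Stand-in for the decoded payload of a resource (made-up data).
pixels = bytes(range(256)) * 64


def md5_hex(data: bytes) -> str:
    """Return the md5 checksum of a byte string as hex."""
    return hashlib.md5(data).hexdigest()


# Two "saves" of the same payload with different compressor settings.
saved_fast = zlib.compress(pixels, level=1)
saved_best = zlib.compress(pixels, level=9)

# The decoded content is identical, but the saved bytes (and their md5) differ.
same_content = zlib.decompress(saved_fast) == zlib.decompress(saved_best)
same_md5 = md5_hex(saved_fast) == md5_hex(saved_best)

print("decoded content identical:", same_content)
print("md5 of saved bytes identical:", same_md5)
```

The two outputs decode to the same data, yet their checksums disagree, which is why a checksum of the saved file only identifies one particular serialization of a resource.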
Possible solutions
- Consider the filename as the unique identifier of the resource and its versions.
  - [pro] Most of our code already uses the filename as the key to link to a resource, so it should work out of the box.
  - [con] We need to encode the version of the resource into the filename. That is already implemented, but it might not be very pleasant-looking from the user's point of view (or is it?).
  - [con] Two versions of the resource created on different machines will still alias, because they would have the same version but different content. This case will have to be resolved somehow.
- Implement "scoped resource URL resolving". That is, when the user has two resources called "papergrain.pat" in two bundles, the filename will be resolved into the resource that is placed in the same bundle ("scope") as the source resource, i.e. the resource that links to it.
  - [pro] We have a bit more freedom in how the resources can be named. They are allowed to alias, unless they are stored in the same bundle.
  - [con] We have to significantly rewrite the resource fetching sites.
  - [con] Leaf resources will have to know what bundle/scope they belong to.
  - [con] There is still a usecase where this scheme breaks. If the user wants to create a brush preset in the folder storage that links to an (aliased) pattern from a bundle, they will not be able to do that. The version from the folder storage will always be preferred.
- Implement some form of 'uuid' for the resources. It can be stored either in the resource itself or in the bundle/storage metadata, as is currently done with tags. ASL and ABR resources already have such UUIDs.
  - [pro] That solves the usecase ideally.
  - [con] It is difficult to store the id inside some types of resources, like .pat and .gbr. For them the ID will have to be stored separately, in the bundle or storage metadata, alongside the tags.
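The second and third options can be compared with a small sketch. It is an illustration under stated assumptions, not Krita's resource code: `Resource`, `Registry`, `resolve_scoped` and `resolve_uuid` are hypothetical names, the uuid strings are placeholders, and the fallback to other storages when the scope has no match is an assumption of the sketch (the text above does not specify it).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resource:
    filename: str
    storage: str  # bundle or folder storage ("scope") that holds the resource
    uuid: str     # hypothetical globally unique id
    content: str


@dataclass
class Registry:
    resources: List[Resource] = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        self.resources.append(resource)

    def resolve_scoped(self, filename: str, scope: str) -> Optional[Resource]:
        """Prefer the resource stored in the same scope as the linking resource.

        Falling back to the first match in any other storage is an
        assumption of this sketch.
        """
        matches = [r for r in self.resources if r.filename == filename]
        for candidate in matches:
            if candidate.storage == scope:
                return candidate
        return matches[0] if matches else None

    def resolve_uuid(self, uuid: str) -> Optional[Resource]:
        """Look the resource up by its unique id, ignoring name and scope."""
        for candidate in self.resources:
            if candidate.uuid == uuid:
                return candidate
        return None


registry = Registry()
registry.add(Resource("papergrain.pat", "bundle_a.bundle", "uuid-a", "grain A"))
registry.add(Resource("papergrain.pat", "bundle_b.bundle", "uuid-b", "grain B"))
registry.add(Resource("papergrain.pat", "folder", "uuid-f", "user's grain"))

# A brush stored in bundle_a or bundle_b gets the pattern from its own bundle.
in_a = registry.resolve_scoped("papergrain.pat", scope="bundle_a.bundle")
in_b = registry.resolve_scoped("papergrain.pat", scope="bundle_b.bundle")

# A brush stored in the folder storage can never reach the aliased bundle
# pattern by filename: the folder storage's own copy always wins.
from_folder = registry.resolve_scoped("papergrain.pat", scope="folder")

# With a uuid the same brush can point at the bundle pattern directly.
by_uuid = registry.resolve_uuid("uuid-b")

print(in_a.content, "|", in_b.content, "|", from_folder.content, "|", by_uuid.content)
```

The scoped lookup needs the caller to pass the scope of the linking resource, which is the reason the fetching sites would have to be rewritten. The uuid lookup needs no scope, but the id has to be stored somewhere for every resource.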
Related problem of renaming resources on import
Right now the implementation uses the first approach, that is, filenames for the versioned resources encode the version of the resource.
There is a small problem right now, though. When a resource is imported via addResource, it gets renamed to have a '.0000.' version in its filename, which makes it undiscoverable for the resources that link to it.
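A minimal sketch of why the rename breaks linking. It assumes that the version is inserted between the base name and the suffix (so shine.ggr becomes shine.0000.ggr) and that links are resolved by exact filename. The `add_version` helper and the dictionary storage are hypothetical and only model the behaviour described above.

```python
from typing import Dict


def add_version(filename: str, version: int) -> str:
    """Insert a zero-padded version before the suffix (assumed naming scheme).

    Assumes the filename has a suffix, e.g. "shine.ggr".
    """
    base, _, suffix = filename.rpartition(".")
    return f"{base}.{version:04d}.{suffix}"


# Hypothetical storage that maps filenames to file contents.
storage: Dict[str, bytes] = {}

# Importing through the "add" path renames the file on save.
imported_name = add_version("shine.ggr", 0)
storage[imported_name] = b"gradient data"

# brush.kpp still refers to the gradient by its original filename.
link = "shine.ggr"
found_by_exact_name = link in storage

print("stored as:", imported_name)
print("link resolves:", found_by_exact_name)
```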
I believe the problem is that we "abuse" addResource to do things it was not supposed to do. When we call addResource, it saves a new resource from memory into the persistent storage. When performing this save operation, it not only changes/generates the name, but also changes the md5 sum(!) of the resource. We cannot avoid that md5 change, because it will always happen (due to a different libpng version and/or settings).
To solve the problem we need to add a separate method, importResource, that would copy the resource into the persistent storage byte for byte (using a QIODevice or a filesystem URL, it does not matter which). This method would just forward the QIODevice or URL to the storage and let it copy the data. Then both the MD5 and the filename will stay the same.
To avoid confusion in the future, we could also rename addResource to something like addNewResource or createResource.
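The intended behaviour of importResource can be sketched as follows. This is an illustration, not the Krita API: the `Storage` class and its `import_resource` method are hypothetical Python stand-ins, and an in-memory `io.BytesIO` stream plays the role of the QIODevice.

```python
import hashlib
import io
import shutil
from typing import BinaryIO, Dict


def md5_hex(data: bytes) -> str:
    """Return the md5 checksum of a byte string as hex."""
    return hashlib.md5(data).hexdigest()


class Storage:
    """Hypothetical persistent storage, modelled as a filename -> bytes map."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def import_resource(self, filename: str, device: BinaryIO) -> None:
        """Copy the stream unchanged and keep the given filename.

        Nothing is decoded or re-serialized, so the stored bytes are
        exactly the bytes read from the device.
        """
        target = io.BytesIO()
        shutil.copyfileobj(device, target)
        self.files[filename] = target.getvalue()


# Made-up file contents standing in for a resource shipped in a bundle.
original = b"bytes of shine.ggr exactly as shipped in the bundle"

storage = Storage()
storage.import_resource("shine.ggr", io.BytesIO(original))

stored = storage.files["shine.ggr"]
print("filename kept:", "shine.ggr" in storage.files)
print("md5 kept:", md5_hex(stored) == md5_hex(original))
```

Because the copy never goes through a save operation, resources that link to shine.ggr by filename or by md5 would still find it after the import.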