Replacement of API Documentation System
Open, Needs Triage, Public

Description

The current set of tooling we have to support api.kde.org is now very old, and as such experiences a number of issues.

These mean that it is unlikely we will be able to replicate the setup exactly on a new server should that be necessary, and that from time to time a project's documentation stops being generated correctly, a situation which is difficult (and in some instances impossible) to debug and then fix.

Currently it is based off a combination of several different systems, all inter-related with each other. These are:

  1. The original KDE 4.x era scripts, which in turn were inherited from KDE 3.x and 2.x.
  2. Some specialised KDE 4.x era tooling for QML based projects (known as Doxyqml)
  3. Support for projects using mkdocs (Sink and Kube)
  4. Support for projects using Docbook (RKWard)
  5. Frameworks era (KF5) tooling, known as KApiDox

The majority of our projects are using either 1) or 2) from above, with the exception of Frameworks and a couple of smaller projects, that use 5).

In addition to the issues noted above, the generation process is currently performed as a nightly cronjob, which has scalability limitations and has the side effect that the process is a single monolithic piece, further complicating debugging of the system (as you have to wait overnight for a new run to see if your fix did the trick and solved the problem)

As such, we need to completely replace the system as it currently stands, something which can be broken down into approximately three different components.

Part 1: The actual API Documentation generator.

This should run over an individual repository, and generate a folder of HTML files and other artifacts which represent the documentation for that project, along with a single metadata file describing that folder (for use by Part 2)

Both the folder that contains the actual documentation and the metadata file would be transferred to a web server to be served statically.

As we've previously been asked for QCH format files for people to download and use locally, providing these would also be done by this Part of the process.
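As a rough illustration of the "folder plus single metadata file" output described for Part 1, the following Python sketch writes a JSON metadata file next to a generated documentation folder. The task does not define a schema, so the JSON format, every field name and the file naming are assumptions made purely for illustration.

```python
import json
from pathlib import Path


def write_metadata(docs_dir, name, description, version, platforms):
    """Write a metadata file next to the generated HTML folder.

    All field names are hypothetical; the task does not define a schema.
    """
    docs_dir = Path(docs_dir)
    metadata = {
        "name": name,
        "description": description,
        "version": version,
        "platforms": sorted(platforms),
        # Folder name (relative to the metadata file) holding the HTML output.
        "path": docs_dir.name,
    }
    # Hypothetical convention: "<folder name>.json" beside the folder.
    metadata_file = docs_dir.parent / f"{docs_dir.name}.json"
    metadata_file.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata_file
```

Keeping the metadata outside the HTML folder means the frontend would only need to read small files and never has to look inside the generated documentation itself.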

Part 2: The Website Frontend.

This will scan those metadata files and use them to provide a list of available projects to choose from (with name, description, supported platforms, etc) and the various versions which are available.

This would need to be dynamic (so written in something like PHP) to avoid having to be regenerated every time a project updates its API documentation.
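To make the scanning step concrete, here is a minimal sketch of how a frontend could collect the uploaded metadata files into a project list. It is written in Python only for illustration (the proposal above suggests something like PHP), and it assumes one hypothetical "*.json" file per documentation folder with "name", "description" and "version" keys.

```python
import json
from collections import defaultdict
from pathlib import Path


def build_index(root):
    """Group uploaded metadata files by project name.

    The file layout and key names are assumptions, not an agreed format.
    """
    projects = defaultdict(lambda: {"description": "", "versions": []})
    for metadata_file in sorted(Path(root).glob("*.json")):
        try:
            data = json.loads(metadata_file.read_text(encoding="utf-8"))
            name = data["name"]
            version = data["version"]
        except (json.JSONDecodeError, KeyError, TypeError):
            # Skip broken uploads rather than failing the whole page.
            continue
        entry = projects[name]
        entry["description"] = data.get("description", "")
        entry["versions"].append(version)
    return dict(projects)
```

Because the index is computed from whatever files are present, uploading a new version of one project would not require regenerating anything else.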

Part 3: Arranges for the API Documentation to actually be generated and then uploaded to the server running the Website Frontend (Part 2)

This would be the glue that ties it all together. It would have a list of projects that had been enabled along with the branches to be covered, and from this would prepare a list of jobs that would then be provisioned and run periodically on the Binary Factory (when changes are detected in that project's repository for instance)

It would spawn Docker containers (which would be fresh, and contain nothing from any other run performed previously) which would then checkout the code and any tooling, perform the documentation generation, and upload the relevant artifacts to the server hosting api.kde.org (the website frontend)
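The "list of enabled projects and branches" can be turned into a job list with very little code. The sketch below is only an assumption about what that could look like: the input mapping, the job naming scheme and the dictionary keys are all hypothetical and do not reflect existing Binary Factory tooling.

```python
def plan_jobs(enabled_projects):
    """Expand {project: [branches]} into one job description per branch.

    The "apidocs_<project>_<branch>" naming scheme is made up for this sketch.
    """
    jobs = []
    for project, branches in sorted(enabled_projects.items()):
        for branch in branches:
            jobs.append({
                # Slashes are replaced so branch names stay usable as job names.
                "name": f"apidocs_{project}_{branch}".replace("/", "-"),
                "project": project,
                "branch": branch,
            })
    return jobs
```

Each resulting entry would correspond to one run in a fresh container: check out the branch, generate the documentation, upload the artifacts.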

Benefits of this Approach

Debugging of the system with the new arrangement would be substantially easier, as developers would be able to pull down the relevant Docker image and run the necessary steps within it to generate the API Documentation.

Because the logs produced by the Binary Factory contain all the relevant commands, this would be reduced to an exercise of reading the logs and copy-pasting commands (alternatively, you could read the Jenkins Pipeline), after which a developer would be able to use a local browser to check the result. This would remove the need for access to the server which generates the documentation in the event of any problems.

Additionally, because we would be transferring responsibility for documentation generation to the Binary Factory, we would be able to get an at-a-glance view and know very quickly should any breakage take place, rather than having to spot this within the current logs (which aren't regularly monitored)

Further, Doxygen is a utility which has in the past broken its template compatibility. By performing the generation process within Docker, we will be able to easily determine the necessary dependencies for the API Generation process and track the versioning, making tracking down these breakages easier in the future (and giving us more control over when we upgrade the Doxygen version within our environment)

Getting Started

Should someone be interested in getting this working, it should be relatively straightforward for them to start on Part 1 without needing any outside guidance.

Part 2 should be guided in part by Sysadmin and the Website team, to ensure it is able to be deployed on our systems at the end, and that it fits within the branding currently being used on KDE.org.

As we already have some tooling for handling project lists (for instance for Craft) it should be relatively straightforward to complete Part 3 with minimal work.

Restricted Application added a subscriber: sysadmin. Nov 10 2019, 1:47 AM
lydia added a subscriber: lydia. Nov 10 2019, 2:06 AM
ognarb added a subscriber: ognarb. Nov 10 2019, 10:29 AM
jucato added a subscriber: jucato. Nov 10 2019, 10:33 AM

I can help with the part 2 ;) I also think it could be useful to reduce the number of supported systems (updating projects to KApiDox)

Hi,

some answers about what already exists and what is difficult in the proposition you made.

I'm writing only about what kapidox does, so anything below api.kde.org/others/ is not concerned.

I think that with the requirements you wrote, KApiDox might not be the right solution, and that even a Doxygen solution might always be problematic.

I know that Gnome folks wrote their own documentation tooling. I don't know if we want to go down that path, as it's more to maintain.

Should we change KApiDox? The requirements? Find a compromise? I don't have the answers, and whoever does the work gets to choose.

Part1:

What KApiDox does is

  1. building the documentation for each repo once to have the crosslinks
  2. rebuilding it again to make use of these crosslinks
  3. organizing the output documentation in a coherent way.

The QCH Files are generated on the highest level (product) and the lowest (repo). They are a little bugged from what I've heard but the first step is already done.

One big problem I hit when working on it is that we have cross dependencies everywhere so it's not easy to build only one repo.
To be clearer, let's give some examples:

  • If you build the documentation of Attica, you need to put that in the right place within the Frameworks product. But which repo would build the common pages for the Frameworks Product ? And the main page?
  • You built the documentation of Attica, and it's nice, but now the documentation of all tier 2 and 3 depending on Attica might have broken links to the Attica documentation.
  • You build the documentation of PIM or Marble and that is nice, but how do they know where to link to for KF5 and other KDE libraries?

It might not be unsolvable, but I didn't manage to find a way to do it.

Part2:

We moved away from PHP pages, because they were a burden to maintain, to static pages based on Jinja. I think going back to PHP would be a mistake. Jinja is great and simple (any other templating system would work too, though). If the idea is only to create the front page again, I think it can be regenerated every time a project's documentation is updated (it's not that complex to generate).

Or maybe I missed your point, which was to build something other than KApiDox, in which case my comment is less relevant.

Part3:

Read what I wrote above about cross links.

Other points:

I'm not sure about the gain of a docker image versus a simple Python script....

Oh and I forgot. Not only KF5 and small projects' documentation are generated by KApiDox.

I don't see Phonon, Okular, PIM and Marble as small projects.

Just sayin' :)

With regards to supported systems, yes we should eliminate both of the legacy KDE 4.x era tooling items at a bare minimum, if not everything apart from KApiDox.

In terms of the cross dependencies, I would ignore those for now. Whilst not perfect, it does give us "near enough" API Documentation.
This would allow us to build all the various items separately.

At some point in the future we could look to offer the necessary .tag files (or whatever else it is that Doxygen uses for this purpose) for cross dependencies, but for now that isn't a major concern I don't think - and certainly isn't something that should block the replacement of the system.

In terms of generating the front page, this responsibility would now be handled exclusively by Part 2, and would be done in realtime on the frontend server, based on the API Documentation that has been uploaded to it - hence why I suggested use of PHP. KApiDox would not be involved in this at all.

Such a dynamic front-page is one solution but maybe not the right one. Also if you go down this path you'll have also to generate the "product" pages like in pim and kf5.

Frameworks would be listed on the main index page, as it is our principal product and what the vast majority of visitors to api.kde.org will be interested in.

As for PIM, it would be listed in amongst all the other projects that decided to have API Documentation generated for them.

ochurlaud added a comment. Edited Nov 11 2019, 11:06 AM

OK... it seems you have a very precise plan and that what I've been trying to do was wrong or not manageable.

I'm kind of sad of my failure there, but well... Good luck!

Frameworks would be listed on the main index page, as it is our principal product and what the vast majority of visitors to api.kde.org will be interested in.

As for PIM, it would be listed in amongst all the other projects that decided to have API Documentation generated for them.

It's also too bad you were not present at CERN when the community decided how to handle the API docs. 5 years later you are basically saying everything can/should be redone.

@ochurlaud At the time those plans were drawn up, the current sort of system that I am now proposing would not have been possible.
The work you've accomplished certainly hasn't been a failure, and the vast majority of the infrastructure within kapidox should be able to be used with minimal modification.

The only part that won't be re-used will be the index page generation - because that needs to be done dynamically, and kapidox is not a dynamic generator (but rather a static one).

aacid added a subscriber: aacid. Nov 13 2019, 10:48 PM

It feels for me that the major problem with the current infrastructure is that we support 5 documentation tools. Is there a reason why Kube and other apps use another tooling? Do these projects have other needs? Does the maintainer prefer another tooling? Or did nobody have the time to update the documentation tooling for these projects?

That is certainly part of the problem.

I have no idea why Kube and its associated projects opted to use mkdocs, only @cmollekopf can answer that.
Likewise for RKWard using Docbook based systems for their plugin documentation - only @tfry would know why that is done.

These projects though are relatively low cost to support.

The more problematic part is the older KDE 4.x era tooling along with the Doxyqml tooling. Should there be issues with any of these, they are very difficult if not impossible to diagnose. It is also only possible to run them in bulk over a tree of repositories, so testing to see if you've solved the issue often requires waiting overnight for the next run (which takes several hours to complete on a server)

This is also part of the reason why I'd like to break the runs of kapidox (the new generation KF5 tooling) up to help make the results more repeatable, and not dependent on little bits and pieces of setup on the server - so we don't end up in this situation again.

I tried to port a small app to KApiDox and this looks quite easy: D25387

And it looks like KApiDox still uses Doxyqml for generating the documentation for QML files.

The usage of Doxyqml within KApiDox is fine - it's the variant of the KDE 4.x tooling that uses Doxyqml which is the potential concern here.

tfry added a comment. Tue, Nov 19, 6:13 AM

As to why RKWard uses docbook: Our plugin documentation is really more a tutorial style documentation. The actual "reference" part is less than a quarter of the whole doc, and cannot (currently) be trivially generated from headers.

Having automated builds of the plugin documentation is not exactly mission critical to us, although it is sure nice to have.

bshah added a project: KF6. Sat, Nov 23, 5:12 PM

For frameworks IMO it is quite important to have links between the individual documentations right from the start. Due to their split nature, it is common to look at the documentation of one framework and then visit several more frameworks in the process of reading the API documentation.

@bcooksley @ochurlaud actually, I would like to have a deeper look into how to make the generation of the cross-links in the scope of KF5 more reliable. Would it be a reasonable step in your opinion to try the following:

  • create a self-contained Docker image that builds everything from KF5, with the image definition file living in KApiDox
  • solve the cross-linking problem in KF5 (my approach: build tier 1, build tier 2, build tier 3 in a topologically sorted list, which should be computable from the available meta-data)
  • ensure that the full KF5 documentation can be built by running the image
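The "build tier 1, then tier 2, then tier 3" ordering amounts to a topological sort of the dependency graph. A minimal sketch using Python's standard `graphlib` module follows. The shape of the input mapping is an assumption, and the repository names in the usage note are placeholders rather than real KF5 metadata.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def build_order(dependencies):
    """Return repositories in an order where dependencies come first.

    `dependencies` maps a repository to the repositories it links against.
    Raises graphlib.CycleError if the graph contains a cycle.
    """
    sorter = TopologicalSorter()
    for repo in sorted(dependencies):
        sorter.add(repo, *sorted(dependencies[repo]))
    return list(sorter.static_order())
```

For example, `build_order({"tier3-c": ["tier2-b"], "tier2-b": ["tier1-a"], "tier1-a": []})` yields the tier 1 placeholder first and the tier 3 placeholder last. A dependency cycle is reported as an error instead of silently producing broken cross-links.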

For the most part that sounds okay.

Please note though that in order to have this integrated within either the current Binary Factory (Jenkins) or within Gitlab CI, it is a requirement that they be able to control the commands that are run (we'll also need that control in order to transfer the files to the web server that will host them). If you could make it so the process in question is one where you just run a command within the container that would make this easier to make work.

It would be nice if the generation process could be reused for other parts of KDE aside from Frameworks so we don't end up maintaining two separate systems: one for frameworks, and one for everything else (because a project is bound to come along with a collection of repositories).

As a starting point, I would suggest looking at the kdeorg/staticweb Docker image we already use for our Hugo/Jekyll/Sphinx builds.
Should you wish to create a totally separate image, please note that Redhat/Fedora images are not permitted to be used on our systems.

In order to gain some more insights, I created a first proof of concept container (based on the OpenSuse13 CI container) to check where I will find problems in this approach. What I understood so far:

  • It is tricky to figure out all the repositories that have to be checked-out for generating the documentation. For my PoC, I just relied on kdesrc-build (which works fine), but I do not like the close coupling of both tools, as it might break the setup in the future.
  • The bigger issue is that generating the dependency diagrams for the individual modules requires a successful cmake configuration run of the module, whose results are then re-used for generating the Graphviz dot files. Thus, all dependencies of a framework must be installed before running the dependency diagram generator. For frameworks, this means that either the frameworks must be built and installed in the correct dependency order as needed for the builds, or the pre-built frameworks must be installed as build artifacts in the container to allow a proper cmake configuration. IMO both solutions are bad: the first would duplicate the CI system (with all its dependency handling), which duplicates the maintenance effort; the second would tightly couple the documentation container to the CI, actually more tightly than I feel comfortable with (because it makes it pretty hard to analyze problems if you are not a sysadmin).

Wrapping up, I admit that Ben's initial approach of splitting the documentation generation seems more fruitful :) Yet, after what I learned, I would propose it slightly differently:

  1. I think that we should consider KApiDox as a Tier-0 framework (like extra-cmake-modules), which is mostly a cosmetic change, but formally allows building it before any framework.
  2. I would like to extend the CI containers with the dependencies needed for running KApiDox (a few Python3 modules) and introduce documentation generation as an additional step in the build pipeline. This will solve the following problems:
    • sources are already checked out
    • the dependency diagrams can be generated (by definition) because the previous build step completed successfully.
    • implicitly, through the CI dependency system that defines the sequence of framework builds, I expect the frameworks to be built in the correct order such that the dependencies for Doxygen TAG files are also fulfilled
  3. I would like to add an umbrella CI job that just combines the individual API documentation artifacts from the jobs and creates the inter-framework links based on the TAG files.
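For context on how TAG files create the inter-framework links: Doxygen's `TAGFILES` configuration option takes entries of the form `file=location`, where `location` is the URL (or path) at which the other project's HTML is served. A small sketch of generating that line for a framework's dependencies is shown here. The tag file directory, the `.tags` suffix and the URL layout are assumptions for illustration only.

```python
def tagfiles_setting(dependencies, tag_dir, base_url):
    """Build a Doxygen TAGFILES line for the given dependency names.

    Assumes each dependency published "<tag_dir>/<name>.tags" and that its
    HTML is served from "<base_url>/<name>/html" (both hypothetical).
    """
    entries = [
        f"{tag_dir}/{dep}.tags={base_url}/{dep}/html"
        for dep in sorted(dependencies)
    ]
    return ("TAGFILES = " + " ".join(entries)).rstrip()
```

The producing side would set Doxygen's `GENERATE_TAGFILE` option so that each documentation run emits the tag file consumed by later runs.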

@bcooksley @ochurlaud : what are your opinions? I must say that I did not yet look deeply into point 3, so there is a risk that this might not work.

dfaure added a subscriber: dfaure. Tue, Dec 3, 8:39 AM

About your very first point: for information, lxr.kde.org also relies on kdesrc-build to check out all the relevant sources. This has been working rather well for many years, I don't see that as "unwanted coupling of tools".
But OK, the rest of your email makes that point moot anyway :-)

With regards to the data needed for the Dependency Diagrams, this is why the CI system currently exports that information as part of its builds. It's currently only available on the machine that runs api.kde.org, but we could probably make it publicly accessible, which would somewhat decouple the CI system and the API Documentation Generation.

In terms of generating the API Documentation as part of the CI jobs, I've no objection in principle to that, although we'll need a way to switch it on for the jobs which should have API Documentation generated, and a way to version documentation. The only part of this I don't understand is the inter-framework links; can you elaborate more on this?

  1. I would like to extend the CI containers with the dependencies needed for running KApiDox (a few Python3 modules) and introduce documentation generation as an additional step in the build pipeline.
  • If a project fails to compile, does it imply its documentation won't be updated?
  • When someone wants to make a minor change to HTML markup, does s/he have to wait for CI to recompile all KDE projects from source code before the change becomes visible at api.kde.org?
  1. If the compilation is broken, we have bigger issues as projects should never fail to build from source (for long anyway)
  2. That would be required under this approach, yes (a rebuild of all projects that have API Documentation anyway; I'd like to see this limited to those projects that support and request this be enabled, as use of kapidox does require some setup work in the project itself).

Very interesting ideas!

What if...

  • we remove the dependency diagram generation from kapidox. You already have that information somewhere to order the CI jobs and for kdesrc-build. An HTTP service API could be provided from that with two options: get the full dependency description / get a given product's dependency description.

  • we add an HTTP service API to get the tag files, which is updated every time a CI build succeeds.

When we have this, and except in cases like kio, which is a dependency of one of its own dependencies, we build the API docs based on the current status of the KDE builds. Only the first generation needs the right order.

As a summary :
0) export and keep up-to-date the dependency structure to an accessible http service api

  1. build a lib
  2. create the kapidox documentation (for that you can use the http api from (0) to generate the diagram and the http api from (3) for cross links)
  3. export the doxygen tags to an accessible http service api

-> go back to (1) with the next lib

With that, we could generate the libs' apidox separately.

Last issue: the front page + the product pages. It would suffice to have a jinja/php page generated BY KApiDox, with the generation producing a folder with the docs (as currently) and a JSON file to be interpreted by the php/jinja page.

Risk (?) : if a public method or class from a dependency disappears or moves, we can have a dead link until next generation of this dependency. But I would say that it would mean that the build itself would have failed too..

Another idea about the issue raised for "tweaking the CSS/HTML":

First, if kapidox is part of the frameworks, it should behave as such and have a stable interface (how stable?)

Second, if it's part of frameworks, it should not be too heavily linked to the kde branding.

We could keep in kapidox an engine with a strict interface and a very simple EXAMPLE of html template and CSS (based on jinja or any other cool templating language) using the interface.
Without too much thinking, this interface would be the folders/description Jsons and php engine.

We would have an implementation of the template with the right CSS (KDE branding etc.) stored somewhere else, as we now do for kde.org, whose life cycle would be independent of the CI and which would override the example.

Unfortunately we can't reuse the metadata that the CI system and kdesrc-build use, as some Frameworks have two libraries in them - with one often not having the GUI components in it (and thus having a much smaller dependency graph). We also don't record any information about anything external to KDE (like Qt).

Because Qt and Frameworks have a BC/SC promise for the most part, I wouldn't expect methods/classes to just disappear for the most part - if they do it'll be the exception rather than the rule (and the ability to build API Documentation for things beyond Frameworks is just an added bonus - and I wouldn't expect anything to depend on those items in any case, so shouldn't be any links to break)

Unfortunately we can't reuse the metadata that the CI system and kdesrc-build use, as some Frameworks have two libraries in them - with one often not having the GUI components in it (and thus having a much smaller dependency graph). We also don't record any information about anything external to KDE (like Qt).

So this should be worked on as well...

Could you clarify what needs working on here?

If we want to generate the doc of the libs one by one, it means that we need to have the order of generation.

To have the dependency diagram we need the input of dependencies.

This should come from somewhere because kapidox won't have it.

The CI system can provide the Dependency Diagram information easily enough for this - we provide it already for the existing builds (not sure if it is used though)

@ochurlaud: I was under the impression that KApiDox is already a framework, since it is listed along with the Tier 1 frameworks... Anyway, it would be great to have it provide a more general scope than just KDE specific API documentation, but I think for now it is OK to just focus on our needs and maybe re-evaluate during the KF6 branching phase whether one can also make it useful for others.

Regarding build dependencies, we already get all we need implicitly from the way the CI is organized: The CI system installs all needed build dependencies from the manually organized metadata files. When a framework is built, CMake runs successfully, which is checked by the CI. So, at that point all dependencies are there and a run of KApiDox's dependency generator via CMake will successfully generate the dependency diagrams (actually, I much prefer the CMake based approach over re-using meta-files because meta-files tend not to be updated when dependencies are removed). Moreover, due to the ordering of CI builds, TAG files for frameworks in lower tiers will always be generated prior to generating the documentation for frameworks that need these dependencies. So, there should not be any dead links, especially now that all dependency cycles have been removed in frameworks :)

So, in my opinion, the next step should be to get the generation of API documentation into the CI builds.
The step after that then should be to generate the existing Frameworks API website based on those artifacts and replace the current generation.

cordlandwehr moved this task from Backlog to In Progress on the KF6 board.