Appstreamcli test results flaky
Closed, Resolved · Public

Description

While wondering about the alternating results for kdevelop in the appstreamcli test (see https://build.kde.org/view/Kdevelop/job/kdevelop%20master%20kf5-qt5/PLATFORM=Linux,compiler=gcc/), we found that on some runs the appstreamcli test exits early due to a java.io.IOException and then reports only the first few results, which for some reason had still been collected.

Example of a failed run:
https://build.kde.org/view/Kdevelop/job/kdevelop%20master%20kf5-qt5/PLATFORM=Linux,compiler=gcc/349/warnings3Result/NORMAL/

Selecting the last issue and opening its details shows the following, which might hint at the problem:
https://build.kde.org/view/Kdevelop/job/kdevelop%20master%20kf5-qt5/PLATFORM=Linux,compiler=gcc/349/warnings3Result/NORMAL/file.1576683315/source.1895296664987525003/#1

For the record, content copied here:

 The component summary should not end with a "." [Det ?r en funktionsrik,
Copying the source file 'org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50' from the workspace to the build folder '5dfa4733.tmp' on the Jenkins master failed.
Is the file 'org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50' a valid filename?
If you are building on a slave: please check if the file is accessible under '$JENKINS_HOME/[job-name]/org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50'
If you are building on the master: please check if the file is accessible under '$JENKINS_HOME/[job-name]/workspace/org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50'
java.io.IOException: Failed to copy org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50 to /var/lib/jenkins/jobs/kdevelop master kf5-qt5/configurations/axis-PLATFORM/Linux/axis-compiler/gcc/builds/349/workspace-files/5dfa4733.tmp
  at hudson.FilePath.copyTo(FilePath.java:1984)
  at hudson.plugins.analysis.util.Files.copyFilesWithAnnotationsToBuildFolder(Files.java:80)
  at hudson.plugins.analysis.core.HealthAwareRecorder.copyFilesWithAnnotationsToBuildFolder(HealthAwareRecorder.java:348)
  at hudson.plugins.analysis.core.HealthAwarePublisher.perform(HealthAwarePublisher.java:89)
  at hudson.plugins.analysis.core.HealthAwareRecorder.perform(HealthAwareRecorder.java:295)
  at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
  at org.jenkins_ci.plugins.flexible_publish.builder.FailFastBuilder.perform(FailFastBuilder.java:102)
  at org.jenkins_ci.plugins.run_condition.BuildStepRunner$2.run(BuildStepRunner.java:110)
  at org.jenkins_ci.plugins.run_condition.BuildStepRunner$Fail.conditionalRun(BuildStepRunner.java:154)
  at org.jenkins_ci.plugins.run_condition.BuildStepRunner.perform(BuildStepRunner.java:105)
  at org.jenkins_ci.plugins.flexible_publish.strategy.FailFastExecutionStrategy.perform(FailFastExecutionStrategy.java:63)
  at org.jenkins_ci.plugins.flexible_publish.ConditionalPublisher.perform(ConditionalPublisher.java:206)
  at org.jenkins_ci.plugins.flexible_publish.FlexiblePublisher.perform(FlexiblePublisher.java:124)
  at hudson.tasks.BuildStepMonitor$3.perform(BuildStepMonitor.java:45)
  at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
  at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:720)
  at hudson.model.Build$BuildExecution.post2(Build.java:186)
  at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:665)
  at hudson.model.Run.execute(Run.java:1753)
  at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
  at hudson.model.ResourceController.execute(ResourceController.java:98)
  at hudson.model.Executor.run(Executor.java:405)
Caused by: java.io.IOException: remote file operation failed: org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50 at hudson.remoting.Channel@2f3b5e55:Swarm-86b6c214daa5-10.150.82.1: java.io.FileNotFoundException: org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50 (No such file or directory)
  at hudson.FilePath.act(FilePath.java:992)
  at hudson.FilePath.act(FilePath.java:974)
  at hudson.FilePath.copyTo(FilePath.java:2005)
  at hudson.FilePath.copyTo(FilePath.java:1981)
  ... 21 more
Caused by: java.io.FileNotFoundException: org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50 (No such file or directory)
  at java.io.FileInputStream.open0(Native Method)
  at java.io.FileInputStream.open(FileInputStream.java:195)
  at java.io.FileInputStream.<init>(FileInputStream.java:138)
  at hudson.FilePath$41.invoke(FilePath.java:2010)
  at hudson.FilePath$41.invoke(FilePath.java:2005)
  at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2731)
  at hudson.remoting.UserRequest.perform(UserRequest.java:153)
  at hudson.remoting.UserRequest.perform(UserRequest.java:50)
  at hudson.remoting.Request$2.run(Request.java:336)
  at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
  at ......remote call to Swarm-86b6c214daa5-10.150.82.1(Native Method)
  at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1545)
  at hudson.remoting.UserResponse.retrieve(UserRequest.java:253)
  at hudson.remoting.Channel.call(Channel.java:830)
  at hudson.FilePath.act(FilePath.java:985)
  ... 24 more

Normal run:

Restricted Application added a subscriber: sysadmin. Mar 23 2017, 12:25 AM

/var/lib/jenkins/jobs/kdevelop master kf5-qt5/configurations/axis-PLATFORM/Linux/axis-compiler/gcc/builds/349/workspace- this path does not exist on our CI system... I have no idea where this is pulled from. Are we sure this is not specific to the CI Kevin Funk rolled out?

@scarlettclark The linked issues are from build.kde.org. Unless I missed something, Kevin's CI runs only on his private server, at kfunk.ddns.net on port 8080, no?

You're right, ignore me.
The file exists:
root@charlotte ~ # ls /var/lib/jenkins/jobs/kdevelop\ master\ kf5-qt5/configurations/axis-PLATFORM/Linux/axis-compiler/gcc/builds/349/workspace-files/5dfa4733.tmp

kfunk removed a subscriber: kfunk. Mar 23 2017, 3:16 PM

Could this potentially be an encoding issue (i.e. due to locales or something along those lines)?
How does what Jenkins generates compare to a local run?
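
As a rough illustration of the encoding hypothesis only: the quoted log line contains "Det ?r en funktionsrik", and a "?" like that is what appears when a non-ASCII character is forced into an ASCII-only encoding. The sketch below assumes the original text was the Swedish "Det är en funktionsrik"; that is a guess from the log, and nothing in this task confirms how the "?" was actually produced.

```python
# Assumed original text (a guess based on the "Det ?r" fragment in the log).
# "\u00e4" is the letter a-umlaut.
original = "Det \u00e4r en funktionsrik"

# Forcing the text into ASCII, as a tool running under a non-UTF-8 locale
# might do, replaces every character that cannot be represented with "?".
as_ascii = original.encode("ascii", errors="replace").decode("ascii")

print(as_ascii)
```

Comparing the output of a local run under a UTF-8 locale with the Jenkins output would show whether the same substitution happens there.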

kossebau added a subscriber: apol. Mar 28 2017, 5:29 PM

I've taken a closer look at this. The "Failed to copy" message is completely unrelated to the actual issue here.

The issues being raised by the Appstream test, however, are genuine as far as I can tell, so I strongly suspect encoding problems here. That said, the builds which are fine are spread across both build hosts, so it isn't a host-specific issue. It is possible that an image rebuild on the responsible host has fixed this.

bcooksley renamed this task from Appstreamcli Test fails sometimes with java.io.IOException (seemingly on copying the appdata file from the workspace to the build folder) to Appstreamcli test results flaky. Mar 29 2017, 6:28 AM
apol added a comment. Apr 7 2017, 11:02 PM

Note there are two things at play here:

  • AppStream job on jenkins
  • The appstreamtest that comes from ECM

The main differences between the two are:

  • the ECM one works on a per-appdata file basis (rather than a whole installed tree)
  • the ECM one will turn the repository yellow if it's not good
  • the jenkins job integrates the issue better
  • the developer can actually run the ECM one locally

Note there are a few cases where they could disagree.

The output we're seeing problems with here is the Jenkins one, AFAIU.

HTH

From what I can tell, the issues being flagged by Jenkins here are all legitimate, based on the log output at least?
The file copy errors are unrelated to the issue described and are due to the way the Appstream validator outputs error messages and Jenkins attempts to parse them.
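
To make that concrete, here is a minimal sketch of how such a misparse could happen. The pattern and the sample line are hypothetical: the exact validator output format and the parser Jenkins actually uses are not shown in this task. Only the token `org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50` and the message text come from the log above.

```python
import re

# Hypothetical "location: message" pattern. It treats everything before the
# first ": " as the file name, which is one way to end up with the bogus
# file name seen in the Jenkins error.
NAIVE_WARNING = re.compile(r"^(?P<file>.+?): (?P<message>.*)$")

# Assumed shape of one validator output line (illustrative only).
sample = ('org.kde.kdevelop.appdata.xml:org.kde.kdevelop.desktop:50: '
          'The component summary should not end with a "."')

match = NAIVE_WARNING.match(sample)
naive_file = match.group("file")
message = match.group("message")


def split_location(token):
    """Split an assumed 'file:component:line' token into its three parts."""
    filename, component, line = token.split(":")
    return filename, component, int(line)


print(naive_file)                  # the whole token, not a path on disk
print(split_location(naive_file))  # file name, component id, line number
```

With the naive pattern, the "file" handed to the copy step is the whole token rather than a real path in the workspace, which would be consistent with the FileNotFoundException in the trace.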

apol added a comment. Apr 8 2017, 12:10 AM

Yes, so IIRC, the problem was that the reported issues randomly disappeared from one run to the next and then popped back in.

I insist, IIRC.

bcooksley closed this task as Resolved. Apr 8 2017, 9:28 PM
bcooksley claimed this task.

I've done some investigation just now and have determined this was a race condition. The CI tooling wasn't waiting for the Appstream validation to complete, leading to the output getting prematurely cut off when the Python scripts reached the end of their execution and Jenkins concluded that the job was successful.

This should now be resolved with https://commits.kde.org/sysadmin/ci-master-config/35b33453ead8dbd2257456d7be99b6ee1d8c4c0c
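
For illustration, the general shape of such a race can be sketched as follows. This is not the code from the linked commit: the command below is a stand-in child process, and the function names are made up for this example.

```python
import subprocess
import sys

# Stand-in for the validator: a child process that takes a moment to finish.
# The real CI command is not shown in this task, so this is illustrative only.
SLOW_VALIDATOR = [
    sys.executable, "-c",
    "import time; time.sleep(0.5); print('validation finished')",
]


def start_without_waiting(cmd):
    """Racy pattern: returns right after launch, before the child is done."""
    return subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)


def run_and_wait(cmd):
    """Blocks until the child exits, so the full output and exit code exist."""
    done = subprocess.run(cmd, stdout=subprocess.PIPE, text=True)
    return done.returncode, done.stdout


# Racy variant: the caller carries on while the child is usually still running.
proc = start_without_waiting(SLOW_VALIDATOR)
finished_early = proc.poll() is not None
proc.communicate()  # clean up the child started above

# Waiting variant: the output is complete by the time the call returns.
returncode, output = run_and_wait(SLOW_VALIDATOR)
print(finished_early, returncode, output.strip())
```

If the calling script exits right after the non-waiting launch, whatever the child had written up to that point is all that gets collected, which would match the truncated results described above.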

Good job, thanks, that sounds/looks plausible.