KSysGuard Network Plugin: Don't use std::regex to parse the network files.
Needs RevisionPublic
Actions

Authored by dkorth on May 17 2020, 5:15 PM.

Details

Reviewers

Summary

std::regex is slow, so use fixed-field parsing instead.

This is an initial version that I've been using for a while with no major issues, though it might still need some improvements. CPU usage is significantly decreased compared to before when there is a lot of network activity.

Test Plan

Run programs that generate a lot of network activity.
Verify the network activity shows up correctly with this patch.

Diff Detail

Repository

R106 KSysguard

Lint

Lint Skipped

Unit

Unit Tests Skipped

dkorth created this revision.May 17 2020, 5:15 PM

Restricted Application added a project: Plasma. · View Herald TranscriptMay 17 2020, 5:15 PM

Restricted Application added a subscriber: plasma-devel. · View Herald Transcript

dkorth requested review of this revision.May 17 2020, 5:15 PM

jpalecek added a subscriber: jpalecek.May 17 2020, 5:45 PM

Hmm, this is good idea, I'm not really happy about the implementation though. It's all C code whereas the rest of the helper is C++. It also relies very heavily on magic numbers now.

I think a much simpler implementation would be to split each line on " ", select the fields we want and clean them up.

This revision now requires changes to proceed.May 18 2020, 11:58 AM

It's all C code whereas the rest of the helper is C++. It also relies very heavily on magic numbers now.

I think a much simpler implementation would be to split each line on " ", select the fields we want and clean them up.

I assume this is for performance reasons, but a tiny microbenchmark showing that it is actually faster would be nice.

sandsmark added inline comments.May 20 2020, 8:18 AM

plugins/process/network/helper/ConnectionMapping.cpp
157	if we're going for ultra optimized, const everything I guess.
167	instead of 87 + 1, maybe have a `constexpr int lineLength = strlen("0: 00000000000000000000000001000000:0277 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 16741");` above with the comment? or: constexpr const char *exampleLine = "0: 00000000000000000000000001000000:0277 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 16741"; constexpr int lineLength = strlen(exampleLine); both more understandable and less prone to accidentally putting in the wrong number. same with all other magic string offset numbers here.
176	strtoul stops at the first non-digit, so is this really necessary?
203	I prefer the more c++-ish cast of `uint32_t(foo);` (I think I got that from the Qt code style guidelines).

sandsmark added inline comments.May 20 2020, 8:22 AM

plugins/process/network/helper/ConnectionMapping.cpp
167	To see that it is actually optimized properly: https://godbolt.org/z/sL2_pK

In D29808#673157, @sandsmark wrote:

It's all C code whereas the rest of the helper is C++. It also relies very heavily on magic numbers now.

I think a much simpler implementation would be to split each line on " ", select the fields we want and clean them up.

I assume this is for performance reasons, but a tiny microbenchmark showing that it is actually faster would be nice.

Hi! Just out of curiosity (I'm not the OP), I created a microbenchmark here: https://github.com/jpalecek/ksysguard-network-microbenchmark. As a fun project, I added a Boost.Sphinx implementation, which was remarkably easy. Not the most readable code ever, beware. Some bugs were found while doing this (in both old and new implementation), see the code.

plugins/process/network/helper/ConnectionMapping.cpp
167	Actually I don't agree with that view. I don't think `lineLength = strlen("0: 00000000000000000000000...`) `is readable and less error prone than` 87`, only longer by ~100 characters and a nightmare to check. With numbers, I can check with an editor that shows column numbers and simple arithmetic, but comparing such long strings made of spaces is hard. If you want more robustness, I would suggest using the fact that the number columns are aligned with the header and using something like `inode_col = header.find("inode")`. Or counting the fields. That would also take care of the possibility that the columns may not be same width across architectures (?)

The original version I wrote a few months back as a quick fix to reduce CPU usage. I do agree that the magic numbers shouldn't be hard-coded, since that's not easily maintained, though I'm not sure of the best approach. constexpr strlen() seems like it'd work, or alternatively, splitting the line on spaces.

I'll look into rewriting it to be more robust and submit an update.

Revision Contents
Changeset List

			Path	Packages
M			plugins/process/network/helper/Accumulator.cpp (3 lines)
M			plugins/process/network/helper/ConnectionMapping.h (6 lines)
M			plugins/process/network/helper/ConnectionMapping.cpp (110 lines)

Diff 83020

View Options

plugins/process/network/helper/Accumulator.cpp

View Options

plugins/process/network/helper/ConnectionMapping.h

View Options

plugins/process/network/helper/ConnectionMapping.cpp

Diff	ID	Base	Description	Created	Lint	Unit
Base			Base
Diff 1	83020			May 17 2020, 5:12 PM	★	★

KSysGuard Network Plugin: Don't use std::regex to parse the network files.Needs RevisionPublicActions

Details

Diff Detail

Revision ContentsChangeset List

Diff 83020

plugins/process/network/helper/Accumulator.cpp

plugins/process/network/helper/ConnectionMapping.h

plugins/process/network/helper/ConnectionMapping.cpp

KSysGuard Network Plugin: Don't use std::regex to parse the network files.
Needs RevisionPublic
Actions

Revision Contents
Changeset List