When creating lists of known sizes for types whose instances are set up each
by initially creating a default contructed one, which then is further
modified before being appended by value to the list, it should be quicker to
instead fill the list in one go with all default constructed instances and
then walk over each for the further modification.
Details
- Reviewers
aaronpuchert brauch - Group Reviewers
KDevelop - Commits
- R32:4b1484607578: Fill lists of default-constructed types directly, not append any by value
Diff Detail
- Repository
- R32 KDevelop
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
Not exactly sure if readability here is not reduced, especially when it comes to list-filling-code patterns people might be used to.
So more a RFC of some minimal (unmeasured) performance improvement code I had found possible when looking at the QList->QVector stuff, asking whether you are fine with such code in KDevelop or prefer the current pattern for (which?) reason.
It saves a copy constructor and destructor call per list item.
Old code:
alloc heap memory for list items for each item default constructor item on stack setup item on stack copy constructor for item on heap from item on stack destructor item on stack
New code:
alloc heap memory for list items for each item default constructor item on heap for each item setup item on heap
It might not actually be faster though, because you go through the memory twice. If the vector doesn't fit into the cache, that could introduce quite some slowdown.
On the other hand, copying deep objects can obviously be costly. Some of the vector element types seem to carry around some baggage.
The C++11 way would be to move such objects, can we do that? It seems to be supported via QVector<T>::push_back(T&& t).
Edit: Tested it for plugins/quickopen/projectfilequickopen.cpp, and it compiles. It could be faster, because ProjectFiles contain two KDevelop::Paths, which is essentially a QVector<QString>, which has a move constructor. However, since the Paths have a user-declared copy constructor, the move constructor has to be explicitly activated using = default.
Edit: This should also do the job
--- a/kdevplatform/util/path.h +++ b/kdevplatform/util/path.h @@ -122,7 +122,7 @@ public: * * @sa addPath() */ - Path(const Path& base, const QString& subPath = QString()); + Path(const Path& base, const QString& subPath); /** * Equality comparison between @p other and this Path.
By the way, it's usually a good idea to have vector elements nothrow-move constructible. This can be checked with
static_assert(std::is_nothrow_move_constructible<Type>::value, "...");
for a QVector<T>. This doesn't always seem to be the case. Maybe I'll have a look what can be done there.
I see. (Thanks for reply, really interested in this but never looked at details in the last years, so happy about input and discussion. Please bear with me being naive and uninformed for now)
On the other hand, copying deep objects can obviously be costly. Some of the vector element types seem to carry around some baggage.
The C++11 way would be to move such objects, can we do that? It seems to be supported via QVector<T>::push_back(T&& t).
I still have to complete my C++11 classes sadly. In my naive mind I would have hoped something like this would be possible, in general:
alloc heap memory for list items for each item default constructor item on heap setup item on heap
I would not see any principal need to prepare the item on the stack. Is there? Why would C++11 still prefer a move operator, instead of going directly for the final memory destination?
With QVector it seems there is some practical need for a separate item instance only because the API does not have any hooks for some iterator-based filling, any item construction is hardcoded to the default constructor (Edit: besides the append() logic which does copy/move constructor, which though does not gain us something for avoiding the temp copy on the stack). One could imagine some fill(int count, functor constructor) or some T& appendNew(). But it isn't there after all the years, despite list filling from a loop surely being a common need. Would that be an anti-pattern for some reason?
No real clue yet about move operators, but how does it spare us here any constructor and deconstructor calls? And with the QType-sub-datastructures being implicitly-shared usually, does the move operator make a difference when it comes to unneeded temporary allocations?
You are completely right, I would just advise against going through the vector twice, as you'd do with resize(). So if you still use reserve(), then add single elements and initialize them that's fine.
With QVector it seems there is some practical need for a separate item instance only because the API does not have any hooks for some iterator-based filling, any item construction is hardcoded to the default constructor (Edit: besides the append() logic which does copy/move constructor, which though does not gain us something for avoiding the temp copy on the stack). One could imagine some fill(int count, functor constructor) or some T& appendNew(). But it isn't there after all the years, despite list filling from a loop surely being a common need. Would that be an anti-pattern for some reason?
STL's std::vector has emplace_back that constructs a new vector element in place. Sadly QVector doesn't have it yet, and I've seen Qt developers express their doubts it's going to happen soon.
What I was trying to say is that compilers have a lot of freedom with local variables - they might not ever be written to actual memory, sometimes not even into registers. The only requirement imposed is that the observed behavior is not allowed to change. This is why debugging optimized code can be so hard. Eliminating copies of local variables may not actually improve performance.
But even if the compiler is not able to optimize out copies, all that stuff is still happening in the L1 cache. Drastic performance degradation usually happens when there is too much data and stuff gets evicted to main memory. Hence it's preferable to do everything in one pass.
No real clue yet about move operators, but how does it spare us here any constructor and deconstructor calls? And with the QType-sub-datastructures being implicitly-shared usually, does the move operator make a difference when it comes to unneeded temporary allocations?
The benefit for implicitly-shared data types might be small, but I think not all types are. So maybe I should clarify that moves might not always be cheaper than copies (for simple data types they do the same), but sometimes there is a considerable benefit, usually when types own (meaning it's not shared) a chunk of data on the heap.
Well, "QVector still doesn't move elements on reallocations."
Thanks for detailed reply @aaronpuchert Does not conflict with what I heard elsewhere, so big picture got more clear to me.
So my take-away is: default to one-pass for best practice, unless actual measurements are done.
So will discard this patch or, well, reduce it to the changes which are still one-pass afterwards (the ones where no further modification besides default constructor is done).
reduce patch to those single-pass list filling where default constructed
value is added