Handle adjacent special characters correctly
ClosedPublic

Authored by bruns on Apr 2 2018, 5:14 PM.

Details

Summary

The code handled sequences like '((' incorrectly, i.e. this was parsed as
a single opening quote, and thus could get operator association wrong.
Although only '>=' and '<=' have a special meaning, also accept '==' and
':=' as '=' resp. ':'.

BUG: 392620

Test Plan
38: QDEBUG : AdvancedQueryParserTest::testNestedParentheses(a OR ((b AND c) AND d))   result term [ OR ( : a (QString)) [ AND ( : b (QString)) ( : c (QString)) ( : d (QString)) ] ]
38: QDEBUG : AdvancedQueryParserTest::testNestedParentheses(a OR ((b AND c) AND d)) expected term [ OR ( : a (QString)) [ AND ( : b (QString)) ( : c (QString)) ( : d (QString)) ] ]
38: PASS   : AdvancedQueryParserTest::testNestedParentheses(a OR ((b AND c) AND d))

Diff Detail

Repository
R293 Baloo
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
bruns created this revision.Apr 2 2018, 5:14 PM
Restricted Application added projects: Frameworks, Baloo. · View Herald TranscriptApr 2 2018, 5:14 PM
Restricted Application added a subscriber: Frameworks. · View Herald Transcript
bruns requested review of this revision.Apr 2 2018, 5:14 PM
michaelh requested changes to this revision.Apr 2 2018, 8:07 PM

It would be great if you could add tests for ==, := and ((...)...) to autotests/unit/lib/advancedqueryparsertest.cpp

src/lib/advancedqueryparser.cpp
35

Suggestion: Rename to tokenList. It's less likely to confuse it with token that way.

47–48

Please use !token.isEmpty()
(Don't exactly know why, but have myself been told so several times. Maybe it's faster. :-)

54–72

Please use QLatin1Char

55

// Parentheses delimit tokens, and are tokens by themselves

61

Please use QLatin1Char

63

Please use !token.isEmpty()

68

Please use QLatin1Char

69

':' means Term::Contains and '=' means Term::Equal so ':=' is a little ambigous.
Unless you have a particular reason to interpret ':=' as '=' we should take it as ':'.
It may be better to have and extra If for '=' and ':' and simply drop any second "=" so we come out with ':' or '=' as token.

The parse() function checks the second char only in case of '<' or '>'. So ':=' will become ':'. The
2-char tokens should to be added to the switch in parse() which then also could be simplified. The distinction between '>' and '>=' is the lexer's job, right?

This revision now requires changes to proceed.Apr 2 2018, 8:07 PM
bruns added inline comments.Apr 2 2018, 9:21 PM
src/lib/advancedqueryparser.cpp
55

Why choose a different wording here? Also, the important aspect here is the "end", which commits the current token to the list.

69

":=" and "==" are added as is, and the parser interprets both dependent on the first character only, i.e. ":" and "=".
The behaviour for these two combinations is unchanged with this patch.

The lexer should not handle ">" and ">=" differently, both are (valid) tokens, and should be returned as such.

Adding e.g. ">=" to the switch statement is not possible, as it works on QChar's.

michaelh added inline comments.Apr 3 2018, 6:16 PM
src/lib/advancedqueryparser.cpp
55

Because of opening and closing parentheses, but you're right 'end' is the important aspect here.

69
  1. I misunderstood your description as ':=' should become '='.

The lexer should not handle ">" and ">=" differently, both are (valid) tokens, and should be returned as such.

That is what I meant. I was confused by parse() lexing again when encountering a '>'.

Adding e.g. ">=" to the switch statement is not possible, as it works on QChar's.

I don't understand. token is QString. Why not instead of switch (token.at(0).toLatin1())

switch (token) {
case '>'`: comparator = Term::Greater; break;
case '>='`: comparator = Term::GreaterEqual; break;

What am I missing?

bruns added inline comments.Apr 4 2018, 4:28 AM
src/lib/advancedqueryparser.cpp
69

I don't understand. token is QString. Why not instead of switch (token.at(0).toLatin1())

switch (token) {
case '>'`: comparator = Term::Greater; break;
case '>='`: comparator = Term::GreaterEqual; break;

What am I missing?

You can only switch on integral statements (ints, chars, ...). With C++11, you can do some constexpr hacks, but your code won't work as is.

michaelh added inline comments.Apr 4 2018, 8:25 AM
src/lib/advancedqueryparser.cpp
69

You can only switch on integral statements (ints, chars, ...).

Heh? That's surprising![1] Anyway, thanks for the lesson a keep this code as it is now.

[1]
Coming from javascript switch definitely is a "false friend".
Can I become a beef steak?

bruns marked 8 inline comments as done.Apr 4 2018, 9:50 PM
michaelh added a comment.EditedApr 5 2018, 9:41 AM

Apart from QLatin1Char and token.isEmpty() we're done, I think.
I asked on IRC about the importance of QLatin1Char the answer was "it depends", QLatin1Char is conservative the code as-is more concise. So, change it or leave it, your decision.

bruns updated this revision to Diff 31550.Apr 7 2018, 3:06 AM

token.size() > 0 -> !token.isEmpty()
renamed tokens to tokenList

Use testcase

bruns marked 6 inline comments as done.Apr 7 2018, 3:07 AM
bruns edited the test plan for this revision. (Show Details)
bruns updated this revision to Diff 31636.Apr 8 2018, 1:01 AM
bruns added a task: Restricted Maniphest Task.

rebase

bruns removed a task: Restricted Maniphest Task.Apr 8 2018, 1:09 AM
michaelh accepted this revision as: michaelh.Apr 8 2018, 10:07 AM
michaelh accepted this revision.
This revision is now accepted and ready to land.Apr 8 2018, 10:07 AM
This revision was automatically updated to reflect the committed changes.