Changeset View
Standalone View
src/parser/KDbSqlScanner.l
Show All 29 Lines | |||||
30 | extern QByteArray globalToken; | 30 | extern QByteArray globalToken; | ||
31 | 31 | | |||
32 | #define YY_NO_UNPUT | 32 | #define YY_NO_UNPUT | ||
33 | #define ECOUNT globalCurrentPos += yyleng; globalToken = yytext | 33 | #define ECOUNT globalCurrentPos += yyleng; globalToken = yytext | ||
34 | 34 | | |||
35 | extern void setError(const QString& errDesc); | 35 | extern void setError(const QString& errDesc); | ||
36 | extern void setError(const QString& errName, const QString& errDesc); | 36 | extern void setError(const QString& errName, const QString& errDesc); | ||
37 | 37 | | |||
38 | /* Only quotes the input if it does not start with a quote character, otherwise | ||||
39 | it would be too hard to read with some fonts. */ | ||||
40 | static QString maybeQuote(const QString& string) | ||||
staniek: { in new line
Use clang-format if unsure:
{F8305516} | |||||
Sorry, did not know i could run clang-format on flex sources. Will use it next time. Do you have the correct .clang-format file, though? I tried clang-format with WebKit-based style but it resulted in massive changes to this file and other C++ sources i tried. Thank you. jfita: Sorry, did not know i could run clang-format on flex sources. Will use it next time.
Do you… | |||||
Sorry, we should have it in the repo, I thing we never added it after splitting repositories. This is what I use at the moment: BasedOnStyle: WebKit Language: Cpp AccessModifierOffset: -4 ConstructorInitializerIndentWidth: 4 AlignEscapedNewlinesLeft: false AlignTrailingComments: false AlignAfterOpenBracket: Align AllowAllParametersOfDeclarationOnNextLine: true AllowShortIfStatementsOnASingleLine: false AllowShortLoopsOnASingleLine: false AllowShortFunctionsOnASingleLine: false AlwaysBreakTemplateDeclarations: false AlwaysBreakBeforeMultilineStrings: false BreakBeforeBraces: Linux BreakBeforeBinaryOperators: true BreakBeforeTernaryOperators: true BreakConstructorInitializersBeforeComma: true BinPackParameters: true ColumnLimit: 100 ConstructorInitializerAllOnOneLineOrOnePerLine: true #DerivePointerBinding: false #ExperimentalAutoDetectBinPacking: false IndentCaseLabels: false MaxEmptyLinesToKeep: 1 NamespaceIndentation: None ObjCSpaceBeforeProtocolList: true PenaltyBreakBeforeFirstCallParameter: 19 PenaltyBreakComment: 60 PenaltyBreakString: 1000 PenaltyBreakFirstLessLess: 120 PenaltyExcessCharacter: 1000000 PenaltyReturnTypeOnItsOwnLine: 60 PointerBindsToType: false SpacesBeforeTrailingComments: 1 Cpp11BracedListStyle: false Standard: Cpp11 IndentWidth: 4 TabWidth: 8 UseTab: Never BreakBeforeBraces: Linux IndentFunctionDeclarationAfterType: false SpacesInParentheses: false SpacesInAngles: false SpaceInEmptyParentheses: false SpacesInCStyleCastParentheses: false #SpaceAfterControlStatementKeyword: true SpaceBeforeAssignmentOperators: true ContinuationIndentWidth: 4 Configure it this way I realized quite recently there was added some automation: https://phabricator.kde.org/D24568 but I'd prefer not to alter so many lines just for formatting. If we look af KF5 repos I think only few publish official clang-format, e.g. kactivities.git/.clang-format. I think settings I pasted above reflect it as well as https://community.kde.org/Policies/Frameworks_Coding_Style not too loosely. Regarding clang-format for flex files, it's not 100% supported but the code we discuss sits between %{ and %} and that seems to get formatted properly using clang-format if you select correct block. staniek:
Sorry, we should have it in the repo, I thing we never added it after splitting repositories. | |||||
41 | { | ||||
42 | QString first(string.left(1)); | ||||
43 | if (first == QLatin1Char('\'') || first == QLatin1Char('"') || first == QLatin1Char('`')) { | ||||
44 | return string; | ||||
45 | } | ||||
46 | return QStringLiteral("\"%1\"").arg(string); | ||||
47 | } | ||||
38 | %} | 48 | %} | ||
39 | 49 | | |||
40 | /* *** Please reflect changes to this file in ../driver_p.cpp *** */ | 50 | /* *** Please reflect changes to this file in ../driver_p.cpp *** */ | ||
41 | 51 | | |||
42 | %option case-insensitive | 52 | %option case-insensitive | ||
43 | %option noyywrap | 53 | %option noyywrap | ||
44 | %option never-interactive | 54 | %option never-interactive | ||
45 | 55 | | |||
▲ Show 20 Lines • Show All 101 Lines • ▼ Show 20 Line(s) | |||||
147 | 157 | | |||
148 | {whitespace_1_line}*(?i:PM) { | 158 | {whitespace_1_line}*(?i:PM) { | ||
149 | ECOUNT; | 159 | ECOUNT; | ||
150 | return TIME_PM; | 160 | return TIME_PM; | ||
151 | } | 161 | } | ||
152 | 162 | | |||
153 | # { | 163 | # { | ||
154 | ECOUNT; | 164 | ECOUNT; | ||
155 | sqlParserDebug() << "### begin DATE_OR_TIME" << yytext << "(" << yyleng << ")"; | 165 | sqlParserDebug() << "### end DATE_OR_TIME" << yytext << "(" << yyleng << ")"; | ||
staniek: Why it is "end" now when we're calling BEGIN() here? | |||||
BEGIN is to switch to a new start condition: after this flex will only look for rules that are within the group DATE_OR_TIME_caller, in this case the INITIAL or empty start condition. That is, it will look all the rules that have *no* prefix, hence this is the end of the date/time “subparser”. In other words, this is the rule of the '#' character that ends the date/time literal. jfita: BEGIN is to switch to a new start condition: after this flex will only look for rules that are… | |||||
staniek: Thanks | |||||
156 | BEGIN(DATE_OR_TIME_caller); | 166 | BEGIN(DATE_OR_TIME_caller); | ||
157 | return '#'; | 167 | return '#'; | ||
158 | } | 168 | } | ||
159 | 169 | | |||
170 | | ||||
Very good, how about also adding a comment here to explain all readons behind this rule, as you nicely did in the description? staniek: Very good, how about also adding a comment here to explain all readons behind this rule, as you… | |||||
171 | . { // fallback rule to avoid flex's default action that prints the character to stdout | ||||
172 | // without notifying the scanner. | ||||
173 | ECOUNT; | ||||
174 | const QString string(QString::fromUtf8(yytext, yyleng)); | ||||
Something easier to translate and understand for end users would be better. E.g. "Unexpected character \"%1\" in date/time" staniek: Something easier to translate and understand for end users would be better. E.g.
"Unexpected… | |||||
The problem i had quoting the unexpected character is that, depending on the font settings, it can be very hard to spot what is the offending character when it is a symbol like ' or ". For example, in the following screenshot i have problems seeing the ' character in "'": Having the character alone at the end removes that kind of problems: Do you want me to keep the quotes around the character? jfita: The problem i had quoting the unexpected character is that, depending on the font settings, it… | |||||
That's a good note. How about doing the same as a compiler shows: when there's a quote it shows just it, e.g:' error: missing terminating " character else, it shows like error: unknown type name 'a' So we could do: const QString string(QString::fromUtf8(yytext, yyleng)); QString char(string.left(1)); if (char == QLatin1Char('\'') || char == QLatin1Char('"') || char == QLatin1Char('`')) { } else { char = QString("\"%1\").arg(char); } setError(KDbParser::tr("Unexpected character %1 in date/time").arg(char)); staniek: That's a good note.
How about doing the same as a compiler shows:
when there's a quote it… | |||||
I’ve put that code into a function so that i could reuse in both error conditions and changed the message to what you suggested. jfita: >
> So we could do:
>
> ```
> const QString string(QString::fromUtf8(yytext, yyleng));
>… | |||||
175 | setError(KDbParser::tr("Unexpected character %1 in date/time").arg(maybeQuote(string))); | ||||
176 | return SCAN_ERROR; | ||||
177 | } | ||||
178 | | ||||
160 | } | 179 | } | ||
161 | /* -- end of DATE_OR_TIME --- */ | 180 | /* -- end of DATE_OR_TIME --- */ | ||
162 | 181 | | |||
163 | ("AND"|"&&") { | 182 | ("AND"|"&&") { | ||
164 | ECOUNT; | 183 | ECOUNT; | ||
165 | return AND; | 184 | return AND; | ||
166 | } | 185 | } | ||
167 | 186 | | |||
▲ Show 20 Lines • Show All 197 Lines • ▼ Show 20 Line(s) | 381 | {query_parameter} { | |||
365 | yylval.stringValue = new QString(QString::fromUtf8(yytext+1, yyleng-2)); | 384 | yylval.stringValue = new QString(QString::fromUtf8(yytext+1, yyleng-2)); | ||
366 | return QUERY_PARAMETER; | 385 | return QUERY_PARAMETER; | ||
367 | } | 386 | } | ||
368 | 387 | | |||
369 | {whitespace}+ { | 388 | {whitespace}+ { | ||
370 | ECOUNT; | 389 | ECOUNT; | ||
371 | } | 390 | } | ||
372 | 391 | | |||
373 | [\~\!\@\#\^\&\|\`\?,()\[\]\.;\:\+\-\*\/\%\^\<\>\=] { | 392 | [\~\!\@\#\^\&\|\`\?,()\[\]\.;\:\+\-\*\/\%\^\<\>\=] { | ||
This is nice but how about keeping the [\~\!\@\#\^\&\|\`\?,()\[\]\.;\:\+\-\*\/\%\^\<\>\=] { ... } which just defines all single-character tokens and adding the . { } afterwards? staniek: This is nice but how about keeping the
```
[\~\!\@\#\^\&\|\`\?,()\[\]\.;\:\+\-\*\/\%\^\<\>\=]… | |||||
Yes, it is indeed way better because now i can return an actual scanning error and the error message that the end user sees is “Unexpected character literal: x”, that is much nicer than the “Query error” i was getting before. Thank you. jfita: Yes, it is indeed way better because now i can return an actual scanning error and the error… | |||||
374 | sqlParserDebug() << "char: '" << yytext[0] << "'"; | 393 | sqlParserDebug() << "char: '" << yytext[0] << "'"; | ||
375 | ECOUNT; | 394 | ECOUNT; | ||
376 | return yytext[0]; | 395 | return yytext[0]; | ||
377 | } | 396 | } | ||
378 | 397 | | |||
398 | . { // fallback rule to avoid flex's default action that prints the character to stdout | ||||
399 | // without notifying the scanner. | ||||
400 | ECOUNT; | ||||
401 | const QString string(QString::fromUtf8(yytext, yyleng)); | ||||
402 | setError(KDbParser::tr("Unexpected character %1").arg(maybeQuote(string))); | ||||
403 | return SCAN_ERROR; | ||||
404 | } | ||||
405 | | ||||
379 | %% | 406 | %% | ||
380 | 407 | | |||
381 | void tokenize(const char *data) | 408 | void tokenize(const char *data) | ||
382 | { | 409 | { | ||
383 | yy_switch_to_buffer(yy_scan_string(data)); | 410 | yy_switch_to_buffer(yy_scan_string(data)); | ||
384 | globalToken.clear(); | 411 | globalToken.clear(); | ||
385 | globalCurrentPos = 0; | 412 | globalCurrentPos = 0; | ||
386 | } | 413 | } | ||
387 | 414 | |
{ in new line
Use clang-format if unsure: