WordDetect rule: detect delimiters at the inner edge of the string
Closed, Public

Authored by nibags on Oct 3 2019, 6:24 AM.

Details

Reviewers

dhaumann
cullmann
vkrause
jpoelen

Group Reviewers

Framework: Syntax Highlighting

Commits

R216:61988aa5a1dd: WordDetect rule: detect delimiters at the inner edge of the string

Summary

In WordDetect rules, also check for delimiter characters at the left and right inner edges of the string.

For example:

<WordDetect attribute="Keyword" String="<hello"/>

Previously, this rule was equivalent to the regular expression \b<hello\b. Now it is equivalent to <hello\b: since < is a delimiter character, no further boundary check is needed on the left edge.
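The boundary check described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code from src/lib/rule.cpp: the names isWordDelimiter and matchWord, the delimiter set, and the case-sensitive comparison are all assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Assumed delimiter set, for illustration only; a real definition may use a different one.
static bool isWordDelimiter(char c)
{
    static const std::string delimiters = " \t.():!+,-<=>%&*/;?[]^{|}~\\";
    return delimiters.find(c) != std::string::npos;
}

// Hypothetical helper: returns the offset just past the match,
// or `offset` unchanged when the word does not match at that position.
static std::size_t matchWord(const std::string &text, std::size_t offset, const std::string &word)
{
    if (word.empty() || offset > text.size() || text.size() - offset < word.size())
        return offset;

    // Left edge: accept at the start of the text, after a delimiter (outer edge),
    // or when the candidate itself starts with a delimiter (inner edge).
    if (offset > 0 && !isWordDelimiter(text[offset - 1]) && !isWordDelimiter(text[offset]))
        return offset;

    // Exact, case-sensitive comparison of the word itself.
    if (text.compare(offset, word.size(), word) != 0)
        return offset;

    // Right edge: accept at the end of the text, before a delimiter (outer edge),
    // or when the word itself ends with a delimiter (inner edge).
    const std::size_t end = offset + word.size();
    if (end == text.size() || isWordDelimiter(text[end]) || isWordDelimiter(text[end - 1]))
        return end;

    return offset;
}
```

With this sketch, "<hello" matches inside "a<hello b" even though a letter precedes the <, while the right edge still requires a delimiter or the end of the text.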

I have checked the WordDetect rules in all definitions and haven't seen any regressions. In the definitions elm.xml, selinux-cil.xml and selinux-fc.xml, I replaced some WordDetect rules with StringDetect, since with this change they are equivalent.
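One way to read the equivalence: when a rule's string both starts and ends with a delimiter character, neither edge needs an outer boundary check any more, so the rule matches exactly where a plain string match would. A small sketch of that condition, assuming the same illustrative delimiter set as a default; the helper names isDelimiter and behavesLikeStringDetect are hypothetical and do not exist in the framework.

```cpp
#include <string>

// Assumed delimiter set, for illustration only.
static bool isDelimiter(char c)
{
    static const std::string delimiters = " \t.():!+,-<=>%&*/;?[]^{|}~\\";
    return delimiters.find(c) != std::string::npos;
}

// Hypothetical check: true when both inner edges of the word are delimiters,
// i.e. when no outer boundary check remains on either side.
static bool behavesLikeStringDetect(const std::string &word)
{
    return !word.empty() && isDelimiter(word.front()) && isDelimiter(word.back());
}
```

For instance, a string such as "<hello>" satisfies the condition, whereas "<hello" does not, because its right edge still depends on the character that follows.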

Test Plan

make test

Diff Detail

Repository

R216 Syntax Highlighting

Branch

fix-worddetect

Lint

No Linters Available

Unit

No Unit Test Coverage

Build Status

Buildable 17275
Build 17293: arc lint + arc unit

nibags created this revision. Oct 3 2019, 6:24 AM

Restricted Application added projects: Kate, Frameworks. Oct 3 2019, 6:24 AM

Restricted Application added subscribers: kde-frameworks-devel, kwrite-devel.

nibags requested review of this revision. Oct 3 2019, 6:24 AM

Harbormaster completed remote builds in B17274: Diff 67237. Oct 3 2019, 6:24 AM

nibags mentioned this in D24354: Mustache/Handlebars: minor fixes. Oct 3 2019, 6:27 AM

Add comment

Harbormaster completed remote builds in B17275: Diff 67238. Oct 3 2019, 6:47 AM

This looks good to me, and as mentioned in D24354, WordDetect is better than RegExpr.

+1, but I'd like another review by @cullmann, @jpoelen or @vkrause.

Seems reasonable, do we need some doc updates? Or some more verbose description in the XSD?

cullmann accepted this revision. Oct 3 2019, 1:30 PM

This revision is now accepted and ready to land. Oct 3 2019, 1:30 PM

I think it's fine as is. The docbook says:

Detect an exact string but additionally require word boundaries
such as a dot <userinput>'.'</userinput> or a whitespace on the beginning
and the end of the word. Think of <userinput>\b&lt;string&gt;\b</userinput>
in terms of a regular expression, but it is faster than the rule <userinput>RegExpr</userinput>.

IMO, <userinput>\b<string>\b</userinput> implies that if the string itself starts or ends with a delimiter (\b) character, it should match as well. And given that our unit tests show no changes, I think we are good to go.

Please commit.

nibags closed this revision. Oct 4 2019, 3:13 AM


This revision modifies 8 more files that are hidden because they were not modified between selected diffs and they have no inline comments.

Revision Contents
Changeset List

			Path	Packages
M			src/lib/rule.cpp (8 lines)

Diff	ID	Base	Description	Created	Lint	Unit
Base			Base
Diff 1	67237	1873102		Oct 3 2019, 6:24 AM	★	★
Diff 2	67238	1873102	- Add comment	Oct 3 2019, 6:47 AM	★	★

Commit	Tree	Parents	Author	Summary	Date
3ce5b91b6aa8	c46af8b68a07	45ae0bc6c41a	Nibaldo González	Add comment	Oct 3 2019, 6:47 AM
45ae0bc6c41a	b4d7019f3610	1873102570c5	Nibaldo González	WordDetect rule: detect delimiters at the inner edge of the string	Oct 3 2019, 6:21 AM

Diff 67238
