RESOLVED FIXED Bug 170099
Web Inspector: Add regular expression support to XHR breakpoints
https://bugs.webkit.org/show_bug.cgi?id=170099
Summary Web Inspector: Add regular expression support to XHR breakpoints
Matt Baker
Reported 2017-03-25 09:36:32 PDT
Summary: Add regular expression support to XHR breakpoints. Tree element titles for XHR breakpoints should be as follows: URL – "foo" URL – /foo\d/ The popover should present this option as a <select> picker preceding the input field, with options "Contains" and "Matches".
Attachments
[Image] Popover with new regex option (225.62 KB, image/png)
2017-04-17 22:45 PDT, Matt Baker
no flags
[Image] Tree element w/ regex subtitle (67.45 KB, image/png)
2017-04-17 22:46 PDT, Matt Baker
no flags
[Image] CodeMirror mode for MIME type "text/x-regex" (26.15 KB, image/png)
2017-04-17 22:48 PDT, Matt Baker
no flags
Patch (42.03 KB, patch)
2017-04-17 23:13 PDT, Matt Baker
no flags
[Image] improved character set tokenizer (25.69 KB, image/png)
2017-04-19 12:29 PDT, Matt Baker
no flags
[Image] improved subexpression nesting (24.46 KB, image/png)
2017-04-19 12:36 PDT, Matt Baker
no flags
Patch (44.64 KB, patch)
2017-04-19 13:17 PDT, Matt Baker
no flags
Patch (45.29 KB, patch)
2017-04-20 02:05 PDT, Matt Baker
no flags
Patch (45.01 KB, patch)
2017-04-20 12:54 PDT, Matt Baker
no flags
[Image] improved character set tokenizer 2 (27.32 KB, image/png)
2017-04-20 13:03 PDT, Matt Baker
no flags
Radar WebKit Bug Importer
Comment 1 2017-04-11 09:39:57 PDT
Matt Baker
Comment 2 2017-04-17 22:45:00 PDT
Created attachment 307346 [details] [Image] Popover with new regex option
Matt Baker
Comment 3 2017-04-17 22:46:19 PDT
Created attachment 307347 [details] [Image] Tree element w/ regex subtitle
Matt Baker
Comment 4 2017-04-17 22:48:41 PDT
Created attachment 307348 [details] [Image] CodeMirror mode for MIME type "text/x-regex" The colors are (loosely) based off TextMate's regex find-text field.
Matt Baker
Comment 5 2017-04-17 23:13:48 PDT
Joseph Pecoraro
Comment 6 2017-04-18 01:14:53 PDT
Comment on attachment 307349 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=307349&action=review Cool! Some quick comments tonight on the regex syntax highlighting. > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:32 > + if (stream.match(/[dDwWsStrnvf0]/)) > + return "regex-special"; You'll need to add [bB] to your list for word boundaries: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#boundaries I notice this handles \0 (NUL) but you don't have anything special for the \1-\9 backreferences. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#grouping-back-references I think you probably want a special category, or just regex-special. if (stream.match(/[1-9]/)) return "regex-backreference"; > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:61 > + if (ch === "(") { > + stream.match(/\?:/); > + return "regex-group"; I see this groups /(abc)/ and non-captured groups /(?:abc/)/. You may want special productions for positive lookahead /(?=abc)/ and negative lookahead /(?!=abc)/: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#assertions > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:68 > + if (/[\*\+\?]/.test(ch)) The escapes are not necessary inside character sets. I'll leave the style up to you. > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:89 > + if (ch === "[") { > + stream.match(/.*]/); This doesn't properly handle a character set with a ']' character inside it: /before[abc\]def]after/ Probably not very common. Also, I see a lot of editors specially handle the negation case [^abc] by highlighting the "^" specially. Maybe we could do something there. > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:94 > + if (/[\.\^\$\|]/.test(ch)) Same thing here. Don't need to escape in the character set (as long as the ^ is not the first!). But I'll leave that style up to you.
Joseph Pecoraro
Comment 7 2017-04-18 01:28:41 PDT
Comment on attachment 307349 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=307349&action=review I haven't completed review but I got carried away looking at it more tonight. This looks great! > Source/JavaScriptCore/inspector/protocol/DOMDebugger.json:62 > + { "name": "isRegEx", "type": "boolean", "optional": true, "descriprion": "Whether the URL string is a regular expression." } We have existing spelling in the protocol of: "isRegex" for a boolean and "regexp" for RemoteObject subtype. Lets be consistent here with "isRegex" > Source/WebCore/inspector/InspectorDOMDebuggerAgent.h:103 > - HashSet<String> m_xhrBreakpoints; > + HashMap<String, bool> m_xhrBreakpoints; An enum would be much clearer than a boolean. > Source/WebInspectorUI/UserInterface/Models/XHRBreakpoint.js:81 > + Text: "xhr-breakpoint-type-text", > + RegularExpression: "xhr-breakpoint-type-regular-expression", Since we serialize these I wonder if shorter is better. Just "text" and "regex". > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.css:2 > + * Copyright (C) 2013 Apple Inc. All rights reserved. 2017. > Source/WebInspectorUI/UserInterface/Views/XHRBreakpointPopover.js:42 > + Style: // Public > Source/WebInspectorUI/UserInterface/Views/XHRBreakpointPopover.js:73 > + addOption(WebInspector.UIString("Containing"), WebInspector.XHRBreakpoint.Type.Text); > + addOption(WebInspector.UIString("Matching"), WebInspector.XHRBreakpoint.Type.RegularExpression); I think something in the UI must make it clear that in the matching case the user is inputting a regular expression. > Source/WebInspectorUI/UserInterface/Views/XHRBreakpointPopover.js:89 > + Style: // Private > Source/WebInspectorUI/UserInterface/Views/XHRBreakpointTreeElement.js:41 > + if (breakpoint.type === WebInspector.XHRBreakpoint.Type.Text) > + subtitle = doubleQuotedString(breakpoint.url); > + else > + subtitle = "/" + breakpoint.url + "/"; I wonder if we should color regex the usual /orange/ and strings the usual "red" like they appear in the console (using the formatted value colors). Might be too distracting?
Matt Baker
Comment 8 2017-04-18 14:53:50 PDT
Comment on attachment 307349 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=307349&action=review >> Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:32 >> + return "regex-special"; > > You'll need to add [bB] to your list for word boundaries: > https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#boundaries > > I notice this handles \0 (NUL) but you don't have anything special for the \1-\9 backreferences. > https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#grouping-back-references > > I think you probably want a special category, or just regex-special. > > if (stream.match(/[1-9]/)) > return "regex-backreference"; Will add "regex-backreference", and style them identically to regex-special (for now). >> Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:61 >> + return "regex-group"; > > I see this groups /(abc)/ and non-captured groups /(?:abc/)/. > > You may want special productions for positive lookahead /(?=abc)/ and negative lookahead /(?!=abc)/: > https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#assertions Will add "regex-lookaround" for these two cases, and color them identically to regex-group (for now). >> Source/WebInspectorUI/UserInterface/Views/XHRBreakpointPopover.js:73 >> + addOption(WebInspector.UIString("Matching"), WebInspector.XHRBreakpoint.Type.RegularExpression); > > I think something in the UI must make it clear that in the matching case the user is inputting a regular expression. The input field placeholder text makes this clear when the dialog appears: Break on request with URL: [Containing] Text__________________ [Matching ] Regular Expression__ >> Source/WebInspectorUI/UserInterface/Views/XHRBreakpointTreeElement.js:41 >> + subtitle = "/" + breakpoint.url + "/"; > > I wonder if we should color regex the usual /orange/ and strings the usual "red" like they appear in the console (using the formatted value colors). Might be too distracting? I tried this early on and found it pretty distracting.
Matt Baker
Comment 9 2017-04-19 12:29:48 PDT
Created attachment 307497 [details] [Image] improved character set tokenizer The character set tokenizer has been improved, and now supports the following close bracket scenarios: []] [^]] [abc\]def]
Matt Baker
Comment 10 2017-04-19 12:33:38 PDT
And it produces a new "regex-character-set-negate" style token for a leading "^". So the following regular expression: [^abc] will produce the token stream: [regex-character-set, regex-character-set-negate, regex-character-set]
Matt Baker
Comment 11 2017-04-19 12:36:57 PDT
Created attachment 307498 [details] [Image] improved subexpression nesting Nested capturing/non-capturing groups and lookheads are handled by a context stack, so they get styled correctly too.
Matt Baker
Comment 12 2017-04-19 13:17:36 PDT
Joseph Pecoraro
Comment 13 2017-04-19 15:18:50 PDT
Comment on attachment 307502 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=307502&action=review r=me > Source/WebCore/inspector/InspectorDOMDebuggerAgent.cpp:412 > + int matchesCount = ContentSearchUtilities::countRegularExpressionMatches(regex, url); > + if (matchesCount) { We can just check for a single match and not run the regex multiple times: bool hasMatch = regex.match(url) != -1; if (hasMatch) { > Source/WebCore/inspector/InspectorDOMDebuggerAgent.h:109 > + enum XHRBreakpointType { > + Text = 0, > + RegularExpression, > + }; > + > + HashMap<String, XHRBreakpointType> m_xhrBreakpoints; Nit: enum class (I'm pretty sure that works here, you might need to give it a type tho) and you can just make it one line. > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.css:39 > + color: var(--text-color-gray-medium); Maybe create a new syntax-highlight-regex-group-color? > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:36 > + // An unescaped closing bracket is allowed if it immediately follows the > + // opening bracket, or the negating caret. I mis-interpreted "an unescaped closing bracket is allowed" to mean this would be the equivalent of a closing bracket. I'd suggest rewriting this to something like: // Check for early closing bracket to close the character set. // We are guaranteed not to have an escape character this early. > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:83 > + if (stream.next()) > + return "regex-literal"; We should return an error if there is no character left. So after this I'd expect a return "error". js> new RegExp("\\") Exception: SyntaxError: Invalid regular expression: \ at end of pattern > Source/WebInspectorUI/UserInterface/Views/XHRBreakpointTreeElement.js:41 > + subtitle = "/" + breakpoint.url + "/"; Hmm, if we show it with /s we might want to consider escaping /s in the regex, which are likely to be common in regexs. But I suspect this won't be a problem so I think keeping it as is is fine for now.
Joseph Pecoraro
Comment 14 2017-04-19 15:22:58 PDT
(In reply to Matt Baker from comment #9) > Created attachment 307497 [details] > [Image] improved character set tokenizer > > The character set tokenizer has been improved, and now supports the > following close bracket scenarios: > > []] I can't find anything it matches with. In fact: /[]a/.test("a") // false /[]a/.test("[]a") // false I have no idea what happens when there is a "[]" in a regex.
Joseph Pecoraro
Comment 15 2017-04-19 15:51:22 PDT
Comment on attachment 307502 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=307502&action=review > Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:44 > + if (stream.eat("\\")) { > + stream.next(); > + return "regex-character-set"; > + } Also, apparently \w \d and friends are still special in character sets: [\d] === [0-9] So you could give those a regex-special appearance.
Matt Baker
Comment 16 2017-04-19 20:36:25 PDT
(In reply to Joseph Pecoraro from comment #14) > (In reply to Matt Baker from comment #9) > > Created attachment 307497 [details] > > [Image] improved character set tokenizer > > > > The character set tokenizer has been improved, and now supports the > > following close bracket scenarios: > > > > []] > > I can't find anything it matches with. In fact: > > /[]a/.test("a") // false > /[]a/.test("[]a") // false > > I have no idea what happens when there is a "[]" in a regex. []] is equivalent to [\]].
Matt Baker
Comment 17 2017-04-19 20:44:16 PDT
Comment on attachment 307502 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=307502&action=review >> Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.css:39 >> + color: var(--text-color-gray-medium); > > Maybe create a new syntax-highlight-regex-group-color? Sounds good. Initially I considered doing a few --syntax-highlight-regex-*-color vars, but didn't want to get too carried away. This one makes sense to me, and we could add more later if needed. >> Source/WebInspectorUI/UserInterface/Views/CodeMirrorRegexMode.js:44 >> + } > > Also, apparently \w \d and friends are still special in character sets: [\d] === [0-9] > > So you could give those a regex-special appearance. Currently regex-special is purple (--syntax-highlight-color-boolean), which is visually pretty close to red (regex-character-set) when the characters are adjacent. I think it's still worthwhile producing more precise tokens, to make future customization easier.
Joseph Pecoraro
Comment 18 2017-04-19 22:21:02 PDT
> []] is equivalent to [\]]. That does not appear to be the case: /[]]/.test("]]]]]") // => false In any case: [] [^] Are both valid, complete, character sets: https://tc39.github.io/ecma262/#prod-CharacterClass > CharacterClass:: > '[' [lookahead ∉ { '^' }] ClassRanges ']' > '[' '^' ClassRanges ']' > > ClassRanges[U]:: > [empty] > NonemptyClassRanges I just think the comment about this section was super weird.
Matt Baker
Comment 19 2017-04-20 02:05:04 PDT
Matt Baker
Comment 20 2017-04-20 12:31:50 PDT
(In reply to Joseph Pecoraro from comment #18) > > []] is equivalent to [\]]. > > That does not appear to be the case: > > /[]]/.test("]]]]]") // => false > > In any case: > > [] > [^] > > Are both valid, complete, character sets: > https://tc39.github.io/ecma262/#prod-CharacterClass My mistake, I was looking at a PCRE reference. Will update character set tokenizer to match the spec.
Matt Baker
Comment 21 2017-04-20 12:54:33 PDT
Matt Baker
Comment 22 2017-04-20 13:03:03 PDT
Created attachment 307621 [details] [Image] improved character set tokenizer 2 Escape sequences inside character sets now have their own styling.
WebKit Commit Bot
Comment 23 2017-04-20 16:09:57 PDT
Comment on attachment 307620 [details] Patch Clearing flags on attachment: 307620 Committed r215584: <http://trac.webkit.org/changeset/215584>
WebKit Commit Bot
Comment 24 2017-04-20 16:09:59 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.