Bug 306492
| Summary: | WebExtension i18n named placeholders fail when adjacent to non-space characters | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Brian Birtles <brian> |
| Component: | WebKit Extensions | Assignee: | Nobody <webkit-unassigned> |
| Status: | RESOLVED FIXED | ||
| Severity: | Normal | CC: | ahmad.saleem792, brian, i.am.kanaru.sato, timothy, webkit-bug-importer |
| Priority: | P2 | Keywords: | InRadar |
| Version: | WebKit Nightly Build | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=291956 | ||
Brian Birtles
Named placeholder substitution in WebExtensions localization (`i18n.getMessage`) fails when the placeholder is immediately preceded by a non-space character.
Example:
```
"message": "Data v$version$"
```
does not substitute $version$, whereas:
```
"message": "Data v $version$"
```
does.
This appears to be due to [this regex](https://github.com/WebKit/WebKit/blob/3a0d87473c4e6ed8a91d6a8df015cb777743f74f/Source/WebKit/Shared/Extensions/WebExtensionLocalization.cpp#L296):
```
auto localizableStringRegularExpression = JSC::Yarr::RegularExpression("(?:[^$]|^)(\\$([A-Za-z0-9_@]+)\\$)"_s, { });
```
The way we use it is to look for the `index` and then, it the first character is a literal space, drop it:
```
if (originalKey.startsWith(' '))
originalKey = localizedString.substring(index + 1, matchLength - 1);
```
Previously (probably before converting to C++ in https://github.com/WebKit/WebKit/commit/944567bb72da80b8a98ac4aac76f70b700373989) it was possible to workaround this by using a U+200B character before the initial $, e.g. "Data v\u200b$version$" but that workaround no longer works.
This might seem like an edge case but for string in languages that don't use spaces to separate works (e.g. CJK languages) this comes up _everywhere_.
Chromium browsers and Gecko don't have this bug.
| Attachments | ||
|---|---|---|
| Add attachment proposed patch, testcase, etc. |
Radar WebKit Bug Importer
<rdar://problem/169146196>
Brian Birtles
Even worse, it has trouble with placeholders that are separated by only a single character and a space, e.g. "$index$( $volume$巻 $page$頁 )"
Due to [this line](https://github.com/WebKit/WebKit/blob/3a0d87473c4e6ed8a91d6a8df015cb777743f74f/Source/WebKit/Shared/Extensions/WebExtensionLocalization.cpp#L350):
```
index += replacement.length() + 2;
```
It will start scanning _two_ characters after the replacement even though at this point we're dealing with positional arguments which only have _one_ dollar sign.
As a result, in the string "$index$( $volume$巻 $page$頁 )", $index$ and $page$ will be replaced but $volume$ won't.
Brian Birtles
> It will start scanning _two_ characters after the replacement even though at this point we're dealing with positional arguments which only have _one_ dollar sign.
On further thought, I have no idea why it adds 2. I think 1 was to cover the first group in the regex, "(?:[^$]|^)" despite the fact that when matching at the start of the string that group will have length 0, but I've no idea why it adds 2.
I'd be glad to fix this up if it's possible to work on this without having access to a Mac and if someone can point me to where tests for this live.
Kanaru Sato
I'd like to work on it :) Will submit a patch soon.
Kanaru Sato
submitted: https://github.com/WebKit/WebKit/pull/62313
Timothy Hatcher
(In reply to Brian Birtles from comment #3)
> > It will start scanning _two_ characters after the replacement even though at this point we're dealing with positional arguments which only have _one_ dollar sign.
>
> On further thought, I have no idea why it adds 2. I think 1 was to cover the
> first group in the regex, "(?:[^$]|^)" despite the fact that when matching
> at the start of the string that group will have length 0, but I've no idea
> why it adds 2.
>
> I'd be glad to fix this up if it's possible to work on this without having
> access to a Mac and if someone can point me to where tests for this live.
The +2 was likely trying to account for the two dollar signs in `$name$` style matches, but positional uses `$N` (one dollar sign, no trailing). This caused the scanner to jump past the next positional placeholder when replacements were short — exactly the bug in comment #2.
EWS
Committed 311685@main (bf4fcdebe481): <https://commits.webkit.org/311685@main>
Reviewed commits have been landed. Closing PR #62313 and removing active labels.