Bug 177327 - Implement overflow-wrap:break-spaces value
Summary: Implement overflow-wrap:break-spaces value
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Javier Fernandez
URL:
Keywords:
Depends on: 183258
Blocks:
Reported: 2017-09-21 15:39 PDT by Javier Fernandez
Modified: 2018-05-16 09:06 PDT

Attachments
Patch (6.84 KB, patch)
2018-05-16 08:58 PDT, Javier Fernandez

Description Javier Fernandez 2017-09-21 15:39:20 PDT
The CSS Text 3 spec defines the 'overflow-wrap' property as follows:

https://drafts.csswg.org/css-text-3/#overflow-wrap-property

"Value: normal | break-word || break-spaces"
Comment 1 Javier Fernandez 2017-09-27 04:21:02 PDT
I'm asking Blink and Gecko engineers about the plans to implement this feature. It'd be great to have an idea about the WebKit position and its level of support to implement this.
Comment 2 Myles C. Maxfield 2017-10-03 12:44:38 PDT
(In reply to Javier Fernandez from comment #1)
> I'm asking Blink and Gecko engineers about the plans to implement this
> feature. It'd be great to have an idea about the WebKit position and its
> level of support to implement this.

It would be really great if we could rely on ICU more to handle more cases of line breaking. As I understand it, ICU aspires to handle all the line-breaking situations that are describable in CSS. Given this, I think we'd be happy to pass more information to ICU to describe the environment about how to do line breaking. If ICU doesn't currently support overflow-wrap:break-spaces behavior, we shouldn't implement it in WebKit; we should instead either wait for ICU or specifically ask ICU to support it. If ICU thinks that they will never support overflow-wrap:break-spaces behavior, we should push against it in the W3C.

Regardless of what ICU's stance or current support level is, we should start on the task of deleting more of WebKit's line breaking code in favor of using ICU. This involves a few different tasks:
1) Discovering which parts of CSS-Text ICU already supports
2) Measuring the performance of their line breaking algorithms and, if necessary, implementing some system (caching?) to speed them up.

This project is something I'd like to do in the coming years.
Comment 3 Myles C. Maxfield 2017-11-20 13:36:02 PST
(In reply to Myles C. Maxfield from comment #2)

I started looking at this more deeply, and I realized you are talking about a different property than the one I thought you were talking about. overflow-wrap doesn't create a new pattern of line-breaking opportunities; instead it says "if you didn't find any opportunities using one kind of iterator, use another kind of iterator." The novel piece here is the handoff from one iterator to another iterator; ICU already implements both kinds of iterators correctly. Therefore, we don't need to get involved with ICU for this feature. Sorry for the misdirection.
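
To illustrate the handoff described above, here is a minimal sketch; it is not WebKit or ICU code. It assumes a toy model in which every character has width 1 and a line holds maxChars characters. The names lastWordBreak, lastCharacterBreak, and nextLineEnd are hypothetical, and the character-level stand-in treats each byte as a unit, whereas a real implementation would step over grapheme clusters.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for a word-level iterator: returns the last break
// opportunity (the index just after a space) in (start, limit], if any.
std::optional<std::size_t> lastWordBreak(const std::string& text, std::size_t start, std::size_t limit)
{
    for (std::size_t i = std::min(limit, text.size()); i > start; --i) {
        if (text[i - 1] == ' ')
            return i;
    }
    return std::nullopt;
}

// Hypothetical stand-in for a character-level iterator: every position after
// start is an opportunity, so the last one that fits is the limit itself.
std::optional<std::size_t> lastCharacterBreak(const std::string& text, std::size_t start, std::size_t limit)
{
    std::size_t end = std::min(limit, text.size());
    if (end > start)
        return end;
    return std::nullopt;
}

// Returns the index at which the line starting at `start` ends.
std::size_t nextLineEnd(const std::string& text, std::size_t start, std::size_t maxChars, bool breakWord)
{
    std::size_t limit = start + maxChars;
    if (limit >= text.size())
        return text.size(); // The rest of the text fits on this line.

    // First iterator: ordinary word-level opportunities.
    if (auto position = lastWordBreak(text, start, limit))
        return *position;

    // Handoff: only when the first iterator found nothing that fits.
    if (breakWord) {
        if (auto position = lastCharacterBreak(text, start, limit))
            return *position;
    }

    // No usable opportunity: overflow up to the next word break, or the end.
    for (std::size_t i = limit; i < text.size(); ++i) {
        if (text[i] == ' ')
            return i + 1;
    }
    return text.size();
}
```

With a four-character line, nextLineEnd("abcdefghij", 0, 4, true) falls through to the character-level stand-in, while the same call with breakWord set to false lets the word overflow. The point of the sketch is only that the second iterator is consulted after the first one fails, not in parallel with it.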

The break-spaces value of overflow-wrap only interacts with whitespace, and because there are only a few whitespace characters, handling this logic directly inside WebKit seems fine. You could probably do this quickly by hacking on BreakingContext::handleText() (and either opting out of simple line layout or implementing it there too). However, this function is probably the least maintainable function relating to text in the entire codebase, so rather than implementing support directly there, a better approach would be to put the implementation behind the TextBreakIterator API, and to try to implement the kind of handoff structure that we could use for overflow-wrap: break-word. Eventually, overflow-wrap: break-word and word-break: break-all should use the same grapheme-cluster-based iterator (overflow-wrap: break-word would hand off between this and a regular iterator, whereas word-break: break-all wouldn't).

This would be a fantastic first step toward the long-term plan to make BreakingContext::handleText() more understandable and maintainable.

Does this sound okay? What do you think about it?
Comment 4 Javier Fernandez 2018-05-02 13:24:49 PDT
(In reply to Myles C. Maxfield from comment #3)

Sorry for the very late reply. I've been waiting for the spec to stabilize and it seems that after the F2F in Berlin it's in a good state now. 

I'll start working on this now, but I still have to learn about this new codebase and the spec, which I'm not familiar with.
Comment 5 Myles C. Maxfield 2018-05-02 21:08:57 PDT
(In reply to Javier Fernandez from comment #4)

Okay. Please let me know how I can help. I'm very interested in this!
Comment 6 Javier Fernandez 2018-05-16 08:58:42 PDT
Created attachment 340493 [details]
Patch
Comment 7 Javier Fernandez 2018-05-16 09:06:32 PDT
The patch in attachment #340493 [details] is a very preliminary approach to implementing the new 'break-spaces' value for the 'overflow-wrap' CSS property.

The parsing logic is still incomplete, since the new syntax allows a combination of the 'break-word' and 'break-spaces' values (in any order), but while we are still discussing the best approach to implement this, I'd rather keep it simple.
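
For reference, a minimal sketch of what accepting the full grammar "normal | break-word || break-spaces" could look like. This is not the code in the attached patch and does not use WebKit's CSS parser; OverflowWrapValue and parseOverflowWrap are hypothetical names, tokens are assumed to be whitespace-separated, and keyword matching is case-sensitive here for brevity even though CSS keywords are ASCII case-insensitive.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical parsed representation; "normal" leaves both flags false.
struct OverflowWrapValue {
    bool breakWord = false;
    bool breakSpaces = false;
};

// Accepts "normal" alone, or one or both of "break-word" and "break-spaces"
// in any order. Returns std::nullopt for anything else.
std::optional<OverflowWrapValue> parseOverflowWrap(const std::string& input)
{
    std::istringstream stream(input);
    std::string token;
    OverflowWrapValue result;
    bool sawNormal = false;
    std::size_t tokenCount = 0;

    while (stream >> token) {
        ++tokenCount;
        if (token == "normal")
            sawNormal = true;
        else if (token == "break-word" && !result.breakWord)
            result.breakWord = true;
        else if (token == "break-spaces" && !result.breakSpaces)
            result.breakSpaces = true;
        else
            return std::nullopt; // Unknown or repeated keyword.
    }

    if (!tokenCount)
        return std::nullopt; // Empty value.
    if (sawNormal && tokenCount != 1)
        return std::nullopt; // "normal" cannot be combined with anything.
    return result;
}
```

Under these assumptions, "break-spaces break-word" and "break-word break-spaces" both yield a value with both flags set, while "normal break-word" and "break-word break-word" are rejected.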

So, I've spent a few days trying to understand the codebase involved in the layout of inline-level boxes and the line-breaking logic. I think I've got an overall idea of the general design and the classes involved. However, I still have doubts about where to implement the new line-breaking features.

As far as I understand the spec [1] and the prose about the new value, the following points are the ones supporting my current (preliminary) approach.

The new 'break-spaces' value does not introduce new breaking
opportunities; it just forbids collapsing [2], so that the current
line-breaking logic handles those spaces as preserved white space:

"However, if overflow-wrap is set to break-spaces, collapsing their
advance width is not allowed, as this would prevent the preserved
spaces from wrapping. "

Additionally, it introduces some restrictions [3] on where this
preserved sequence of white spaces can be broken:

" ... after the last white space character that would fit the line, or
after the first white space in the sequence if none would fit, or
before the first space in the sequence if none would fit and both
break-word and break-spaces are specified."
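
My reading of the three cases in that quote can be sketched as a small function. This is only an illustration under my own assumptions, not code from the patch: breakOffsetInSpaceRun is a hypothetical name, 'break-spaces' is assumed to be in effect already, and the result is an offset into the white-space sequence (0 means before the first space, 1 means after the first space, and so on).

```cpp
#include <algorithm>
#include <cstddef>

// runLength:     number of preserved white-space characters in the sequence.
// spacesThatFit: how many of them fit in the remaining width of the line.
// breakWord:     whether 'break-word' is specified together with 'break-spaces'.
std::size_t breakOffsetInSpaceRun(std::size_t runLength, std::size_t spacesThatFit, bool breakWord)
{
    if (!runLength)
        return 0; // Nothing to break inside an empty sequence.

    // Case 1: break after the last white-space character that fits.
    std::size_t fit = std::min(spacesThatFit, runLength);
    if (fit > 0)
        return fit;

    // Case 3: none fit and 'break-word' is also set, so break before the first space.
    if (breakWord)
        return 0;

    // Case 2: none fit, so break after the first space (it overflows the line).
    return 1;
}
```

For a sequence of five spaces, three fitting gives offset 3; none fitting gives offset 1 with 'break-spaces' alone and offset 0 when 'break-word' is specified as well.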

Comments and feedback are really welcome.