Bug 7562 - Safari does not allow to access (copy) CSS generated content
Summary: Safari does not allow to access (copy) CSS generated content
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS
Version: 417.x
Hardware: Mac OS X 10.4
Importance: P2 Normal
Assignee: Nobody
URL: http://www.witch.westfalen.de/csstest...
Keywords: HasReduction
Depends on:
Blocks:
 
Reported: 2006-03-02 14:11 PST by Jutta Wrage
Modified: 2015-11-12 15:57 PST
15 users

See Also:


Attachments
minimal test case (373 bytes, text/html)
2008-01-29 23:00 PST, Robert Blaut
no flags
Demo of adding in quote marks (1.76 KB, patch)
2010-04-13 15:04 PDT, Nicholas Wilson
eric: review-
css content example (1.41 KB, text/html)
2011-04-25 09:36 PDT, Gregg Tavares
no flags

Description Jutta Wrage 2006-03-02 14:11:23 PST
If content is generated by CSS, like the URL next to the blockquotes or the quotation marks on that page, one can neither copy and paste it nor access it otherwise.

As a result, all copied citations lose their quotation marks. The URL of the citation source, generated as visible content from the cite attribute of the blockquote, must be typed in manually to access it.

It would be nice if accessing generated content worked like it does in Opera.

Jutta
Comment 1 tim bates 2006-04-29 13:48:51 PDT
I can confirm that this is the case: the German quotes on the example URL above are not selected when the user makes a drag selection starting from an angle quote.

Moreover, while any quotes lying within a selection appear to be selected, copying and pasting the selection into a text editor reveals that they are not placed on the pasteboard.

This remains true in the nightly (420+) as of 29 April 2006.
Comment 2 Dave Hyatt 2006-04-29 13:58:03 PDT
The only real solution I can think of to this problem is to actually morph the generated content into full-fledged DOM nodes when putting it on the pasteboard.
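
A minimal sketch of that idea, using a toy node model in Python rather than WebKit's real DOM or render tree. The `Node` class and its `before`/`after` fields are hypothetical stand-ins for an element's generated content; none of this is WebKit API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Toy element: 'before'/'after' stand in for CSS generated content."""
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)
    before: str = ""
    after: str = ""


def materialize(node: Node) -> Node:
    """Return a copy in which generated content has become real text children."""
    kids: List[Union[Node, str]] = []
    if node.before:
        kids.append(node.before)
    for child in node.children:
        kids.append(materialize(child) if isinstance(child, Node) else child)
    if node.after:
        kids.append(node.after)
    return Node(node.tag, kids)


def text_of(node: Node) -> str:
    """Plain text of real children only, roughly what a copy sees today."""
    return "".join(text_of(c) if isinstance(c, Node) else c for c in node.children)


# German angle quotes supplied as generated content on a q element.
quote = Node("q", ["Guten Tag"], before="\u00bb", after="\u00ab")
para = Node("p", ["Sie sagte ", quote, "."])
```

Here `text_of(para)` omits the quotes, while `text_of(materialize(para))` includes them. The original tree is left untouched, so only the copy destined for the pasteboard changes.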
Comment 3 mitz 2006-04-29 15:15:46 PDT
innerText suffers from the same problem.
Comment 4 Robert Burns 2007-02-28 13:33:53 PST
This should depend on which pasteboard type we're talking about. Per David H's remark above, the DOM should not be changed for the WebArchive pasteboard type; however, the RichText and Text pasteboard types should always include the generated content, just as the styles applied to the elements should be included in the RichText pasteboard.

Once again, for WebArchives, the DOM should not be altered at all. The stylesheets should be included in the WebArchive, along with the main resource and other resources unchanged.
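
The distinction could be sketched roughly as follows. This is an illustration in Python only: the `PasteboardType` names and the `flatten` helper are hypothetical, not the real pasteboard type identifiers or any WebKit function.

```python
from enum import Enum


class PasteboardType(Enum):
    # Hypothetical labels for the three flavours discussed above.
    WEB_ARCHIVE = "webarchive"
    RICH_TEXT = "richtext"
    PLAIN_TEXT = "text"


def includes_generated_content(kind: PasteboardType) -> bool:
    """Web archives keep the DOM untouched; the text flavours get generated content."""
    return kind is not PasteboardType.WEB_ARCHIVE


def flatten(text: str, before: str, after: str, kind: PasteboardType) -> str:
    """Text written for one element whose generated content is before/after."""
    if includes_generated_content(kind):
        return f"{before}{text}{after}"
    # The archive carries the stylesheet, so the quotes reappear on rendering.
    return text
```

Under these assumptions, `flatten("Zitat", "\u00bb", "\u00ab", PasteboardType.PLAIN_TEXT)` carries the quotes, while the `WEB_ARCHIVE` case returns the element text unchanged.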
Comment 5 Robert Blaut 2008-01-29 23:00:43 PST
Created attachment 18789 [details]
minimal test case
Comment 6 Nicholas Wilson 2010-04-13 15:04:01 PDT
Created attachment 53285 [details]
Demo of adding in quote marks

I am unconvinced generally that generated content should be copied. It is, after all, presentational and not to do with the content. The semantic markup ought to be enough to determine most of the plain text representation, though I guess it makes sense to 'lynx-ify' it a bit. Indeed, with most generated content it is intended that it not be copied: think of all the fancy ¶ (pilcrow) signs after paragraphs, or 'content: "#" attr(id)' used on headings, or line numbers on code. Those are presentational aspects only intended for the rendered presentation (media: screen), not a plain text one.

On the other hand, q is a bit different, because the instruction to include quotation marks is not just in the CSS presentational instruction, but is also a genuine command embedded in the semantics of the element:

"Quotation punctuation (such as quotation marks) that is quoting the contents of the element [that is, q] must not appear immediately before, after, or inside q elements; they *will be* inserted into the rendering by the user agent."[HTML5]

I would suggest that "WebKit does not copy generated content" is not really what the bug is about. What is missing is that the plain text representation of a Range should include quote marks around q elements.

The solution then ought not to involve treating some general case and transforming data from the render tree to use for pasting. It is instead more 'correct' just to stick with pasting real DOM content, as we are already doing, and add in a rule to appropriately insert quotation marks as UAs are instructed.

The q tag is included in the HTML version put on the paste buffer, so other applications picking it up from there should follow the same logic and apply the quote marks themselves. Note however that popular applications do not do that (OO.o for example applies a sort of default style when it transforms pasted HTML into its own formatting, but that does not include quote marks around q; not our bug really though).

In any case, it has taken longer for me to type this comment than to type the patch, which is a rough demonstration of how it would work, as I have not had time to work out all of the cases TextIterator deals with. It is a basic five-line fix in the immediate term though.

The big blocker to doing this properly is bug 6503 which prevents this being properly language dependent. Then the proper UA style can be applied (bug 3234).
Comment 7 WebKit Review Bot 2010-04-13 15:07:42 PDT
Attachment 53285 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/ChangeLog:6:  Line contains tab character.  [whitespace/tab] [5]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Nicholas Wilson 2010-04-13 16:50:52 PDT
Sorry to spam, but there are definitely some fancy cases in TextIterator.cpp I have not got right. The patch has some slight regressions. It is clear what I am trying to do though, but I do not know enough about the way the RTL and whitespace formatting works to not break those. If someone who knows the file or line-break model better could help, I would be grateful.

I think what we would ultimately like to do is a bit different, which is to check every node we process, doing:
  if (parent is a qTag && this is parent's first child && startOffset == 0)
    then emit the start quote
and at the end
  if (parent is a qTag && this is parent's last child && endOffset == length of this node)
    then emit the end quote

To get it really right, with correctly alternating nested quotes, a few extra chunks of code will be needed. Is this at all the right approach to be taking?
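
A rough sketch of the alternating-quote part, written in Python against a toy tree rather than TextIterator. The `Element` class, the `QUOTES` table, and the English-style quote pairs are assumptions for illustration. It only handles fully selected q elements, so the startOffset/endOffset checks above are not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Assumed English-style pairs; a real choice would be language dependent.
QUOTES = [("\u201c", "\u201d"), ("\u2018", "\u2019")]


@dataclass
class Element:
    tag: str
    children: List[Union["Element", str]] = field(default_factory=list)


def plain_text(node: Element, depth: int = 0) -> str:
    """Walk the toy tree, wrapping each q element in depth-dependent quotes."""
    is_q = node.tag == "q"
    open_q, close_q = QUOTES[depth % len(QUOTES)] if is_q else ("", "")
    # Only q elements increase the nesting depth used to pick a quote pair.
    inner_depth = depth + 1 if is_q else depth
    parts = [open_q]
    for child in node.children:
        if isinstance(child, Element):
            parts.append(plain_text(child, inner_depth))
        else:
            parts.append(child)
    parts.append(close_q)
    return "".join(parts)


tree = Element("p", ["He said ", Element("q", ["she said ", Element("q", ["hi"]), " twice"]), "."])
```

With this tree, the outer q gets double quotes and the nested q gets single quotes. Partial selections, RTL text, and whitespace collapsing would all need extra handling.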
Comment 9 Eric Seidel (no email) 2010-05-07 23:21:25 PDT
Comment on attachment 53285 [details]
Demo of adding in quote marks

The style errors will prevent this from landing, particularly the tabs.

Also, every change requires a test or explanation of why testing is impossible.  In this case we can use execCommand("copy") to test.
Comment 10 Adam Barth 2010-10-08 21:25:26 PDT
Comment on attachment 53285 [details]
Demo of adding in quote marks

This patch does not seem like the right approach.
Comment 11 Gregg Tavares 2011-04-25 09:36:28 PDT
Created attachment 90921 [details]
css content example

Here's another example. To me at least, Apple being the UX kings, 
this should be a no-brainer. The user sees the content, they should
be able to search for it and select it.

What steps will reproduce the problem?
1. view the attached page
2. Select all (Ctrl-A or Command-A)
3. Copy
4. Paste into Notepad or TextEdit

What is the expected output? 

You should see 

Chapter 1
bla bla bla bla
Chapter 2
bla bla bla bla
Chapter 3
bla bla bla bla

What do you see instead?

bla bla bla bla
bla bla bla bla
bla bla bla bla

Similarly, try to search for "Chapter".

Note: I understand that the content is in CSS, but from the user's point of view, they see the words on the screen and expect to be able to search for them, select them, and copy and paste them. This comes up in the Khronos specs, especially the WebGL spec.

http://www.khronos.org/registry/webgl/specs/latest/

Knowing there are several areas that say "non-normative", a user goes to the page to reference one. They press Ctrl-F or Cmd-F and type "non-norm", and nothing matches even though they see "non-normative" displayed on the page. That's a really bad UX.
Comment 12 Jeffrey Yasskin 2012-01-24 13:52:12 PST
At https://www.w3.org/Bugs/Public/show_bug.cgi?id=15561, Ian Hickson writes, "Copy-and-paste should copy what the user thinks it will copy, which includes generated content."

Re #6: As you mentioned, one way to let authors control which generated content is copied would be to look at the media type. On the other hand, clickable "¶" marks may not be an issue since there's not yet a standard way to add onclick handlers to generated content, so most of those are real nodes.