Bug 67114 - [EFL] API to get plain text of the frame
Summary: [EFL] API to get plain text of the frame
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit EFL (show other bugs)
Version: 528+ (Nightly build)
Hardware: All Linux
: P2 Normal
Assignee: Raphael Kubo da Costa (:rakuco)
URL:
Keywords:
Depends on:
Blocks: 62034
  Show dependency treegraph
 
Reported: 2011-08-29 01:22 PDT by Grzegorz Czajkowski
Modified: 2011-09-29 15:01 PDT (History)
5 users (show)

See Also:


Attachments
proposed patch (2.44 KB, patch)
2011-08-29 01:23 PDT, Grzegorz Czajkowski
no flags Details | Formatted Diff | Diff
Patch (2.61 KB, patch)
2011-09-19 08:31 PDT, Raphael Kubo da Costa (:rakuco)
no flags Details | Formatted Diff | Diff
Follow rniwa's suggestions (2.56 KB, patch)
2011-09-28 19:57 PDT, Raphael Kubo da Costa (:rakuco)
no flags Details | Formatted Diff | Diff
Follow rniwa's suggestions and make it compile (2.57 KB, patch)
2011-09-29 07:08 PDT, Raphael Kubo da Costa (:rakuco)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Grzegorz Czajkowski 2011-08-29 01:22:17 PDT
This is an alternative API to ewk_frame_source_get. It gets plain text of the document without any HTML's tokens.
Comment 1 Grzegorz Czajkowski 2011-08-29 01:23:53 PDT
Created attachment 105472 [details]
proposed patch
Comment 2 Raphael Kubo da Costa (:rakuco) 2011-08-29 05:30:54 PDT
May I ask what are the use cases for exposing this API to the user? I don't see other ports calling outerText, and those who call innerText usually do that for DumpRenderTree only.

One other thing is that bug 66876 adds a similar private function (ewk_frame_inner_text_get); would that solve your needs if it was not private?
Comment 3 Grzegorz Czajkowski 2011-08-29 08:00:30 PDT
(In reply to comment #2)
> May I ask what are the use cases for exposing this API to the user? I don't see other ports calling outerText, and those who call innerText usually do that for DumpRenderTree only.

It's needed to save HTML Pages by application like email client as plain text.

> 
> One other thing is that bug 66876 adds a similar private function (ewk_frame_inner_text_get); would that solve your needs if it was not private?

I agree that these functions are similar but implementation is a little different.
Comment 4 Raphael Kubo da Costa (:rakuco) 2011-08-30 08:48:50 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > One other thing is that bug 66876 adds a similar private function (ewk_frame_inner_text_get); would that solve your needs if it was not private?
> 
> I agree that these functions are similar but implementation is a little different.

The only noteworthy difference I see is that bug 66876 calls innerText() for the whole document, whereas you only call it for the body. Do the results differ too much? The function (either mine or yours) could accept a parameter specifying whether document or body should be used, if that's the case.
Comment 5 Grzegorz Czajkowski 2011-09-02 06:09:53 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > One other thing is that bug 66876 adds a similar private function (ewk_frame_inner_text_get); would that solve your needs if it was not private?
> > 
> > I agree that these functions are similar but implementation is a little different.
> 
> The only noteworthy difference I see is that bug 66876 calls innerText() for the whole document, whereas you only call it for the body. Do the results differ too much? The function (either mine or yours) could accept a parameter specifying whether document or body should be used, if that's the case.

Hi Kubo,
Let me perform some tests with ewk_frame_inner_text_get() and I will leave a feedback if we really need body param in your implementation.
Comment 6 Kenneth Rohde Christiansen 2011-09-12 15:56:31 PDT
Comment on attachment 105472 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=105472&action=review

> Source/WebKit/efl/ChangeLog:9
> +        This is an alternative API to ewk_frame_source_get. It gets plain text
> +        of the frame without any HTML's tokens.

What is that needed for? I mean is there an actual use case?
Comment 7 Raphael Kubo da Costa (:rakuco) 2011-09-12 18:11:11 PDT
(In reply to comment #6)
> What is that needed for? I mean is there an actual use case?

Please see comments #2 and #3.
Comment 8 Grzegorz Czajkowski 2011-09-15 07:14:57 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > (In reply to comment #2)
> > > > One other thing is that bug 66876 adds a similar private function (ewk_frame_inner_text_get); would that solve your needs if it was not private?
> > > 
> > > I agree that these functions are similar but implementation is a little different.
> > 
> > The only noteworthy difference I see is that bug 66876 calls innerText() for the whole document, whereas you only call it for the body. Do the results differ too much? The function (either mine or yours) could accept a parameter specifying whether document or body should be used, if that's the case.
> 
> Hi Kubo,
> Let me perform some tests with ewk_frame_inner_text_get() and I will leave a feedback if we really need body param in your implementation.

Sorry for my late answer.
There is no any difference between results of ewk_frame_inner_text_get and ewk_frame_plain_text_get. You can make your function as public and then this bug should be marked as duplicated.
Comment 9 Raphael Kubo da Costa (:rakuco) 2011-09-15 14:07:02 PDT
(In reply to comment #8)
> There is no any difference between results of ewk_frame_inner_text_get and ewk_frame_plain_text_get. You can make your function as public and then this bug should be marked as duplicated.

Due to the course of the recent events (the other patch being rejected with the suggestion that a DumpRenderTreeSupportEfl should be implemented), I think it's now better to use this bug for this specific feature. I'll work on a marge of both of our patches and submit it here for review. For now, I'll clear the flags for the current patch.
Comment 10 Raphael Kubo da Costa (:rakuco) 2011-09-19 08:31:49 PDT
Created attachment 107859 [details]
Patch
Comment 11 Raphael Kubo da Costa (:rakuco) 2011-09-19 08:33:44 PDT
(In reply to comment #10)
> Created an attachment (id=107859) [details]

FWIW, this newer version puts the '*' in pointers on the left side again, as being discussed in bug 68231.
Comment 12 Ryosuke Niwa 2011-09-28 19:39:23 PDT
Comment on attachment 107859 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=107859&action=review

> Source/WebKit/efl/ewk/ewk_frame.cpp:1590
> +char* ewk_frame_plain_text_get(const Evas_Object* o)

Please don't use one-letter variable.

> Source/WebKit/efl/ewk/ewk_frame.cpp:1596
> +    if (sd->frame->view() && sd->frame->view()->layoutPending())
> +        sd->frame->view()->layout();

innerText automatically calls updateLayoutIgnorePendingStylesheets.

> Source/WebKit/efl/ewk/ewk_frame.cpp:1598
> +    WebCore::Element* documentElement = sd->frame->document()->documentElement();

I don't think we should assume that the frame has a document.
Comment 13 Raphael Kubo da Costa (:rakuco) 2011-09-28 19:57:27 PDT
Created attachment 109115 [details]
Follow rniwa's suggestions
Comment 14 Gyuyoung Kim 2011-09-29 01:10:51 PDT
Comment on attachment 109115 [details]
Follow rniwa's suggestions

Attachment 109115 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/9898030
Comment 15 Gyuyoung Kim 2011-09-29 01:30:19 PDT
(In reply to comment #12)
> (From update of attachment 107859 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=107859&action=review
> 
> > Source/WebKit/efl/ewk/ewk_frame.cpp:1590
> > +char* ewk_frame_plain_text_get(const Evas_Object* o)
> 
> Please don't use one-letter variable.

Hello Ryosuke,

one-letter variable is efl coding style. However, we are trying to be compliant with webkit coding style in Bug 68209. This one-letter variable is also one of discussing items. I'm going to file a bug for this one-letter variable, then I would like to fix all one-letter variable at a time using a script(e.g uncrustify) after discussing efl contributors enough. So, could you accept one-letter variable in this patch for now ?
Comment 16 Gyuyoung Kim 2011-09-29 01:59:38 PDT
I file a Bug 69073 for one-letter variable.
Comment 17 Raphael Kubo da Costa (:rakuco) 2011-09-29 07:08:03 PDT
Created attachment 109158 [details]
Follow rniwa's suggestions and make it compile
Comment 18 Raphael Kubo da Costa (:rakuco) 2011-09-29 07:11:02 PDT
(In reply to comment #15)
> Hello Ryosuke,

He's not on the CC list ;)

> So, could you accept one-letter variable in this patch for now ?

I think the way to go here is whatever makes the reviewer happier -- even if this case of a one-letter variable is fixed here, uncrustify wouldn't break because of that.
Comment 19 Ryosuke Niwa 2011-09-29 11:48:59 PDT
Comment on attachment 109158 [details]
Follow rniwa's suggestions and make it compile

View in context: https://bugs.webkit.org/attachment.cgi?id=109158&action=review

> Source/WebKit/efl/ewk/ewk_frame.cpp:1592
> +    EWK_FRAME_SD_GET_OR_RETURN(ewkFrame, sd, 0);

sd? Is this a commonly-used / well-known term?

> Source/WebKit/efl/ewk/ewk_frame.cpp:1595
> +    if (!sd->frame->document())

You can probably cache Document* here.
Comment 20 Ryosuke Niwa 2011-09-29 13:59:21 PDT
Comment on attachment 109158 [details]
Follow rniwa's suggestions and make it compile

View in context: https://bugs.webkit.org/attachment.cgi?id=109158&action=review

>> Source/WebKit/efl/ewk/ewk_frame.cpp:1592
>> +    EWK_FRAME_SD_GET_OR_RETURN(ewkFrame, sd, 0);
> 
> sd? Is this a commonly-used / well-known term?

Apparently it is.
Comment 21 WebKit Review Bot 2011-09-29 15:01:43 PDT
Comment on attachment 109158 [details]
Follow rniwa's suggestions and make it compile

Clearing flags on attachment: 109158

Committed r96362: <http://trac.webkit.org/changeset/96362>
Comment 22 WebKit Review Bot 2011-09-29 15:01:48 PDT
All reviewed patches have been landed.  Closing bug.