Bug 197535 - [GLib] Expose typed arrays in the API
Summary: [GLib] Expose typed arrays in the API
Status: ASSIGNED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Adrian Perez
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-05-02 16:00 PDT by Adrian Perez
Modified: 2019-05-27 04:38 PDT (History)
9 users (show)

See Also:


Attachments
WIP Patch v1 (7.54 KB, patch)
2019-05-02 16:05 PDT, Adrian Perez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews210 for win-future (13.45 MB, application/zip)
2019-05-02 17:45 PDT, Build Bot
no flags Details
WIP Patch v2 (13.60 KB, patch)
2019-05-08 05:47 PDT, Adrian Perez
aperez: review?
ews: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews213 for win-future (13.47 MB, application/zip)
2019-05-09 02:06 PDT, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Adrian Perez 2019-05-02 16:00:33 PDT
Typed arrays are necessary to pass chunks of data i.e. a byte buffer
as Uint8Array between the C/C++ and JavaScript worlds. Using strings
is not feasible for many uses because where the representation of
the data in memory must be the same, bit-by-bit, on both sides.
Comment 1 Adrian Perez 2019-05-02 16:05:07 PDT
Created attachment 368836 [details]
WIP Patch v1


Here goes an initial version of the patch with the proposed API
additions. It still needs API tests and documentation, at least.
Comment 2 Build Bot 2019-05-02 16:07:43 PDT
Attachment 368836 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1475:  jsc_value_typed_array__from_data is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1481:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1488:  jsc_value_is_typed_array is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1502:  jsc_value_typed_array_get_length is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1520:  jsc_value_typed_array_get_contents is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
WARNING: File exempt from style guide. Skipping: "Source/JavaScriptCore/API/glib/JSCValue.h"
Total errors found: 5 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Michael Catanzaro 2019-05-02 16:29:28 PDT
The following API are missing documentation:
	JSCTypedArrayType
	jsc_value_is_typed_array
	jsc_value_typed_array_get_contents
	jsc_value_typed_array_get_length
	jsc_value_typed_array_get_type
	jsc_value_typed_array_new
	jsc_value_typed_array_new_from_data
Comment 4 Build Bot 2019-05-02 17:45:32 PDT
Comment on attachment 368836 [details]
WIP Patch v1

Attachment 368836 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/12069414

New failing tests:
security/contentSecurityPolicy/video-with-file-url-allowed-by-media-src-star.html
Comment 5 Build Bot 2019-05-02 17:45:34 PDT
Created attachment 368852 [details]
Archive of layout-test-results from ews210 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews210  Port: win-future  Platform: CYGWIN_NT-10.0-17763-3.0.5-338.x86_64-x86_64-64bit
Comment 6 Adrian Perez 2019-05-03 01:26:22 PDT
(In reply to Michael Catanzaro from comment #3)
> The following API are missing documentation:
> 	JSCTypedArrayType
> 	jsc_value_is_typed_array
> 	jsc_value_typed_array_get_contents
> 	jsc_value_typed_array_get_length
> 	jsc_value_typed_array_get_type
> 	jsc_value_typed_array_new
> 	jsc_value_typed_array_new_from_data

Of course, I am perfectly aware; I just wanted to share an initial
WIP version of the patch to get the discussion started about the
new API and avoid writing documentation for functions which might
get changed or removed before being rubber-stamped ;-)

Now, regarding the API, reviewers may be wondering why GBytes,
GArray and/or GByteArray are *not* used. The reason is that their
semantics do not match typed arrays well: 

 - Typed arrays are fixed size, which discards usage of GArray
   and GByteArray. These GLib types allow the backing memory
   buffer to be resized e.g. via g_array_append_val() or
   g_byte_array_set_size().

 - While GBytes seems like a good match on the surface, the
   expectation is that the contents of the backing memory buffer
   are immutable. This does not hold true for a JS typed array,
   where individual elements can be mutated.

Because of the reasons above, I settled on having _new_from_data()
take a pointer+size pair, and an optional GDestroyNotify. On the
other end of the spectrum, a raw pointer and the size of the backing
buffer in bytes can be obtained directly instead of having those
wrapped into some helper object — like for GMappedFile, which I
presume has the same rationale behind its _get_length() and
_get_contents() methods.

Last but not least, it will be important to note that the pointer
returned by _get_contents() may become invalid between JSC API
calls because JSC might change the underlying array buffer under
the hood. This will be something to stress in the API reference
documentation.
Comment 7 Michael Catanzaro 2019-05-03 06:09:50 PDT
Comment on attachment 368836 [details]
WIP Patch v1

View in context: https://bugs.webkit.org/attachment.cgi?id=368836&action=review

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1520
> +char* jsc_value_typed_array_get_contents(JSCValue* value, gsize* len)

Use guchar* to indicate that it's not a string.

(char*/gchar* is guaranteed to be a UTF-8 string in GObject APIs.)
Comment 8 Adrian Perez 2019-05-04 09:56:49 PDT
(In reply to Michael Catanzaro from comment #7)
> Comment on attachment 368836 [details]
> WIP Patch v1
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=368836&action=review
> 
> > Source/JavaScriptCore/API/glib/JSCValue.cpp:1520
> > +char* jsc_value_typed_array_get_contents(JSCValue* value, gsize* len)
> 
> Use guchar* to indicate that it's not a string.
> 
> (char*/gchar* is guaranteed to be a UTF-8 string in GObject APIs.)

Ah, this is a good point, let's change the types to guchar* =)

I have been testing out the new API myself and in general it works
nicely. The only modification I think that would be good to have
is adding an “user_data” parameter to the function that creates a
new typed array from an existing data buffer, like the following:

  JSCValue*
  jsc_value_new_typed_array_from_data (JSCContext       *context,
                                       JSCTypedArrayType type,
                                       gpointer          data,
                                       gsize             data_bytes,
                                       GDestroyNotify    destroy_notify,
                                       gpointer          user_data);

With this modification, what would be passed to the “destroy_notify”
callback would be “user_data”, instead of “data”. This is desirable
because the “data” buffer might belong to some other structure — note
that passing the same value for “data” and “user_data” would still
achieve the goal of passing “data” to the “destroy_notify” callback.

If you want a concrete example where “user_data” actually needs to be
different, consider this example (sans error checking, to keep it brief):

  JSCValue*
  js_map_file (const char* file_path) {
      GMappedFile *f = g_mapped_file_new (file_path, TRUE, NULL);
      return jsc_value_new_typed_array_from_data (jsc_context_get_current (),
                                                  JSC_TYPED_ARRAY_UINT8,
                                                  g_mapped_file_get_contents (f),
                                                  g_mapped_file_get_length (f),
                                                  (GDestroyNotify) g_mapped_file_unref,
                                                  f);
  }

Note how we do *not* want the “destroy_notify” to be applied to the
data buffer pointer; in this case we want to apply “g_mapped_file_unref”
on the GMappedFile itself instead.

I will do this change, which is important for many use cases, then add
tests/documentation, and post an updated patch :-)
Comment 9 Adrian Perez 2019-05-08 05:47:05 PDT
Created attachment 369379 [details]
WIP Patch v2


This version of the patch has the following changes from v1:

- New API functions and types now have documentation.

- jsc_typed_array_new_from_data() now can take an “user_data”, which is
  what gets passed to the “destroy_notify” callback. This provides some
  needed flexibility to allow wrapping something more complex than a raw
  memory buffer as a typed array (check the documentation, which uses a
  GMappedFile to exemplify this).

- jsc_typed_array_get_contents() now returns “gpointer”. I think this is
  better than the “guchar*” suggestion by Michael, because it forces the
  users of the function to think and deliberately choose which pointer
  type they need to cast the returned value to. This is good because
  typed array contents can be elements of different types, and forcing
  users to make an explicit choice avoids a potential footgun.

The patch still needs adding unit tests for the new API, which is the
next thing I will be doing. Any feedback that reviewers can provide in
the meantimea bout the patch will be valuable (thanks!)
Comment 10 Build Bot 2019-05-08 05:49:49 PDT
Attachment 369379 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1493:  jsc_value__typed_array is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1554:  jsc_value__typed_array_from_data is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1584:  jsc_value_is_typed_array is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1608:  jsc_value_typed_array_get_length is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/JavaScriptCore/API/glib/JSCValue.cpp:1650:  jsc_value_typed_array_get_contents is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
WARNING: File exempt from style guide. Skipping: "Source/JavaScriptCore/API/glib/JSCValue.h"
Total errors found: 5 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Build Bot 2019-05-09 02:06:06 PDT
Comment on attachment 369379 [details]
WIP Patch v2

Attachment 369379 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/12141767

New failing tests:
http/tests/css/filters-on-iframes.html
Comment 12 Build Bot 2019-05-09 02:06:09 PDT
Created attachment 369480 [details]
Archive of layout-test-results from ews213 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews213  Port: win-future  Platform: CYGWIN_NT-10.0-17763-3.0.5-338.x86_64-x86_64-64bit
Comment 13 Carlos Garcia Campos 2019-05-27 04:38:47 PDT
Comment on attachment 369379 [details]
WIP Patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=369379&action=review

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1484
> + * @len: number of elements in the array

length

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1554
> +JSCValue* jsc_value_new_typed_array_from_data(JSCContext* context, JSCTypedArrayType type, gpointer data, gsize dataBytes, gpointer userData, GDestroyNotify destroyNotify)

I wonder if we should expose ArrayBuffer (we probably need to do it anyway) and then leave the destruction thing to array buffer itself, and expose only new_typed_array_from_array_buffer (not this data one).

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1557
> +

Is it possible to create a typed array with null data? and 0 size then?

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1560
> +        deallocatorContext = static_cast<TypedArrayDeallocatorContext*>(fastMalloc(sizeof(TypedArrayDeallocatorContext)));

You can use WEBKIT_DEFINE_ASYNC_DATA_STRUCT macro.

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1602
> + * Obtains the size in bytes of the memory region which holds the contents of a typed array.

This is confusing, I would expect the lenght to be the number of elements in the array, not the data size in bytes.

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1604
> + * Returns: size in bytes or zero if @value is not a typed array

Wouldn't we return 0 also if the array is empty?

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1619
> +    size_t len = JSObjectGetTypedArrayLength(jsContext, jsObject, &exception);

This is indeed the amount of elements in array.

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1627
> + * jsc_value_typed_array_get_contents:

If we expose array buffer, this would be part of the array buffer api. I think it's would be less confusing, because typed array woudl always be about elements and array buffer about the actual data and size in bytes. Note also that is possible to provide an offset.

> Source/JavaScriptCore/API/glib/JSCValue.cpp:1629
> + * @len: (nullable): if not `NULL`, location of a variable where to store the length in bytes

length. %NULL