WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
129626
WebGL slow video to texture
https://bugs.webkit.org/show_bug.cgi?id=129626
Summary
WebGL slow video to texture
Florian Bösch
Reported
2014-03-03 15:03:20 PST
The performance of texImage2D and texSubImage2D calls passing a video is very slow (+4ms) for most video sizes (+240p). The linked in test can record the timings these uploads and the framerate, and it contains historical measurements across a variety of browsers and platforms.
Attachments
Patch
(16.10 KB, patch)
2015-05-11 10:29 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Patch
(17.34 KB, patch)
2015-05-11 10:47 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews100 for mac-mavericks
(593.19 KB, application/zip)
2015-05-11 11:41 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews107 for mac-mavericks-wk2
(606.06 KB, application/zip)
2015-05-11 11:48 PDT
,
Build Bot
no flags
Details
Patch
(20.40 KB, patch)
2015-05-11 13:11 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews101 for mac-mavericks
(526.10 KB, application/zip)
2015-05-11 14:04 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews106 for mac-mavericks-wk2
(565.61 KB, application/zip)
2015-05-11 14:10 PDT
,
Build Bot
no flags
Details
Patch
(20.40 KB, patch)
2015-05-11 14:53 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews100 for mac-mavericks
(525.99 KB, application/zip)
2015-05-11 15:46 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews104 for mac-mavericks-wk2
(561.33 KB, application/zip)
2015-05-11 15:52 PDT
,
Build Bot
no flags
Details
Patch
(30.52 KB, patch)
2015-08-20 16:58 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Patch
(30.47 KB, patch)
2015-08-21 09:53 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews101 for mac-mavericks
(632.56 KB, application/zip)
2015-08-21 10:30 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews104 for mac-mavericks-wk2
(620.06 KB, application/zip)
2015-08-21 10:31 PDT
,
Build Bot
no flags
Details
Patch
(28.64 KB, patch)
2015-09-25 15:34 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews100 for mac-mavericks
(733.16 KB, application/zip)
2015-09-25 15:56 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews104 for mac-mavericks-wk2
(664.34 KB, application/zip)
2015-09-25 16:16 PDT
,
Build Bot
no flags
Details
Patch
(28.69 KB, patch)
2015-09-25 23:23 PDT
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Patch
(28.64 KB, patch)
2015-09-28 13:11 PDT
,
Jer Noble
dino
: review+
Details
Formatted Diff
Diff
Patch for landing
(28.51 KB, patch)
2015-11-18 16:40 PST
,
Jer Noble
no flags
Details
Formatted Diff
Diff
Show Obsolete
(14)
View All
Add attachment
proposed patch, testcase, etc.
Florian Bösch
Comment 1
2014-08-09 03:54:58 PDT
Retested with Chrome 36.0.1985.125, Firefox Aurora 33.0a2, Safari 6.1.4 and Internet Explorer 11 TL;DR Firefox now outperforms Chrome by around 600% on OSX and Linux. Safari has the worst performance and worst video format support. Summary of result tables, for full results please see:
http://codeflow.org/issues/slow_video_to_texture/
Ubuntu: FPS UPms FPS UPms FPS UPms Chrome 36 1080p 46.42 8.08 46.22 7.97 46.72 7.80 Chrome 35 1080p 45.28 8.02 45.10 8.03 45.39 7.98 Firefox Aurora 1080p 60.04 1.00 60.04 1.14 60.02 1.18 OSX: FPS UPms FPS UPms FPS UPms Chrome 36 1080p 59.56 9.91 59.72 9.96 59.78 9.91 Chrome 35 1080p 59.77 13.47 58.88 14.51 59.85 13.69 Firefox Aurora 1080p N/A N/A 59.50 1.54 59.07 1.50 Safari 6.1.4 1080p 20.07 34.94 N/A N/A N/A N/A Windows 8.1: FPS UPms FPS UPms FPS UPms Chrome 36 1080p 51.22 11.53 59.53 7.16 59.23 7.23 Firefox Aurora 1080p 57.14 9.55 56.89 9.41 IE11 video not supported for upload Conclusion: Firefox has made good efforts in its Aurora build (scheduled to be beta early September, and release early October) on OSX and Linux, these now outperform Chrome by a substantial margin (by ~600%) and pushed upload times into the range of 1-2ms. Chrome 36 has also made some gains on Chrome 35 on OSX (by ~45%). Firefox has yet to implement their improvements on Windows and has also got to deal with bugs in their h.264 support on Windows. Safari Issues: - Only H.264 is supported - The H.264 support is very slow and choppy
Jer Noble
Comment 2
2015-05-11 10:29:35 PDT
Created
attachment 252863
[details]
Patch
Jer Noble
Comment 3
2015-05-11 10:47:22 PDT
Created
attachment 252865
[details]
Patch
Build Bot
Comment 4
2015-05-11 11:41:19 PDT
Comment on
attachment 252865
[details]
Patch
Attachment 252865
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/4992284621799424
New failing tests: webgl/1.0.2/conformance/textures/tex-image-and-uniform-binding-bugs.html webgl/1.0.2/conformance/textures/texture-npot-video.html fast/canvas/webgl/tex-image-and-uniform-binding-bugs.html
Build Bot
Comment 5
2015-05-11 11:41:21 PDT
Created
attachment 252875
[details]
Archive of layout-test-results from ews100 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-mavericks Platform: Mac OS X 10.9.5
Build Bot
Comment 6
2015-05-11 11:48:33 PDT
Comment on
attachment 252865
[details]
Patch
Attachment 252865
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/6589910987309056
New failing tests: webgl/1.0.2/conformance/textures/tex-image-and-uniform-binding-bugs.html webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 7
2015-05-11 11:48:35 PDT
Created
attachment 252877
[details]
Archive of layout-test-results from ews107 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Jer Noble
Comment 8
2015-05-11 13:11:51 PDT
Created
attachment 252886
[details]
Patch Added iOS support; fixed a crash in the conformance tests.
Build Bot
Comment 9
2015-05-11 14:04:02 PDT
Comment on
attachment 252886
[details]
Patch
Attachment 252886
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/6631852953567232
New failing tests: webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 10
2015-05-11 14:04:05 PDT
Created
attachment 252891
[details]
Archive of layout-test-results from ews101 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-mavericks Platform: Mac OS X 10.9.5
Build Bot
Comment 11
2015-05-11 14:10:13 PDT
Comment on
attachment 252886
[details]
Patch
Attachment 252886
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/5513033837182976
New failing tests: webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 12
2015-05-11 14:10:16 PDT
Created
attachment 252892
[details]
Archive of layout-test-results from ews106 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Jer Noble
Comment 13
2015-05-11 14:53:44 PDT
Created
attachment 252893
[details]
Patch
Build Bot
Comment 14
2015-05-11 15:46:10 PDT
Comment on
attachment 252893
[details]
Patch
Attachment 252893
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/5106708993015808
New failing tests: webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 15
2015-05-11 15:46:12 PDT
Created
attachment 252898
[details]
Archive of layout-test-results from ews100 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-mavericks Platform: Mac OS X 10.9.5
Jon Lee
Comment 16
2015-05-11 15:48:30 PDT
Comment on
attachment 252893
[details]
Patch Tests?
Build Bot
Comment 17
2015-05-11 15:52:40 PDT
Comment on
attachment 252893
[details]
Patch
Attachment 252893
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/5505874126700544
New failing tests: webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 18
2015-05-11 15:52:43 PDT
Created
attachment 252900
[details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Jer Noble
Comment 19
2015-08-12 22:53:54 PDT
Florian, your server does not support HTTP Range Requests; as such, WebKit falls back to 32-bit QuickTime to load, decode, and display these movies. This may go a long way towards explaining why WebKit seems to take so much longer to upload images.
Florian Bösch
Comment 20
2015-08-13 00:09:46 PDT
(In reply to
comment #19
)
> Florian, your server does not support HTTP Range Requests; as such, WebKit > falls back to 32-bit QuickTime to load, decode, and display these movies. > This may go a long way towards explaining why WebKit seems to take so much > longer to upload images.
It's true that my server does not support range requests. However, two points about that. 1) So you're saying that to performantly put video into WebGL on the client you have to serve video with range requests. Or do you rather mean to express that this is a bug that you haven't filed yet? Do you want me to file it? I'd be happy to file "WebKit does not performantly handle video if the server does not support http range requests". Unfortunately that will then attract half a dozen other bug tickets, see point #2. 2) Browser behavior regarding to requests for video in regards to ranges is highly inconsistent, non http conformant and frequently leads to videos breaking. Actually supporting range requests for videos at this time is inadvisable, and the official recommendation I've gotten by Google and Apple is to use MSE to handle streaming video and not to rely on http range requests.
Jer Noble
Comment 21
2015-08-13 08:20:44 PDT
(In reply to
comment #20
)
> (In reply to
comment #19
) > > Florian, your server does not support HTTP Range Requests; as such, WebKit > > falls back to 32-bit QuickTime to load, decode, and display these movies. > > This may go a long way towards explaining why WebKit seems to take so much > > longer to upload images. > > It's true that my server does not support range requests. However, two > points about that. > > 1) So you're saying that to performantly put video into WebGL on the client > you have to serve video with range requests. Or do you rather mean to > express that this is a bug that you haven't filed yet? Do you want me to > file it? I'd be happy to file "WebKit does not performantly handle video if > the server does not support http range requests". Unfortunately that will > then attract half a dozen other bug tickets, see point #2.
No, I'm saying that expecting fast texture upload from an out-of-process, 32-bit media stack as ancient as QuickTime is unreasonable.
> 2) Browser behavior regarding to requests for video in regards to ranges is > highly inconsistent, non http conformant and frequently leads to videos > breaking. Actually supporting range requests for videos at this time is > inadvisable, and the official recommendation I've gotten by Google and Apple > is to use MSE to handle streaming video and not to rely on http range > requests.
That is entirely misinformed, and I HIGHLY doubt anyone at Apple told you to use MSE to do streaming video. HTTP Range Request support is a necessity. Without it, media-over-HTTP will not play at all in iOS, and soon, it will not be supported on desktop Safari as well.
Florian Bösch
Comment 22
2015-08-13 08:46:41 PDT
(In reply to
comment #21
)
> No, I'm saying that expecting fast texture upload from an out-of-process, > 32-bit media stack as ancient as QuickTime is unreasonable.
Then fix it. I see no reason you should do that no matter where the bytes you consume come from. That constitutes a bug if I ever saw one. Plus, do you even have a testcase of your own to check video upload speed? Cause I kinda doubt it, seeing as this issue goes into its 4th year aniversary and nobody done jack squat about it. So if you disagree with my testcase, or sever, or whatever, go make your own, I promise it's quite an eye opening experience to see how utterly horrid the implementation across most every UA, including yours, is.
> That is entirely misinformed, and I HIGHLY doubt anyone at Apple told you to > use MSE to do streaming video. > > HTTP Range Request support is a necessity. Without it, media-over-HTTP will > not play at all in iOS, and soon, it will not be supported on desktop Safari > as well.
Well, good luck with that, don't say I didn't warn you that it's a shambles.
Jer Noble
Comment 23
2015-08-13 09:06:17 PDT
(In reply to
comment #22
)
> (In reply to
comment #21
) > > No, I'm saying that expecting fast texture upload from an out-of-process, > > 32-bit media stack as ancient as QuickTime is unreasonable. > > Then fix it.
No. QuickTime and QTKit are deprecated technologies. Support for QTKit has been disabled in WebKit Nightlies, and will be disabled in Safari 9.
> I see no reason you should do that no matter where the bytes > you consume come from. That constitutes a bug if I ever saw one. Plus, do > you even have a testcase of your own to check video upload speed? Cause I > kinda doubt it, seeing as this issue goes into its 4th year anniversary
...2nd...
> and > nobody done jack squat about it.
You do realize you're directing this at the person who is currently in the middle of fixing the very issue you're complaining about (video upload speed), right?
> So if you disagree with my testcase, or > sever, or whatever, go make your own, I promise it's quite an eye opening > experience to see how utterly horrid the implementation across most every > UA, including yours, is.
The number of loads which fall back to QTKit due to lack of server Range support is incredibly, incredibly small. So, the balance of probabilities leads me to believe that the problem is not with HTTP Range requests, but how well your server implement them.
Eric Carlson
Comment 24
2015-08-13 09:09:47 PDT
Comment on
attachment 252893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252893&action=review
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.h:242 > + virtual bool copyVideoTextureToPlatformTexture(GraphicsContext3D*, Platform3DObject, GC3Denum target, GC3Dint level, GC3Denum internalFormat, GC3Denum format, GC3Denum type, bool premultiplyAlpha, bool flipY) override;
Nit: don't need both "virtual" and "override" (the later implies the former).
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2370 > +void MediaPlayerPrivateAVFoundationObjC::destroyOpenGLVideoOutput()
This is never called.
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2382 > +bool MediaPlayerPrivateAVFoundationObjC::openGLVideoOutputHasAvailableFrame()
Nit: this and updateLastOpenGLImage should take the item time instead of both calling -[AVPlayerItem currentTime]
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2406 > +void MediaPlayerPrivateAVFoundationObjC::updateLastOpenGLImage() > +{ > + if (!m_openGLVideoOutput) > + return; > + > + CMTime currentTime = [m_avPlayerItem currentTime]; > + if (![m_openGLVideoOutput hasNewPixelBufferForItemTime:currentTime]) > + return; > + > + m_lastOpenGLImage = adoptCF([m_openGLVideoOutput copyPixelBufferForItemTime:currentTime itemTimeForDisplay:nil]); > +}
Nit: this seems unnecessary - it is never called if m_openGLVideoOutput is null or there isn't a pixel buffer for the current time. Why not just use copyPixelBufferForItemTime:itemTimeForDisplay: directly?
Florian Bösch
Comment 25
2015-08-13 09:19:16 PDT
I won't entertain more discussions of http range or not, that's got nothing whatsoever todo with tex[Sub]Image2D speed. I have clocked an optimal path for putting the texels of a video into a texture using a C++ test program quite a while ago (just consult the chrome ticket that's been open for the last 4 years). If you are uploading YCbCr (4:2:2) to the GPU from ram and then convert to RGB on GPU this should not take more than 2ms (even on fairly slow machines). And if you already have a YCbCr texture in vram, this will take approximately 10 microseconds.
Florian Bösch
Comment 26
2015-08-13 09:52:19 PDT
Btw. my server does support ranges, I just checked, so I don't know why you would say it wouldn't.
Jer Noble
Comment 27
2015-08-13 10:00:49 PDT
(In reply to
comment #26
)
> Btw. my server does support ranges, I just checked, so I don't know why you > would say it wouldn't.
> curl -I -r 0-100 '
http://codeflow.org/issues/slow_video_to_texture/mp4/720p.mp4
'
HTTP/1.1 200 OK Content-Length: 1446854 Expires: Thu, 13 Aug 2015 16:59:46 GMT Server: ftlpy 0.1 Connection: Keep-Alive ETag: 28d3d94bd36539de2218bd5beafe6d2c Cache-Control: private, max-age=0, no-cache, no-transform Date: Thu, 13 Aug 2015 16:59:46 GMT Content-Type: video/mp4 If your server supported range requests, it would have returned a '206 Partial Content' response, not a '200 OK'.
Jer Noble
Comment 28
2015-08-13 10:04:39 PDT
(In reply to
comment #24
)
> Comment on
attachment 252893
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=252893&action=review
> > > Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.h:242 > > + virtual bool copyVideoTextureToPlatformTexture(GraphicsContext3D*, Platform3DObject, GC3Denum target, GC3Dint level, GC3Denum internalFormat, GC3Denum format, GC3Denum type, bool premultiplyAlpha, bool flipY) override; > > Nit: don't need both "virtual" and "override" (the later implies the former).
Ok.
> > Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2370 > > +void MediaPlayerPrivateAVFoundationObjC::destroyOpenGLVideoOutput() > > This is never called.
Well, that's a mistake! :)
> > Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2382 > > +bool MediaPlayerPrivateAVFoundationObjC::openGLVideoOutputHasAvailableFrame() > > Nit: this and updateLastOpenGLImage should take the item time instead of > both calling -[AVPlayerItem currentTime]
Actually, see below.
> > Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2406 > > +void MediaPlayerPrivateAVFoundationObjC::updateLastOpenGLImage() > > +{ > > + if (!m_openGLVideoOutput) > > + return; > > + > > + CMTime currentTime = [m_avPlayerItem currentTime]; > > + if (![m_openGLVideoOutput hasNewPixelBufferForItemTime:currentTime]) > > + return; > > + > > + m_lastOpenGLImage = adoptCF([m_openGLVideoOutput copyPixelBufferForItemTime:currentTime itemTimeForDisplay:nil]); > > +} > > Nit: this seems unnecessary - it is never called if m_openGLVideoOutput is > null or there isn't a pixel buffer for the current time. Why not just use > copyPixelBufferForItemTime:itemTimeForDisplay: directly?
Maybe we shouldn't have a openGLVideoOutputHasAvailableFrame(), and should instead just call updateLastOpenGLImage() and check to see if m_lastOpenGLImage is non-NULL.
Florian Bösch
Comment 29
2015-08-13 10:06:38 PDT
(In reply to
comment #27
)
> (In reply to
comment #26
) > > Btw. my server does support ranges, I just checked, so I don't know why you > > would say it wouldn't. > > > curl -I -r 0-100 '
http://codeflow.org/issues/slow_video_to_texture/mp4/720p.mp4
' > HTTP/1.1 200 OK > Content-Length: 1446854 > Expires: Thu, 13 Aug 2015 16:59:46 GMT > Server: ftlpy 0.1 > Connection: Keep-Alive > ETag: 28d3d94bd36539de2218bd5beafe6d2c > Cache-Control: private, max-age=0, no-cache, no-transform > Date: Thu, 13 Aug 2015 16:59:46 GMT > Content-Type: video/mp4 > > If your server supported range requests, it would have returned a '206 > Partial Content' response, not a '200 OK'.
GET /issues/slow_video_to_texture/mp4/720p.mp4 HTTP/1.1 Host: codeflow.org Connection: keep-alive Pragma: no-cache Cache-Control: no-cache Accept-Encoding: identity;q=1, *;q=0 User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36 Accept: */* Referer:
http://codeflow.org/issues/slow_video_to_texture/mp4/720p.mp4
Accept-Language: en-US,en;q=0.8,de;q=0.6 Cookie: __utma=240947345.1694194387.1439291172.1439291172.1439291172.1; __utmc=240947345; __utmz=240947345.1439291172.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none) Range: bytes=0- HTTP/1.1 206 Partial Content Content-Length: 1446854 Accept-Ranges: bytes Expires: Thu, 13 Aug 2015 17:04:42 GMT Content-Range: bytes 0-1446853/1446854 Server: ftlpy 0.1 Connection: Keep-Alive ETag: 28d3d94bd36539de2218bd5beafe6d2c Cache-Control: private, max-age=0, no-cache, no-transform Date: Thu, 13 Aug 2015 17:04:42 GMT Content-Type: video/mp4 It just doesn't do it on HEAD because I've seen no UA do that for video.
Jer Noble
Comment 30
2015-08-13 10:07:32 PDT
(In reply to
comment #25
)
> I won't entertain more discussions of http range or not, that's got nothing > whatsoever todo with tex[Sub]Image2D speed.
Well, except that once this bug is fixed, you'll still need Byte Range support enabled on your server to see it work.
> I have clocked an optimal path for putting the texels of a video into a > texture using a C++ test program quite a while ago (just consult the chrome > ticket that's been open for the last 4 years). > > If you are uploading YCbCr (4:2:2) to the GPU from ram and then convert to > RGB on GPU this should not take more than 2ms (even on fairly slow > machines). And if you already have a YCbCr texture in vram, this will take > approximately 10 microseconds.
That assumes that the image starts off on the CPU. Modern hardware decoders will write directly to the GPU, so your "optimal" path would necessitate a GPU readback, even when the decoded image was already in RGB.
Florian Bösch
Comment 31
2015-08-13 10:09:39 PDT
(In reply to
comment #30
)
> That assumes that the image starts off on the CPU. Modern hardware decoders > will write directly to the GPU, so your "optimal" path would necessitate a > GPU readback, even when the decoded image was already in RGB.
It assumes that you either have it on CPU already, or recognizes that you don't, in which case it's even faster because it's already on the GPU and you can do a GPU -> GPU transfer. In either case, whatever it assumes, implementations of this are bad, and have been, and still are.
Jer Noble
Comment 32
2015-08-13 10:10:20 PDT
(In reply to
comment #29
)
> GET /issues/slow_video_to_texture/mp4/720p.mp4 HTTP/1.1 > Host: codeflow.org > Connection: keep-alive > Pragma: no-cache > Cache-Control: no-cache > Accept-Encoding: identity;q=1, *;q=0 > User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like > Gecko) Chrome/44.0.2403.155 Safari/537.36 > Accept: */* > Referer:
http://codeflow.org/issues/slow_video_to_texture/mp4/720p.mp4
> Accept-Language: en-US,en;q=0.8,de;q=0.6 > Cookie: __utma=240947345.1694194387.1439291172.1439291172.1439291172.1; > __utmc=240947345; > __utmz=240947345.1439291172.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none) > Range: bytes=0- > > HTTP/1.1 206 Partial Content > Content-Length: 1446854 > Accept-Ranges: bytes > Expires: Thu, 13 Aug 2015 17:04:42 GMT > Content-Range: bytes 0-1446853/1446854
Try with a actual sub-range.
> curl -O -r 0-100 '
http://codeflow.org/issues/slow_video_to_texture/mp4/720p.mp4
'
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1412k 100 1412k 0 0 403k 0 0:00:03 0:00:03 --:--:-- 403k A request for range 0-100 returns the entire file. This is not compliant behavior.
Jer Noble
Comment 33
2015-08-13 10:19:16 PDT
(In reply to
comment #31
)
> (In reply to
comment #30
) > > That assumes that the image starts off on the CPU. Modern hardware decoders > > will write directly to the GPU, so your "optimal" path would necessitate a > > GPU readback, even when the decoded image was already in RGB. > > It assumes that you either have it on CPU already, or recognizes that you > don't, in which case it's even faster because it's already on the GPU and > you can do a GPU -> GPU transfer.
You'll note, in the patch attached to this bug, that's exactly what we'll do.
Jer Noble
Comment 34
2015-08-13 11:07:03 PDT
Here are the results from Florian's test case with the current WIP patch: upload resolution frame/s upload time (ms) texImage2D 240p 57.64 0.89 texImage2D 480p 55.63 1.28 texImage2D 720p 45.82 2.18 texImage2D 1080p 53.94 1.76 texSubImage2D 240p 56.73 1.09 texSubImage2D 480p 36.82 1.06 texSubImage2D 720p 55.74 1.23 texSubImage2D 1080p 54.55 2.06 texImage2D 240p 55.81 1.39 texImage2D 480p 56.29 0.94 texImage2D 720p 55.61 1.14 texImage2D 1080p 54.81 1.83 texSubImage2D 240p 56.26 1.25 texSubImage2D 480p 56.28 0.95 texSubImage2D 720p 55.29 1.39 texSubImage2D 1080p 54.03 2.10 There may be a few more ms of efficiency to squeeze out, but it's generally very much improved.
Klaus Reinfeld
Comment 35
2015-08-13 12:12:01 PDT
That GPU to GPU patch sounds good! Is there a chance that it might be included in the final iOS 9 release? (I know - no comments on future releases, but maybe a 'maybe'? :-))
Florian Bösch
Comment 36
2015-08-13 12:37:42 PDT
(In reply to
comment #34
)
> Here are the results from Florian's test case with the current WIP patch: > > upload resolution frame/s upload time (ms) > texImage2D 240p 57.64 0.89 > texImage2D 480p 55.63 1.28 > texImage2D 720p 45.82 2.18 > texImage2D 1080p 53.94 1.76 > texSubImage2D 240p 56.73 1.09 > texSubImage2D 480p 36.82 1.06 > texSubImage2D 720p 55.74 1.23 > texSubImage2D 1080p 54.55 2.06 > texImage2D 240p 55.81 1.39 > texImage2D 480p 56.29 0.94 > texImage2D 720p 55.61 1.14 > texImage2D 1080p 54.81 1.83 > texSubImage2D 240p 56.26 1.25 > texSubImage2D 480p 56.28 0.95 > texSubImage2D 720p 55.29 1.39 > texSubImage2D 1080p 54.03 2.10 > > There may be a few more ms of efficiency to squeeze out, but it's generally > very much improved.
Why is the FPS not 60 if the upload time is so small? You're probably still taking upwards of 16ms for a video frame upload. I daresay, there's more than a "few" improvements you can squeeze out there, like approximately 1200%.
Jer Noble
Comment 37
2015-08-13 14:02:49 PDT
(In reply to
comment #36
)
> Why is the FPS not 60 if the upload time is so small? You're probably still > taking upwards of 16ms for a video frame upload. I daresay, there's more > than a "few" improvements you can squeeze out there, like approximately > 1200%.
Commenting out the call to texImage2D and re-running the test least to frame rates which hover around 57fps; so there is some inefficiency outside of video texture upload, and out of scope of this bug, to figure out.
Florian Bösch
Comment 38
2015-08-13 14:05:56 PDT
(In reply to
comment #37
)
> (In reply to
comment #36
) > > Why is the FPS not 60 if the upload time is so small? You're probably still > > taking upwards of 16ms for a video frame upload. I daresay, there's more > > than a "few" improvements you can squeeze out there, like approximately > > 1200%. > > Commenting out the call to texImage2D and re-running the test least to frame > rates which hover around 57fps; so there is some inefficiency outside of > video texture upload, and out of scope of this bug, to figure out.
There's a stripped down simple version of the test here:
http://codeflow.org/issues/slow_video_to_texture/simple.html
that you can use to compare against the more complex one.
Jer Noble
Comment 39
2015-08-13 14:35:01 PDT
If you modify your Statistics object to return a weighted moving average (that gives more weight to recent entries), the FPS averages for each test rapidly converge on ~59.7 FPS. This implies that the slowness is almost entirely in startup of each test. And indeed, we have some code which, if no GPU frames are available, blocks until a CPU frame can be generated. This was originally added for (canvas and WebGL) correctness reasons; and we are indeed hitting that path at the beginning of each of your tests.
Florian Bösch
Comment 40
2015-08-13 22:01:40 PDT
(In reply to
comment #39
)
> If you modify your Statistics object to return a weighted moving average > (that gives more weight to recent entries), the FPS averages for each test > rapidly converge on ~59.7 FPS. This implies that the slowness is almost > entirely in startup of each test. And indeed, we have some code which, if no > GPU frames are available, blocks until a CPU frame can be generated. This > was originally added for (canvas and WebGL) correctness reasons; and we are > indeed hitting that path at the beginning of each of your tests.
You can modify the simple test to use a rolling average or start measuring only N frames in or something. Anyway, if you want to know how much better you've gotten beyond satisfying 60fps and a low upload time (which is somewhat unreliable because of asynchronicity) you should repeat the upload multiple times until you drop FPS.
Jer Noble
Comment 41
2015-08-20 16:58:05 PDT
Created
attachment 259527
[details]
Patch
Jer Noble
Comment 42
2015-08-21 09:53:19 PDT
Created
attachment 259622
[details]
Patch
Build Bot
Comment 43
2015-08-21 10:30:31 PDT
Comment on
attachment 259622
[details]
Patch
Attachment 259622
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/86322
New failing tests: webgl/1.0.2/conformance/textures/gl-teximage.html webgl/1.0.2/conformance/textures/texture-npot-video.html fast/canvas/webgl/gl-teximage.html webgl/1.0.2/conformance/context/premultiplyalpha-test.html fast/canvas/webgl/premultiplyalpha-test.html
Build Bot
Comment 44
2015-08-21 10:30:33 PDT
Created
attachment 259627
[details]
Archive of layout-test-results from ews101 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-mavericks Platform: Mac OS X 10.9.5
Build Bot
Comment 45
2015-08-21 10:31:15 PDT
Comment on
attachment 259622
[details]
Patch
Attachment 259622
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/86326
New failing tests: webgl/1.0.2/conformance/textures/gl-teximage.html webgl/1.0.2/conformance/textures/texture-npot-video.html fast/canvas/webgl/gl-teximage.html webgl/1.0.2/conformance/context/premultiplyalpha-test.html fast/canvas/webgl/premultiplyalpha-test.html
Build Bot
Comment 46
2015-08-21 10:31:18 PDT
Created
attachment 259629
[details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Jer Noble
Comment 47
2015-09-25 15:34:42 PDT
Created
attachment 261948
[details]
Patch
Build Bot
Comment 48
2015-09-25 15:56:45 PDT
Comment on
attachment 261948
[details]
Patch
Attachment 261948
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/212076
New failing tests: webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 49
2015-09-25 15:56:48 PDT
Created
attachment 261950
[details]
Archive of layout-test-results from ews100 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-mavericks Platform: Mac OS X 10.9.5
Build Bot
Comment 50
2015-09-25 16:16:03 PDT
Comment on
attachment 261948
[details]
Patch
Attachment 261948
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/212104
New failing tests: webgl/1.0.2/conformance/textures/texture-npot-video.html
Build Bot
Comment 51
2015-09-25 16:16:10 PDT
Created
attachment 261952
[details]
Archive of layout-test-results from ews104 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Jer Noble
Comment 52
2015-09-25 23:23:14 PDT
Created
attachment 261963
[details]
Patch Finally figured out the failing test; I wasn't handling the flipY = true case. This should finally make the EWS bots happy.
Jer Noble
Comment 53
2015-09-28 13:11:13 PDT
Created
attachment 262015
[details]
Patch
Dean Jackson
Comment 54
2015-11-13 09:52:24 PST
Comment on
attachment 262015
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=262015&action=review
> Source/WebCore/ChangeLog:9 > + when lazily-created, will emit CVPixelBuffers which are guaranteed to be compatable with
Typo: compatible
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2430 > +} > + > + > +#if !LOG_DISABLED
Extra blank line.
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2566 > + size_t height = CVPixelBufferGetHeight(m_lastOpenGLImage.get()); > + > + > +#if PLATFORM(IOS)
Another extra line.
> Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm:2597 > + GC3Denum readFramebufferTarget = GraphicsContext3D::READ_FRAMEBUFFER; > + GC3Denum readFramebufferBinding = GraphicsContext3D::READ_FRAMEBUFFER_BINDING;
I'm not sure it makes it any clearer to have these variables around that point to constants. You never change their values. It might be fine to just explicitly include them as parameters when you use them.
Dean Jackson
Comment 55
2015-11-13 09:59:29 PST
> texSubImage2D 480p 36.82 1.06 > texSubImage2D 720p 55.74 1.23
Why is 480p slower than 720p? Is it because the test doesn't converge in time? It would be nice if this patch could be automatically tested. However, given the state of our testing, I doubt we could get reliable results.
Simon Fraser (smfr)
Comment 56
2015-11-13 11:10:27 PST
We should be able to make an animometer benchmark for this.
Jer Noble
Comment 57
2015-11-13 14:41:22 PST
(In reply to
comment #55
)
> > texSubImage2D 480p 36.82 1.06 > > texSubImage2D 720p 55.74 1.23 > > Why is 480p slower than 720p? Is it because the test doesn't converge in > time? > > It would be nice if this patch could be automatically tested. However, given > the state of our testing, I doubt we could get reliable results.
See the comments following. The frame rate averages are thrown off greatly by the first few frames, which are, at that point, still using the software painting path. We need time to spin up the AVPlayerItemOutput which is capable of generating OpenGL-compatable buffers. If you use a weighted moving-average, you see that we hover right at 60 fps during steady-state.
Jer Noble
Comment 58
2015-11-18 16:40:40 PST
Created
attachment 265807
[details]
Patch for landing
WebKit Commit Bot
Comment 59
2015-11-18 19:20:16 PST
Comment on
attachment 265807
[details]
Patch for landing Clearing flags on attachment: 265807 Committed
r192607
: <
http://trac.webkit.org/changeset/192607
>
Radar WebKit Bug Importer
Comment 60
2015-11-19 13:18:56 PST
<
rdar://problem/23617699
>
Csaba Osztrogonác
Comment 61
2015-11-23 03:38:13 PST
It is already landed. Can we close the bug?
Jer Noble
Comment 62
2015-12-02 08:18:45 PST
Yep!
Klaus Reinfeld
Comment 63
2016-01-18 05:45:06 PST
Landed/fixed but not included in iOS 9.3? The performance in iOS 9.3 beta is still very bad...
Freddy Snijder
Comment 64
2016-06-24 07:42:39 PDT
Hello, Does anyone know if the patch landed in Safari iOS 9.3.2? (This there a way for me to find out, other then observation?) Thank you! -- Freddy Snijder
Klaus Reinfeld
Comment 65
2016-09-09 03:37:01 PDT
Has anyone tested this patch with iOS 10? That new 'Support a direct GPU-to-GPU copy of video textures' commit from here:
https://github.com/WebKit/webkit/commit/9abf87df6ed4f5b57e5770c993609962f6815625#diff-a8c799bd9a6162420acfd9bbb5b35d44
has huge problems on iOS 10! When using RGB (not RGBA) textures: - an iPad Pro 12.9 is directly crashing - iPad Air 1 and iPhone 6+ are getting 'INVALID FRAMEBUFFER OPERATION' errors Here a test case:
http://krpano.com/ios/bugs/ios10-webgl-video-texture-crash/
See the page source code (very simple) for details. The example also includes several workaround solutions.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug