Created attachment 459040 [details] Capture video from a webcam and show effective stream configuration Video streams captured from a webcam using the GStreamer API are limited to the raw video type only. Some webcams may offer higher resolutions or frame rates with encoded streams (like image/jpeg or video/x-h264 for example). Those resolutions and frame rates should be taken into account while choosing the webcam configuration. Let's take for example a specific webcam with those characteristics: video/x-raw, width=1280, height=720, framerate=10/1 image/jpeg, width=1280, height=720, framerate=30/1 When calling `navigator.mediaDevices.getUserMedia(constraints)` with those constraints: ``` video: { width: 1280, height: 720, frameRate: 30 } ``` The captured video stream configuration is 1280x720@10fps whereas the expected result should be 1280x720@30fps. The attached index.html file try to capture a video stream from a webcam at 1280x720@30fps and shows the characteristics of the effective video stream configuration.
Created attachment 459042 [details] Patch
Comment on attachment 459042 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=459042&action=review > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:158 > + std::string mimeType; We don't use std::string much here, we use String, AtomString, CString from WTF::. I wonder though could this one be const char*? > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:182 > + selector.stopCondition.frameRate = static_cast<double>(numerator) / static_cast<double>(denominator); gst_util_fraction_to_double() > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:198 > + if (!gst_structure_get_int(structure, "width", &width) > + || !gst_structure_get_int(structure, "height", &height) > + || !gst_structure_get_fraction(structure, "framerate", &numerator, &denominator)) One line, you can go up to 120 chars per line. > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:201 > + double frameRate = static_cast<double>(numerator) / static_cast<double>(denominator); Ditto > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:204 > + if ((width >= selector->stopCondition.width) > + && (height >= selector->stopCondition.height) > + && (frameRate >= selector->stopCondition.frameRate)) { One line and no need for so many () I think? > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:211 > + if ((width >= selector->maxWidth) > + && (height >= selector->maxHeight) > + && (frameRate >= selector->maxFrameRate)) { Ditto
Comment on attachment 459042 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=459042&action=review >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:158 >> + std::string mimeType; > > We don't use std::string much here, we use String, AtomString, CString from WTF::. I wonder though could this one be const char*? Effectively, the string memory is hold by deviceCaps which is only destroyed when the function returns, so it can perfectly be a simple const char*. Thanks for seeing it ;) >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:182 >> + selector.stopCondition.frameRate = static_cast<double>(numerator) / static_cast<double>(denominator); > > gst_util_fraction_to_double() done >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:198 >> + || !gst_structure_get_fraction(structure, "framerate", &numerator, &denominator)) > > One line, you can go up to 120 chars per line. Here it takes more than 120 chars if you try to group the conditions together on the same line. >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:201 >> + double frameRate = static_cast<double>(numerator) / static_cast<double>(denominator); > > Ditto done >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:204 >> + && (frameRate >= selector->stopCondition.frameRate)) { > > One line and no need for so many () I think? done >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:211 >> + && (frameRate >= selector->maxFrameRate)) { > > Ditto done
Created attachment 459121 [details] Patch
Created attachment 459122 [details] Patch
Comment on attachment 459122 [details] Patch LGTM, thanks!
I forgot to take into account the caps ranges in the algo. I need to correct it and I will file a new patch.
OK, good thing I haven't set commit-queue+ on this then :)
(In reply to Philippe Normand from comment #8) > OK, good thing I haven't set commit-queue+ on this then :) You can update this patch if you prefer, since it hasn't landed yet.
Created attachment 459216 [details] Patch
The latest patch corrects what was lacking previously. It has been tested on two different machines, one of it having ranges and arrays for device caps to be sure it now takes correctly into account those caps. It also corrects another issue with ghost pads that were not correctly connected inside the converter bin.
Comment on attachment 459216 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=459216&action=review > Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:71 > + auto bin = makeGStreamerBin("capsfilter caps=\"video/x-raw\" name=mimetype-filter ! decodebin3 ! videoscale ! videoconvert ! videorate drop-only=1 average-period=1 name=videorate", false); So passing true as second argument here wasn't working? Was it due to decodebin3?
Comment on attachment 459216 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=459216&action=review >> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:71 >> + auto bin = makeGStreamerBin("capsfilter caps=\"video/x-raw\" name=mimetype-filter ! decodebin3 ! videoscale ! videoconvert ! videorate drop-only=1 average-period=1 name=videorate", false); > > So passing true as second argument here wasn't working? Was it due to decodebin3? Yes, this is because of decobin3. As src pad is a "sometimes" pad the automatic ghost pad was connected to the videoscale right after and so the mimetype was correctly forced on the first capsfilter but it was useless as the bin src pad was bypassing it.
Comment on attachment 459216 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=459216&action=review >>> Source/WebCore/platform/mediastream/gstreamer/GStreamerVideoCapturer.cpp:71 >>> + auto bin = makeGStreamerBin("capsfilter caps=\"video/x-raw\" name=mimetype-filter ! decodebin3 ! videoscale ! videoconvert ! videorate drop-only=1 average-period=1 name=videorate", false); >> >> So passing true as second argument here wasn't working? Was it due to decodebin3? > > Yes, this is because of decobin3. > > As src pad is a "sometimes" pad the automatic ghost pad was connected to the videoscale right after and so the mimetype was correctly forced on the first capsfilter but it was useless as the bin src pad was bypassing it. Ok then. Maybe later we could rework this part a bit, constructing the bin manually and handling db3 pads with the pad-added signal. But if this approach works fine let's land it :)
Committed r294104 (250489@main): <https://commits.webkit.org/250489@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 459216 [details].
<rdar://problem/93179669>
Introduced regression: https://bugs.webkit.org/show_bug.cgi?id=240420
So, this kind-of introduces a new CPU perf issue. For instance my laptop webcam has these capabilities: name : Integrated_Webcam_HD (V4L2) class : Video/Source caps : image/jpeg, width=1280, height=720, framerate=30/1 image/jpeg, width=960, height=540, framerate=30/1 image/jpeg, width=848, height=480, framerate=30/1 image/jpeg, width=640, height=480, framerate=30/1 image/jpeg, width=640, height=360, framerate=30/1 video/x-raw, format=YUY2, width=640, height=480, framerate=30/1 video/x-raw, format=YUY2, width=640, height=360, framerate=30/1 video/x-raw, format=YUY2, width=424, height=240, framerate=30/1 video/x-raw, format=YUY2, width=320, height=240, framerate=30/1 video/x-raw, format=YUY2, width=320, height=180, framerate=30/1 video/x-raw, format=YUY2, width=160, height=120, framerate=30/1 image/jpeg is chosen when we look for a 640x480 resolution, so we have to decode this, in software, before then re-encoding to whatever was negotiated with WebRTC. For this case I think it'd be better to select the video/x-raw caps.
Maybe we should prefer the accelerated jpeg decoders, I'm setting this which seems to work reasonably well on slower systems with vaapi compatible intel GPU's: GST_PLUGIN_FEATURE_RANK=vajpegdec:PRIMARY,jpegparse:PRIMARY,jpegdec:SECONDARY
(In reply to James Hilliard from comment #19) > Maybe we should prefer the accelerated jpeg decoders, Maybe... Our capture handling is not great, TBH :( We should decode only when needed. > I'm setting this which > seems to work reasonably well on slower systems with vaapi compatible intel > GPU's: > GST_PLUGIN_FEATURE_RANK=vajpegdec:PRIMARY,jpegparse:PRIMARY,jpegdec:SECONDARY My GPU doesn't seem to support JPEG...
Hmm, which GPU does your system have? It's sometimes a bit tricky to get the jpeg acceleration working.