WebKit Bugzilla
NEW
260709
[GTK][WPE] Add Whisper.cpp as a submodule
https://bugs.webkit.org/show_bug.cgi?id=260709
ChangSeok Oh
Reported
2023-08-25 02:10:58 PDT
We add Whisper.cpp, a speech recognition library, to the WebKitGTK/WPE builds.
Michael Catanzaro
Comment 1
2023-08-25 05:48:59 PDT
Is there a really good reason this cannot be a system dependency? Anything that prevents it from being packaged by distros? We should reduce the number of bundled third-party sources, not add more.
Philippe Normand
Comment 2
2023-08-25 06:28:30 PDT
(In reply to Michael Catanzaro from comment #1)
> Is there a really good reason this cannot be a system dependency? Anything
> that prevents it from being packaged by distros? We should reduce the number
> of bundled third-party sources, not add more.
I don't think there are any distro packages for this. OTOH, it's two C files...
Michael Catanzaro
Comment 3
2023-08-25 06:57:45 PDT
> OTOH, it's two C files...
Ah, I would just copy the needed files then. A submodule would pull in a big repo (https://github.com/ggerganov/whisper.cpp).
Philippe Normand
Comment 4
2023-08-25 07:16:42 PDT
(In reply to Michael Catanzaro from comment #3)
> > OTOH, it's two C files...
>
> Ah, I would just copy the needed files then. A submodule would pull in a big
> repo (https://github.com/ggerganov/whisper.cpp).
Is that a big issue?
Philippe Normand
Comment 5
2023-08-25 07:17:08 PDT
A submodule would be easier to sync later on.
ChangSeok Oh
Comment 6
2023-08-25 09:45:19 PDT
(In reply to Michael Catanzaro from comment #3)
> > OTOH, it's two C files...
>
> Ah, I would just copy the needed files then. A submodule would pull in a big
> repo (https://github.com/ggerganov/whisper.cpp).
Whisper.cpp APIs often change. I am not sure if we can track and control the used version by copying files.
ChangSeok Oh
Comment 7
2023-08-25 10:00:57 PDT
(In reply to Michael Catanzaro from comment #1)
> Is there a really good reason this cannot be a system dependency? Anything
> that prevents it from being packaged by distros? We should reduce the number
> of bundled third-party sources, not add more.
I attempted to bundle whisper.cpp into the SDK. In that case, we need to include GB-sized model files as well. I have no strong preference on how we bring whisper.cpp into WebKit, but adding it as a submodule would be cheaper in terms of maintenance burden (e.g., SDK size). One concern is integration with the WebKit tests. Can we make the test bots download the whisper.cpp repo and the related language model files automatically? If not, speech recognition for WebKitGTK/WPE would be disabled by default in the build and only enabled manually on demand, no?
Philippe Normand
Comment 8
2023-08-25 10:53:29 PDT
There's a mock test infrastructure for this IIRC. If we want to test the non-mock code, I guess we could teach cmake to download models, when in developer mode.
Michael Catanzaro
Comment 9
2023-08-25 11:14:46 PDT
(In reply to ChangSeok Oh from comment #6)
> Whisper.cpp APIs often change. I am not sure if we can track and control the
> used version by copying files.
Well then I suppose it cannot be a system dependency. :( It's generally best to not depend on such software at all and avoid the trouble, but if this is really the only suitable option, then OK....
(In reply to ChangSeok Oh from comment #7)
> I attempted to bundle whisper.cpp into the SDK. In that case, we need to
> enclose GB-sized model files as well. I have no strong preference on how we
> bring whisper.cpp into webkit, but adding it as a submodule would be cheaper
> in terms of maintenance burden (e.g., SDK size).
I don't understand. If these model files are required, then they have to be shipped no matter what, right? With the WebKit SDK, the model files would be part of the application. With the GNOME SDK, WebKit itself is part of the SDK, so it becomes part of the SDK. The WebKit SDK is only used by WebKit developers. Users will be using the GNOME SDK. So it winds up in the SDK in the end, yes?
> One concern is integration with webkit tests. Can we make the test bots
> download whisper.cpp repo and related language models files automatically?
> If not, speech recognition for webkitgtk/wpe is always disabled for the
> build, and manually enabled on demand, no?
We probably don't want to make test bots do anything special like this, because that wouldn't reflect what distros do. There is no way distros are going to deal with downloading whisper.cpp themselves as part of the WebKit build. It needs to be either (a) a system dependency (preinstalled on the bots, packaged separately by distros), or else (b) included in WebKit tarballs generated by Tools/make-dist.
Philippe Normand
Comment 10
2023-08-25 11:36:11 PDT
The models are potentially big in size, don't necessarily have the same license as the whisper lib, and there's one per locale (AFAIK), so I don't think they can all be shipped by the SDK; that doesn't seem realistic to me... For Flatpak apps, maybe they could ship as Flatpak extensions though.
Carlos Alberto Lopez Perez
Comment 11
2023-08-25 13:26:03 PDT
Using git submodules or shipping it in the Flatpak SDK doesn't solve the fundamental problem:
- How do the end users who use release tarballs get this software?
  * If it is via us, then it has to be copied inside the WebKit source directory and we need to ensure it ships in the release tarballs.
  * If it is via a third party, then we rely on the distros to package and ship it.
Using git submodules is not going to be helpful if the distros don't ship it.
ChangSeok Oh
Comment 12
2023-08-25 15:14:12 PDT
(In reply to Michael Catanzaro from comment #9)
> (In reply to ChangSeok Oh from comment #7)
> > I attempted to bundle whisper.cpp into the SDK. In that case, we need to
> > enclose GB-sized model files as well. I have no strong preference on how we
> > bring whisper.cpp into webkit, but adding it as a submodule would be cheaper
> > in terms of maintenance burden (e.g., SDK size).
>
> I don't understand. If these model files are required, then they have to be
> shipped no matter what, right? With the WebKit SDK, the model files would be
> part of the application. With the GNOME SDK, WebKit itself is part of the
> SDK, so it becomes part of the SDK.
I may have misled you. To be clear, the WebKit SDK does not necessarily need to contain model files. They are runtime requirements of whisper.cpp. My initial intention was to ship the speech recognition code (including two whisper.cpp source files) in the WebKitGTK tarball and let users download model files as needed. If the user is the GNOME SDK, the GNOME distributor puts the model files into their SDK and indicates the location of the model file with a given environment variable. Having whisper.cpp as a system package would be best, but we don't have that package. This is a temporary approach until we do.
> The WebKit SDK is only used by WebKit developers. Users will be using the
> GNOME SDK. So it winds up in the SDK in the end, yes?
What about shipping part of the model files in the WebKit SDK, then? For instance, the size of ggml-base.en.bin is 141MB. A smaller one is 75MB.
https://github.com/ggerganov/whisper.cpp#memory-usage
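For illustration only, the environment-variable lookup mentioned in this comment could look roughly like the sketch below. The variable name `WEBKIT_WHISPER_MODEL_PATH` and the helper `resolveModelPath` are hypothetical and are not taken from the actual patch; the sketch uses only the C++17 standard library and does not call into whisper.cpp.

```cpp
#include <cassert>
#include <cstdlib>
#include <filesystem>
#include <optional>

// Hypothetical variable name, chosen for this sketch only.
constexpr const char* kModelPathVariable = "WEBKIT_WHISPER_MODEL_PATH";

// Returns the model path if the variable is set, non-empty, and names an
// existing regular file; otherwise returns std::nullopt so that the caller
// can leave speech recognition disabled.
std::optional<std::filesystem::path> resolveModelPath(const char* variableName = kModelPathVariable)
{
    assert(variableName);
    const char* value = std::getenv(variableName);
    if (!value || !*value)
        return std::nullopt;

    std::filesystem::path candidate(value);
    std::error_code error;
    // The error_code overload does not throw; it reports false on failure.
    if (!std::filesystem::is_regular_file(candidate, error))
        return std::nullopt;
    return candidate;
}
```

With this shape, a distributor would only need to export the variable to point at whichever model it ships (for example a `ggml-base.en.bin` file), and a build without any model would simply report that none is available.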
ChangSeok Oh
Comment 13
2023-08-28 17:21:25 PDT
Pull request: https://github.com/WebKit/WebKit/pull/17158