Bug 227034 - Use more SQL transactions in ResourceLoadStatisticsDatabaseStore
Summary: Use more SQL transactions in ResourceLoadStatisticsDatabaseStore
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-06-15 10:27 PDT by Chris Dumez
Modified: 2021-06-15 17:57 PDT (History)
5 users (show)

See Also:


Attachments
Patch (7.89 KB, patch)
2021-06-15 10:30 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (29.24 KB, patch)
2021-06-15 11:03 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (29.71 KB, patch)
2021-06-15 12:21 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (35.77 KB, patch)
2021-06-15 12:54 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (35.80 KB, patch)
2021-06-15 13:01 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (53.03 KB, patch)
2021-06-15 13:30 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (52.79 KB, patch)
2021-06-15 13:31 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (53.53 KB, patch)
2021-06-15 16:16 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2021-06-15 10:27:15 PDT
Use more SQL transactions in ResourceLoadStatisticsDatabaseStore, for performance.
Comment 1 Chris Dumez 2021-06-15 10:30:06 PDT
Created attachment 431454 [details]
Patch
Comment 2 Chris Dumez 2021-06-15 11:03:12 PDT
Created attachment 431457 [details]
Patch
Comment 3 Chris Dumez 2021-06-15 12:21:47 PDT
Created attachment 431465 [details]
Patch
Comment 4 Chris Dumez 2021-06-15 12:54:14 PDT
Created attachment 431468 [details]
Patch
Comment 5 Chris Dumez 2021-06-15 13:01:29 PDT
Created attachment 431470 [details]
Patch
Comment 6 Chris Dumez 2021-06-15 13:30:44 PDT
Created attachment 431474 [details]
Patch
Comment 7 Chris Dumez 2021-06-15 13:31:54 PDT
Created attachment 431475 [details]
Patch
Comment 8 Kate Cheney 2021-06-15 15:49:09 PDT
Comment on attachment 431475 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=431475&action=review

I think this is a good idea. I'm curious about the pattern. Do we want all functions that interact with the database to use transactions, or is it as-needed for now to improve performance?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:636
> +        if (!transaction.inProgress())

It doesn't look like the code does anything if you call begin() on a transaction already in progress, you could probably remove this if you wanted.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1039
> +String ResourceLoadStatisticsDatabaseStore::ensureAndMakeDomainList(const SQLiteTransaction&, const HashSet<RegistrableDomain>& domainList)

Did you mean to have ASSERT(m_database.transactionInProgress()); here? Otherwise I am not sure why we should include SQLiteTransaction in this function, unless it is to avoid unused parameter errors in release builds, but in that case I think it can be done in insertDomainRelationshipList.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1158
>          updateDataRecordsRemoved(other.registrableDomain, other.dataRecordsRemoved);

Is there a pattern for which functions ASSERT transactions in progress? I notice a few functions in here that don't have them, like updateDataRecordsRemoved and setGrandfathered.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1457
> +                transaction.begin();

Ditto, I think you can just call transaction.begin() with no consequences if it's already in progress.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1761
> +                if (!transaction.inProgress())

Ditto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1783
> +            if (!transaction.inProgress())

Ditto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2248
>      auto result = ensureResourceStatisticsForRegistrableDomain(subresourceDomain);

I wonder if we want to pass transaction to ensureResourceStatisticsForRegistrableDomain and have a check in there for a transaction in progress in the function to match your existing pattern?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3172
>      updateOperatingDatesParameters();

updateOperatingDatesParameters() also interacts with the database, maybe we should move transaction.commit(); below that and pass transaction into the updateOperatingDatesParameters() function.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3229
>      updateOperatingDatesParameters();

Ditto.
Comment 9 Chris Dumez 2021-06-15 16:02:05 PDT
(In reply to katherine_cheney from comment #8)
> Comment on attachment 431475 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=431475&action=review
> 
> I think this is a good idea. I'm curious about the pattern. Do we want all
> functions that interact with the database to use transactions, or is it
> as-needed for now to improve performance?

The pattern is that you should be using a transaction if you're doing several consecutive *write* statements.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:636
> > +        if (!transaction.inProgress())
> 
> It doesn't look like the code does anything if you call begin() on a
> transaction already in progress, you could probably remove this if you
> wanted.

I know but I felt this looked clearer and repeatedly calling begin(). Also transaction.inProgress() is very cheap since it is inlined.

> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1039
> > +String ResourceLoadStatisticsDatabaseStore::ensureAndMakeDomainList(const SQLiteTransaction&, const HashSet<RegistrableDomain>& domainList)
> 
> Did you mean to have ASSERT(m_database.transactionInProgress()); here?
> Otherwise I am not sure why we should include SQLiteTransaction in this
> function, unless it is to avoid unused parameter errors in release builds,
> but in that case I think it can be done in insertDomainRelationshipList.

Yes, we should likely ASSERT(m_database.transactionInProgress());. I'll add it.

> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1158
> >          updateDataRecordsRemoved(other.registrableDomain, other.dataRecordsRemoved);
> 
> Is there a pattern for which functions ASSERT transactions in progress? I
> notice a few functions in here that don't have them, like
> updateDataRecordsRemoved and setGrandfathered.

I'll double check those two. I usually ASSERT there is a transaction is in progress if more than one write statement happens in a function and the function doesn't start a transaction (because the caller already has one). In such cases, I also add a SQLiteTransaction parameter to make it clear the caller needs to start a transaction.

> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1457
> > +                transaction.begin();
> 
> Ditto, I think you can just call transaction.begin() with no consequences if
> it's already in progress.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1761
> > +                if (!transaction.inProgress())
> 
> Ditto.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1783
> > +            if (!transaction.inProgress())
> 
> Ditto.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2248
> >      auto result = ensureResourceStatisticsForRegistrableDomain(subresourceDomain);
> 
> I wonder if we want to pass transaction to
> ensureResourceStatisticsForRegistrableDomain and have a check in there for a
> transaction in progress in the function to match your existing pattern?

ensureResourceStatisticsForRegistrableDomain() only does a single write statement so it doesn't technically require a transaction. Also, some of the call sites are not writers but readers so they don't currently have a transaction (even after my patch).

> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3172
> >      updateOperatingDatesParameters();
> 
> updateOperatingDatesParameters() also interacts with the database, maybe we
> should move transaction.commit(); below that and pass transaction into the
> updateOperatingDatesParameters() function.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3229
> >      updateOperatingDatesParameters();
> 
> Ditto.

Ok, I'll double check.
Comment 10 Chris Dumez 2021-06-15 16:07:00 PDT
(In reply to Chris Dumez from comment #9)
> (In reply to katherine_cheney from comment #8)
> > Comment on attachment 431475 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=431475&action=review
> > 
> > I think this is a good idea. I'm curious about the pattern. Do we want all
> > functions that interact with the database to use transactions, or is it
> > as-needed for now to improve performance?
> 
> The pattern is that you should be using a transaction if you're doing
> several consecutive *write* statements.
> 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:636
> > > +        if (!transaction.inProgress())
> > 
> > It doesn't look like the code does anything if you call begin() on a
> > transaction already in progress, you could probably remove this if you
> > wanted.
> 
> I know but I felt this looked clearer and repeatedly calling begin(). Also
> transaction.inProgress() is very cheap since it is inlined.
> 
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1039
> > > +String ResourceLoadStatisticsDatabaseStore::ensureAndMakeDomainList(const SQLiteTransaction&, const HashSet<RegistrableDomain>& domainList)
> > 
> > Did you mean to have ASSERT(m_database.transactionInProgress()); here?
> > Otherwise I am not sure why we should include SQLiteTransaction in this
> > function, unless it is to avoid unused parameter errors in release builds,
> > but in that case I think it can be done in insertDomainRelationshipList.
> 
> Yes, we should likely ASSERT(m_database.transactionInProgress());. I'll add
> it.
> 
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1158
> > >          updateDataRecordsRemoved(other.registrableDomain, other.dataRecordsRemoved);
> > 
> > Is there a pattern for which functions ASSERT transactions in progress? I
> > notice a few functions in here that don't have them, like
> > updateDataRecordsRemoved and setGrandfathered.
> 
> I'll double check those two. I usually ASSERT there is a transaction is in
> progress if more than one write statement happens in a function and the
> function doesn't start a transaction (because the caller already has one).
> In such cases, I also add a SQLiteTransaction parameter to make it clear the
> caller needs to start a transaction.
> 
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1457
> > > +                transaction.begin();
> > 
> > Ditto, I think you can just call transaction.begin() with no consequences if
> > it's already in progress.
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1761
> > > +                if (!transaction.inProgress())
> > 
> > Ditto.
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1783
> > > +            if (!transaction.inProgress())
> > 
> > Ditto.
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2248
> > >      auto result = ensureResourceStatisticsForRegistrableDomain(subresourceDomain);
> > 
> > I wonder if we want to pass transaction to
> > ensureResourceStatisticsForRegistrableDomain and have a check in there for a
> > transaction in progress in the function to match your existing pattern?
> 
> ensureResourceStatisticsForRegistrableDomain() only does a single write
> statement so it doesn't technically require a transaction. Also, some of the
> call sites are not writers but readers so they don't currently have a
> transaction (even after my patch).
> 
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3172
> > >      updateOperatingDatesParameters();
> > 
> > updateOperatingDatesParameters() also interacts with the database, maybe we
> > should move transaction.commit(); below that and pass transaction into the
> > updateOperatingDatesParameters() function.
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3229
> > >      updateOperatingDatesParameters();
> > 
> > Ditto.
> 
> Ok, I'll double check.

updateOperatingDatesParameters() only seems to be doing SELECT statements, not write statements. That said, it probably doesn't hurt to include them in the transaction.
Comment 11 Chris Dumez 2021-06-15 16:12:20 PDT
(In reply to katherine_cheney from comment #8)
> Comment on attachment 431475 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=431475&action=review
> 
> I think this is a good idea. I'm curious about the pattern. Do we want all
> functions that interact with the database to use transactions, or is it
> as-needed for now to improve performance?
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:636
> > +        if (!transaction.inProgress())
> 
> It doesn't look like the code does anything if you call begin() on a
> transaction already in progress, you could probably remove this if you
> wanted.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1039
> > +String ResourceLoadStatisticsDatabaseStore::ensureAndMakeDomainList(const SQLiteTransaction&, const HashSet<RegistrableDomain>& domainList)
> 
> Did you mean to have ASSERT(m_database.transactionInProgress()); here?
> Otherwise I am not sure why we should include SQLiteTransaction in this
> function, unless it is to avoid unused parameter errors in release builds,
> but in that case I think it can be done in insertDomainRelationshipList.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1158
> >          updateDataRecordsRemoved(other.registrableDomain, other.dataRecordsRemoved);
> 
> Is there a pattern for which functions ASSERT transactions in progress? I
> notice a few functions in here that don't have them, like
> updateDataRecordsRemoved and setGrandfathered.

updateDataRecordsRemoved has a single write statement, therefore it doesn't require a transaction and I don't think we should add a transaction parameter. While it wouldn't hurt currently since there is a single call site and that call site does have a transaction, you could imagine another call site that calls it without a transaction and there would be no perf issue since there is a single write statement.

setGrandfathered() definitely needs a transaction parameter though, I'll add it.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1457
> > +                transaction.begin();
> 
> Ditto, I think you can just call transaction.begin() with no consequences if
> it's already in progress.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1761
> > +                if (!transaction.inProgress())
> 
> Ditto.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1783
> > +            if (!transaction.inProgress())
> 
> Ditto.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:2248
> >      auto result = ensureResourceStatisticsForRegistrableDomain(subresourceDomain);
> 
> I wonder if we want to pass transaction to
> ensureResourceStatisticsForRegistrableDomain and have a check in there for a
> transaction in progress in the function to match your existing pattern?
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3172
> >      updateOperatingDatesParameters();
> 
> updateOperatingDatesParameters() also interacts with the database, maybe we
> should move transaction.commit(); below that and pass transaction into the
> updateOperatingDatesParameters() function.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:3229
> >      updateOperatingDatesParameters();
> 
> Ditto.
Comment 12 Chris Dumez 2021-06-15 16:16:29 PDT
Created attachment 431491 [details]
Patch
Comment 13 Chris Dumez 2021-06-15 16:19:02 PDT
I still check inProgress() before calling begin() in a few cases where we don't unconditionally call begin(). I feel it is clearer as I commented earlier but if you feel strongly about it, let me know and I'll make the change.
Comment 14 Kate Cheney 2021-06-15 16:22:05 PDT
Comment on attachment 431491 [details]
Patch

I don't feel strongly, I think your logic makes sense. Thanks for explaining.
Comment 15 EWS 2021-06-15 17:56:11 PDT
Committed r278916 (238847@main): <https://commits.webkit.org/238847@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 431491 [details].
Comment 16 Radar WebKit Bug Importer 2021-06-15 17:57:18 PDT
<rdar://problem/79372370>