Bug 208612

Summary: Resource Load Statistics: Keep a count of resource loads per day
Product: WebKit Reporter: mmokary
Component: WebKit Misc.Assignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: bfulgham, cdumez, ews-watchlist, japhet, katherine_cheney, webkit-bug-importer, wilander
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch mmokary: review?

Description mmokary 2020-03-04 15:54:17 PST
Keep a count of the number of third party resource loads occurring per day.
Comment 1 Radar WebKit Bug Importer 2020-03-04 15:55:09 PST
<rdar://problem/60060905>
Comment 2 mmokary 2020-03-04 16:32:48 PST
Created attachment 392506 [details]
Patch
Comment 3 Kate Cheney 2020-03-04 17:16:32 PST
Comment on attachment 392506 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=392506&action=review

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:86
> +constexpr auto insertResourceLoadCountRowQuery = "INSERT OR IGNORE INTO ResourceLoadCount VALUES(?, 0)"_s;

INSERT OR IGNORE calls might require creating a unique index. Maybe not though because ResourceLoadCount has a primary key. Did you test that repeat attempts don't create a new row?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:356
> +            if (strncmp(table, "ResourceLoadCount", strlen(table))) {

You can just use == here.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:359
> +            } else if (!createResourceLoadCountTable()) {

Hmm, this seems strange. createResourceLoadCountTable() will be called for every other table iteration besides ResourceLoadCount. It should probably only be called once.

I think this could be re-written in a way to  make it easier to add new tables in the future. Maybe maintain a list of missing tables, and add them at the end?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:602
> +

You can use resetStatement() here if you want. One less line of code.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:620
> +    ASSERT_UNUSED(resetResult, resetResult == SQLITE_OK);

ditto

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:645
> +

ditto

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1346
> +        buildList(WTF::IteratorRange<Vector<unsigned>::iterator>(julianDaysInRange.begin(), julianDaysInRange.end())), ") ORDER BY day"));

I think theres a static buildList function already in the class you can use.

> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2862
>      

Maybe I'm out of the loop, but what does "Transient stats" mean here?
Comment 4 Chris Dumez 2020-03-04 17:23:54 PST
Comment on attachment 392506 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=392506&action=review

> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1307
> +    postTask([this, numberOfDays = WTFMove(numberOfDays), completionHandler = WTFMove(completionHandler)]() mutable {

No need to WTFMove() an unsigned

> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1317
> +            postTaskReply([resourceLoadCountPerDay = WTFMove(resourceLoadCountPerDay), completionHandler = WTFMove(completionHandler)]() mutable {

ditto

> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1318
> +                completionHandler(WTFMove(resourceLoadCountPerDay));

ditto

> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2756
> +    // Transient stats

WebKit comments need to end with a period.

> Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2859
> +    // Transient stats

WebKit comments need to end with a period.

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:552
> +    completionHandler();

Dont you need to pass a parameter here?

> Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:887
> +    if (!canSendMessage()) {

Should not be needed.

> Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.cpp:1526
> +            process->getResourceLoadCountPerDayForPastDays(m_sessionID, numberOfDays, WTFMove(completionHandler));

use after move of completionHandler if there is more than one process pool with a network process.
Comment 5 mmokary 2020-03-04 17:56:57 PST
(In reply to katherine_cheney from comment #3)
> Comment on attachment 392506 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=392506&action=review
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:86
> > +constexpr auto insertResourceLoadCountRowQuery = "INSERT OR IGNORE INTO ResourceLoadCount VALUES(?, 0)"_s;
> 
> INSERT OR IGNORE calls might require creating a unique index. Maybe not
> though because ResourceLoadCount has a primary key. Did you test that repeat
> attempts don't create a new row?

The primary key acts as a unique index in this (ON CONFLICT IGNORE) case, so this doesn't create new rows.

> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:356
> > +            if (strncmp(table, "ResourceLoadCount", strlen(table))) {
> 
> You can just use == here.

table is a char*[], it doesn't like == 

> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:359
> > +            } else if (!createResourceLoadCountTable()) {
> 
> Hmm, this seems strange. createResourceLoadCountTable() will be called for
> every other table iteration besides ResourceLoadCount. It should probably
> only be called once.
> 
> I think this could be re-written in a way to  make it easier to add new
> tables in the future. Maybe maintain a list of missing tables, and add them
> at the end?
> 

Since the strncmp is effectively a != (rather than ==) this will only be reached when table is "ResourceLoadCount". I agree this doesn't lend well to adding new similar tables in the future, and reads poorly.

> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:602
> > +
> 
> You can use resetStatement() here if you want. One less line of code.
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:620
> > +    ASSERT_UNUSED(resetResult, resetResult == SQLITE_OK);
> 
> ditto
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:645
> > +
> 
> ditto
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1346
> > +        buildList(WTF::IteratorRange<Vector<unsigned>::iterator>(julianDaysInRange.begin(), julianDaysInRange.end())), ") ORDER BY day"));
> 
> I think theres a static buildList function already in the class you can use.
> 

This uses the static buildList function

> > Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2862
> >      
> 
> Maybe I'm out of the loop, but what does "Transient stats" mean here?

It was just a name I came up with, since the field frequently restarts from 0 and isn't persisted. I'll rename it to something more clear.
Comment 6 mmokary 2020-03-04 19:30:07 PST
(In reply to Chris Dumez from comment #4)
> Comment on attachment 392506 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=392506&action=review
> 
> > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1307
> > +    postTask([this, numberOfDays = WTFMove(numberOfDays), completionHandler = WTFMove(completionHandler)]() mutable {
> 
> No need to WTFMove() an unsigned
> 
> > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1317
> > +            postTaskReply([resourceLoadCountPerDay = WTFMove(resourceLoadCountPerDay), completionHandler = WTFMove(completionHandler)]() mutable {
> 
> ditto

In this case it's a Vector<unsigned>

> 
> > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1318
> > +                completionHandler(WTFMove(resourceLoadCountPerDay));
> 
> ditto

Same here, Vector<unsigned>

> 
> > Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2756
> > +    // Transient stats
> 
> WebKit comments need to end with a period.
> 
> > Source/WebKit/Shared/WebCoreArgumentCoders.cpp:2859
> > +    // Transient stats
> 
> WebKit comments need to end with a period.
> 
> > Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:552
> > +    completionHandler();
> 
> Dont you need to pass a parameter here?
> 
> > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:887
> > +    if (!canSendMessage()) {
> 
> Should not be needed.

All adjacent methods do this. If it's not needed it should be a separate task to update them all IMO.

> 
> > Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.cpp:1526
> > +            process->getResourceLoadCountPerDayForPastDays(m_sessionID, numberOfDays, WTFMove(completionHandler));
> 
> use after move of completionHandler if there is more than one process pool
> with a network process.

Changing this to RELEASE_ASSERT(processPool().size() == 1)
Comment 7 mmokary 2020-03-04 19:33:15 PST
Created attachment 392530 [details]
Patch
Comment 8 Kate Cheney 2020-03-05 10:45:28 PST
(In reply to mmokary from comment #5)
> (In reply to katherine_cheney from comment #3)
> > Comment on attachment 392506 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=392506&action=review
> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:86
> > > +constexpr auto insertResourceLoadCountRowQuery = "INSERT OR IGNORE INTO ResourceLoadCount VALUES(?, 0)"_s;
> > 
> > INSERT OR IGNORE calls might require creating a unique index. Maybe not
> > though because ResourceLoadCount has a primary key. Did you test that repeat
> > attempts don't create a new row?
> 
> The primary key acts as a unique index in this (ON CONFLICT IGNORE) case, so
> this doesn't create new rows.
> 

Great! I assumed so because of the test case, but wanted to double check :) 

> > 
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:356
> > > +            if (strncmp(table, "ResourceLoadCount", strlen(table))) {
> > 
> > You can just use == here.
> 
> table is a char*[], it doesn't like == 
> 
> > 

You can probably use String(table)

> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:359
> > > +            } else if (!createResourceLoadCountTable()) {
> > 
> > Hmm, this seems strange. createResourceLoadCountTable() will be called for
> > every other table iteration besides ResourceLoadCount. It should probably
> > only be called once.
> > 
> > I think this could be re-written in a way to  make it easier to add new
> > tables in the future. Maybe maintain a list of missing tables, and add them
> > at the end?
> > 
> 
> Since the strncmp is effectively a != (rather than ==) this will only be
> reached when table is "ResourceLoadCount". I agree this doesn't lend well to
> adding new similar tables in the future, and reads poorly.
> 

Ah okay, this makes more sense now. I do think the strncmp makes this code harder to understand though. IMO I think converting the char* to a WTFString then doing a if (String(table) != "ResourceLoadCount"_s) would read better.

Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1346
> > > +        buildList(WTF::IteratorRange<Vector<unsigned>::iterator>(julianDaysInRange.begin(), julianDaysInRange.end())), ") ORDER BY day"));
> > 
> > I think theres a static buildList function already in the class you can use.
> > 
> 
> This uses the static buildList function

Awesome! Not sure what I thought I saw the first time around...
Comment 9 mmokary 2020-03-05 14:30:09 PST
Created attachment 392627 [details]
Patch