
Profile Performance of LocalStorage-based and client-side cookie-based User Preference Storage
Closed, Resolved · Public

Assigned To
Authored By
NBaca-WMF
Jan 24 2023, 3:49 PM
Referenced Files
F36511052: after-mobile-script.png
Jan 26 2023, 1:41 AM
F36511050: before-mobile-script.png
Jan 26 2023, 1:41 AM
F36487166: after-site-speed-io.png
Jan 25 2023, 12:30 AM
F36487163: before-site-speed-io.png
Jan 25 2023, 12:30 AM
F36487129: after-task.png
Jan 24 2023, 11:39 PM
F36487126: before-task.png
Jan 24 2023, 11:39 PM
F36486373: profile-after.png
Jan 24 2023, 8:13 PM
F36486371: profile-before.png
Jan 24 2023, 8:13 PM

Description

As part of our fix for https://phabricator.wikimedia.org/T321498 , we introduced a LocalStorage-based user preferences persistence mechanism.
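As a rough illustration of the general pattern (the storage key and class name below are hypothetical, not taken from the patch), such a mechanism boils down to a small inline script that reads the stored preference before first paint and applies it to the document:

```
// Illustrative sketch only: the storage key and class name are hypothetical,
// not taken from the patch. The real inline script may differ.
const KEY = 'example-client-pref';

try {
	// Read the persisted preference synchronously, before first paint.
	const value = window.localStorage.getItem( KEY );
	if ( value !== null ) {
		// Toggle a class on <html> so CSS can honour the preference on the
		// very first render, avoiding a flash of the default state.
		document.documentElement.classList.add( `${ KEY }-${ value }` );
	}
} catch ( e ) {
	// localStorage access can throw (e.g. some privacy modes); fall back to defaults.
}
```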

As noted in prior discussions, until we discuss this further, this is meant to be an immediate-term fix for the needs associated with this single preference setting. As such, our goal is to understand and mitigate risk associated with this fix until we have a more sustainable longer-term solution.

With this in mind, we should profile the performance of the relevant patchset for this fix (see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/881728), with preferences enabled and disabled, to better understand its performance implications.

Coming out of this analysis, we should be able to say "this is not so bad because metrics XYZ are impacted in ways ABC", or "this is not acceptable given impact DEF".

Acceptance Criteria

Per the performance team's recommendation at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/882758/2#message-c05fce98d86aeced6252440b395828fd982caf91:

  • Profile changes locally with a 6x CPU throttle to get a rough idea of impact
  • Enable feature flag for small wikis (e.g. mediawiki.org, cawiki) and look at impact of synthetic tests
  • Deploy everywhere and measure impact of navigation timing dashboard
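The first criterion above (the 6x CPU throttle) can also be reproduced outside DevTools. The following is only a sketch, not part of the methodology actually used here: it applies the same throttle programmatically via Puppeteer and the Chrome DevTools Protocol, with a placeholder URL for a local MediaWiki instance.

```
// Sketch only: apply a 6x CPU throttle and record a trace for later inspection.
import puppeteer from 'puppeteer';

async function profileWithThrottle( url: string ): Promise<void> {
	const browser = await puppeteer.launch();
	const page = await browser.newPage();

	// Same 6x CPU slowdown as in the manual DevTools profiling.
	const client = await page.target().createCDPSession();
	await client.send( 'Emulation.setCPUThrottlingRate', { rate: 6 } );

	// Record a trace so the inline script task can be inspected afterwards
	// (the trace can be loaded into the DevTools Performance panel).
	await page.tracing.start( { path: 'trace.json' } );
	await page.goto( url, { waitUntil: 'networkidle0' } );
	await page.tracing.stop();

	await browser.close();
}

profileWithThrottle( 'http://localhost:8080/wiki/Test' ).catch( console.error );
```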

Event Timeline

Profiling the local storage strategy

I profiled the local storage strategy (patches 881728, 845667) on my local machine (MacBook Pro M1) using a 6x CPU throttle in incognito mode to get a very rough idea of its impact.

I used performance.mark before the conditional if ( $config->get( MainConfigNames::SkinClientPreferences ) && $isAnon ) so that I could identify the inline script task more easily. Then I profiled a page load with $wgSkinClientPreferences = false and compared it to a page load with $wgSkinClientPreferences = true. In each profile I then identified the main-thread task responsible for executing the inline script.
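The instrumentation amounts to roughly the following sketch (the mark and measure names are illustrative, not the exact ones used locally):

```
// Rough sketch of the instrumentation; the mark/measure names are placeholders.
performance.mark( 'client-prefs-inline-start' );

// ... the inline preference script emitted by the skin runs here ...

performance.mark( 'client-prefs-inline-end' );
performance.measure(
	'client-prefs-inline',
	'client-prefs-inline-start',
	'client-prefs-inline-end'
);
// The named measure shows up in the Timings track of the Performance panel,
// which makes the corresponding main-thread task easy to locate.
```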

$wgSkinClientPreferences = false

profile-before.png (2×5 px, 1 MB)

Total task time: 9.83 ms. No forced synchronous layouts present in the task.

$wgSkinClientPreferences = true

profile-after.png (2×5 px, 1 MB)

Total task time: 11.53 ms. No forced synchronous layouts present in the task.

Differences observed

Although the local storage strategy's inline script task took about 1.7 ms longer, it is still well under the 50 ms cut-off for a task to be considered a long task.
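For context, the 50 ms cut-off comes from the Long Tasks API's definition of a long task; as a rough sketch, such tasks can also be surfaced in the page itself with a PerformanceObserver:

```
// Sketch: surface main-thread tasks that exceed the 50 ms long-task threshold.
// Only tasks longer than 50 ms produce 'longtask' entries.
const longTaskObserver = new PerformanceObserver( ( list ) => {
	for ( const entry of list.getEntries() ) {
		console.log( `Long task: ${ entry.duration.toFixed( 1 ) } ms at ${ entry.startTime.toFixed( 1 ) } ms` );
	}
} );
longTaskObserver.observe( { type: 'longtask', buffered: true } );
```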

It's important to note that this was only measured on my (powerful) machine with a heavy throttle to simulate lower-end devices. It's only meant to serve as a very rough approximation. I expect these results to look different when run on other devices.

However, I did not identify anything in the profile that blocks the next step of this process - "Enable feature flag for small wikis (e.g. mediawiki.org, cawiki) and look at impact of synthetic tests".

Profiling the client-side cookie strategy

I profiled the client-side cookie strategy (patches 881728, 845667) on my local machine (MacBook Pro M1) using a 6x CPU throttle in incognito mode to get a very rough idea of its impact.

I used performance.mark before the conditional if ( $config->get( MainConfigNames::ResourceLoaderClientPreferences ) && $isAnon ) so that I could identify the inline script task more easily. Then I profiled a page load with $wgResourceLoaderClientPreferences = false and compared it to a page load with $wgResourceLoaderClientPreferences = true. In each profile I then identified the main-thread task responsible for executing the inline script.
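For comparison with the localStorage sketch above, the client-side cookie strategy amounts to parsing the preference out of document.cookie instead. Again, this is only a sketch with a hypothetical cookie name, not code from the patch:

```
// Illustrative sketch only: the cookie name and class are hypothetical,
// not taken from the patch. The real inline script may differ.
const COOKIE_NAME = 'example-client-pref';

// document.cookie is a single "name=value; other=value" string, so the
// preference has to be parsed out of it on every page view.
const pair = document.cookie
	.split( '; ' )
	.find( ( item ) => item.startsWith( `${ COOKIE_NAME }=` ) );

if ( pair ) {
	const value = decodeURIComponent( pair.slice( COOKIE_NAME.length + 1 ) );
	document.documentElement.classList.add( `${ COOKIE_NAME }-${ value }` );
}
```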

$wgResourceLoaderClientPreferences = false

before-task.png (2×3 px, 599 KB)

Total task time: 8.67 ms. No forced synchronous layouts present in the task.

$wgResourceLoaderClientPreferences = true

after-task.png (2×3 px, 584 KB)

Total task time: 10.56 ms. No forced synchronous layouts present in the task.

Differences observed

Although the cookie strategy's inline script task took about 1.89 ms longer, it is still well under the 50 ms cut-off for a task to be considered a long task.

It's important to note that this was only measured on my (powerful) machine with a heavy throttle to simulate lower-end devices. It's only meant to serve as a very rough approximation. I expect these results to look different when run on other devices.

However, I did not identify anything in the profile that blocks the next step of this process - "Enable feature flag for small wikis (e.g. mediawiki.org, cawiki) and look at impact of synthetic tests".

Jdlrobson renamed this task from Profile Performance of LocalStorage-based User Preference Storage to Profile Performance of LocalStorage-based and cookie-based User Preference Storage. Jan 25 2023, 12:20 AM
Jdlrobson renamed this task from Profile Performance of LocalStorage-based and cookie-based User Preference Storage to Profile Performance of LocalStorage-based and client-side cookie-based User Preference Storage.
Jdlrobson moved this task from Doing to Ready for Signoff on the Web-Team FY2022-23 Q3 Sprint 1 board.
Jdlrobson subscribed.

Thanks @nray! Leaving this open for a bit for any follow-up discussion.

Using sitespeed.io to measure client-side cookie strategy

I also thought it would be interesting to see whether sitespeed.io detected any major regression with the client-side cookie strategy. I ran the docker run --rm -v "$(pwd):/sitespeed.io" sitespeedio/sitespeed.io:26.1.0 http://host.docker.internal:8080/wiki/Test command with the feature flag off and on and compared the results:

$wgResourceLoaderClientPreferences = false

Results of report: https://before-profile-cookie.netlify.app/

before-site-speed-io.png (1×3 px, 483 KB)

$wgResourceLoaderClientPreferences = true

Results of report: https://after-profile-cookie.netlify.app/

after-site-speed-io.png (1×3 px, 515 KB)

Differences observed

First paint, first contentful paint, fully loaded, speed index, and HTML transfer size were all fairly similar. I didn't identify anything that would block the next step of the process - "Enable feature flag for small wikis (e.g. mediawiki.org, cawiki) and look at impact of synthetic tests".
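For reference, first paint and first contentful paint can also be read directly in the page via the Paint Timing API, independently of sitespeed.io. A minimal sketch:

```
// Sketch: read first-paint and first-contentful-paint from the Paint Timing API;
// these are the same render metrics sitespeed.io reports.
const paintObserver = new PerformanceObserver( ( list ) => {
	for ( const entry of list.getEntries() ) {
		// entry.name is 'first-paint' or 'first-contentful-paint'.
		console.log( `${ entry.name }: ${ entry.startTime.toFixed( 1 ) } ms` );
	}
} );
paintObserver.observe( { type: 'paint', buffered: true } );
```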

Hi @nray, cool, thanks for testing it out! There are a couple of things I'm thinking about. It's perfect that you looked in DevTools and used the CPU slowdown (the CPU slowdown works as it should, whereas the network slowdown in Chrome DevTools sucks :).

On the 50 ms limit for when a task counts as a long task: my thinking is that a 50 ms slowdown can make scrolling/clicking janky depending on when it happens, and that 50 ms depends on what CPU the device has, so slowing down the CPU as you did is a good approach. We measure CPU speed from some of our users, and I've been using those metrics to calibrate our test phones to match our 75th/95th-percentile users. I haven't done the same yet on desktop, but I will when we move the tests to bare metal. Once we have that, I'm fairly confident our numbers will match our users'.

So ... what I wanted to say is that in our new case, where the preferences are loaded before rendering starts, I don't think the 50 ms limit is a good threshold, since every X ms spent here postpones rendering by at least X. It's complicated by how the page renders: a small delay early on can make a bigger difference later, when the page actually renders. It's a complicated puzzle. For example, I've seen tests where we test the exact same thing and the difference in rendering is 0.5-1 second, depending on when the browser chooses to parse and execute CSS/JS. That's why I think it's important to look at the full picture (waterfalls and other metrics); let's do that together the next time we have something we want to try out.

The way we've been doing that is to push the change to a static directory on our production servers, where we create a version of what we want to try, and then point our measuring tools (like sitespeed.io) at it. That way we measure it in a more realistic scenario (our servers are serving the content), and we can let sitespeed.io run on a standalone server to minimise the noise. That's the best way to do it today, but we want to make it easier so you can test in a stable environment yourself in a really easy way; T285203 is a way forward for that.

Thank you for the info @Peter!

I also have a 2 GHz / 2 GB RAM Xiaomi Redmi 9A phone (it costs under $100) that I sometimes use for profiling to get a better idea of how low-powered devices respond. I was curious how it compared to the profile from my MacBook Pro M1 with the 6x CPU throttle, and again found only minor differences in script execution: 5.37 ms vs. 6.03 ms with the feature flag off vs. on, respectively (using the desktop view to see vector-2022 on the phone).

before-mobile-script.png (1×3 px, 413 KB)

after-mobile-script.png (1×3 px, 378 KB)

Also, thank you for mentioning the 50 ms threshold. I agree that this case is unique because of how far upstream it occurs, and that this probably isn't the best threshold to use here. Still, I've only seen minor differences in each profile I've looked at, which makes me think it's reasonable to enable the feature on group 0 wikis and monitor the synthetic tests/RUM metrics as a next step.

nray subscribed.