[SPIKE] How does edit size vary by platform and region?
Closed, Resolved, Public

Description

This task involves learning how the size of the edits people publish to pages in the main namespace varies by:

  • Platform (desktop / mobile)
  • Editing interface (VE, wikitext)
  • Experience level (new, unregistered, junior)

Knowing the above will help the Editing Team develop a stronger sense of the kinds of edits the multi-Edit Check experience will need to accommodate (T347530) and, subsequently, the approach we will move forward with implementing.

Question(s) to be answered

  • 1. How does the proportion of edits by size that people publish in the main namespace vary by platform, editing interface, and experience level?

Requirements

Graph(s) showing how the size of edits (measured in bytes) people publish to pages in the main namespace varies along the following dimensions:

  • Platform (desktop / mobile)
  • Editing interface (VE, wikitext)
  • Experience level (new, unregistered, junior)
  • Region ("SSA" and "all regions")

If there is time, it would be wonderful to be able to explore this data independently in a Superset graph/visualization.


References

  • Perhaps what we're trying to learn through this ticket could benefit from prior work @Isaac did in T334760#8782740 and T293465 more broadly

Event Timeline

ppelberg created this task.
ppelberg moved this task from Backlog to Analytics on the Editing-team (Tracking) board.
MNeisler triaged this task as Medium priority. May 20 2024, 2:48 PM
MNeisler edited projects, added Product-Analytics (Kanban); removed Product-Analytics.
MNeisler moved this task from Doing to Needs Review on the Product-Analytics (Kanban) board.

I've created a dashboard to explore how edit sizes vary by the specified dimensions. The dashboard includes data on median edit size and the proportion of edits by size.

Some initial data insights:

  • Newcomers tend to make more large edits in the main namespace.
  • Contributors make both large and small edits on all interfaces and platforms. Contributors are slightly more likely to make large edits on desktop and when using VE.
  • Unregistered users are more likely to make small edits.
  • Overall findings align with the results reported in T334760#8782740 based on the Edit Type research.

A few notes/clarifications about the dataset:

  • I limited the dataset to main namespace edits on Wikipedia projects only. Edit sizes on other wiki projects such as Wikidata can differ substantially, so I recommend looking at edit sizes on those projects separately if needed.
  • Data comes from mediawiki_history, which does not have a tag to identify desktop edits. As a result, these edits are categorized as "Desktop and other", where "Other" includes edits made using various API tools/gadgets (such as Huggle, AWB, and sometimes ad hoc scripts). However, the majority of these edits can be expected to be desktop edits.
  • I reviewed the proportion of edits by the following edit size categories based on the distribution of edit size data:
    • small: absolute edit size of 10 or fewer bytes
    • medium: absolute edit size between 11 and 99 bytes
    • large: absolute edit size of 100 or more bytes
  • I used median instead of mean for this dataset to account for some very large outliers in edit size that inflated the average.
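As a rough illustration of the size categories and the median-versus-mean choice described above, here is a minimal Python sketch. The function names and the sample byte differences are made up for illustration; this is not the query behind the dashboard, only one way the thresholds could be applied.

```python
from collections import Counter
from statistics import mean, median


def size_category(byte_diff: int) -> str:
    """Bucket an edit by its absolute byte change (thresholds from the notes above)."""
    size = abs(byte_diff)
    if size <= 10:
        return "small"
    if size <= 99:
        return "medium"
    return "large"


def category_proportions(byte_diffs):
    """Return the share of edits falling into each size category."""
    counts = Counter(size_category(d) for d in byte_diffs)
    total = len(byte_diffs)
    return {
        cat: (counts[cat] / total if total else 0.0)
        for cat in ("small", "medium", "large")
    }


# Hypothetical byte differences; the 40000-byte edit stands in for a large outlier.
sample_diffs = [3, -8, 45, -60, 120, 250, -5, 40000]

proportions = category_proportions(sample_diffs)
abs_sizes = [abs(d) for d in sample_diffs]

print(proportions)
print("median:", median(abs_sizes), "mean:", mean(abs_sizes))
```

In this made-up sample the single outlier pulls the mean far above the median, which is the effect the median is meant to guard against.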

@ppelberg Let me know if you have any questions.

@MNeisler this looks great. No questions beyond the little bit of further investigation described below...

Per what we discussed offline today, we'd value knowing of the edits newcomers publish >100 bytes, how those edits are distributed by byte size. E.g., we'd like to be able to conclude something to the effect of, "Of the edits newcomers make that are >100 bytes, 88% of said edits are between 250 and 500 bytes."

NOTE: we'd also value having this refinement split by platform, wiki, region, and interface.

@ppelberg I've updated the dashboard to include a tab ("Edits by newcomers (over 100 bytes)") with charts showing the distribution of edits by newcomers that are over 100 bytes in size.

Filters on the dashboard can be used to view splits by platform, wiki, region and interface as needed.

Some initial insights:

Of all edits posted by newcomers that are > 100 bytes:

  • The median edit size is 423 bytes.
  • A little over half (55%) of these edits are under 500 bytes.
    • 26% of edits are between 100 and 200 bytes.
    • 29% of edits are between 251 and 500 bytes.
  • Only 12% of edits are over 3000 bytes.
  • The larger the edit, the more likely it was made on desktop (only 4% of edits over 3000 bytes were made on mobile web compared to 14% on desktop).
  • Trends vary on a per-wiki basis. On English Wikipedia, edit sizes by newcomers are smaller compared to overall trends: 60% of edits are between 100 and 500 bytes.
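A breakdown like the one above can be sketched in Python as follows. The bucket edges, the labels, and the sample data are assumptions chosen for illustration, as is treating "over 100 bytes" as an absolute size of 100 or more (to match the "large" category); the dashboard's actual bins may differ.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical inclusive upper bounds for each bucket, in bytes.
BUCKET_EDGES = [250, 500, 1000, 3000]
BUCKET_LABELS = ["100-250", "251-500", "501-1000", "1001-3000", "over 3000"]


def large_edit_bucket(size: int) -> str:
    """Map an absolute edit size (>= 100 bytes) to its bucket label."""
    # bisect_left keeps a size equal to an edge in the lower bucket (250 -> "100-250").
    return BUCKET_LABELS[bisect_left(BUCKET_EDGES, size)]


def large_edit_distribution(byte_diffs):
    """Share of large edits (absolute size >= 100 bytes) in each bucket."""
    sizes = [abs(d) for d in byte_diffs if abs(d) >= 100]
    counts = Counter(large_edit_bucket(s) for s in sizes)
    total = len(sizes)
    return {
        label: (counts[label] / total if total else 0.0)
        for label in BUCKET_LABELS
    }


# Made-up byte differences for newcomer edits; values under 100 bytes are ignored.
sample_diffs = [5, 120, 240, 300, 450, 800, 3500, -4000, 60]
distribution = large_edit_distribution(sample_diffs)

for label, share in distribution.items():
    print(f"{label}: {share:.0%}")
```

Running the same function on subsets of the data (for example, per platform or per wiki) would give the kind of split the dashboard filters provide.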

This is *precisely* the information we were seeking and the format we were seeking to access it using. Wonderful work, @MNeisler!