
Suppressed username shown on File pages
Open · Medium · Public · 6 Estimated Story Points

Assigned To
None
Authored By
csteipp
Feb 20 2015, 11:23 PM
Referenced Files
F183775: 0001-SECURITY-Add-img_deleted-column-b.patch
Jun 26 2015, 10:23 PM
F173490: 0001-Make-image-table-private.patch
Jun 3 2015, 1:21 AM
F173492: 0001-Hide-suppressed-usernames-in-image-table.patch
Jun 3 2015, 1:21 AM
F163752: 0001-SECURITY-Add-img_deleted-column-b.patch
May 11 2015, 11:45 PM
F95549: 0001-Hide-suppressed-usernames-in-image-table.patch
Mar 16 2015, 8:02 PM

Description

When a user is blocked with username suppression, the username is suppressed on old revisions of a file, but the current version still shows the (supposedly suppressed) username.

To reproduce,

  • Upload an image as a user
  • Upload a second version of the image
  • Login as an admin and block the uploading user, selecting "Hide username from edits and lists"
  • As a logged-out user, view the File page. The username is still visible on the current revision, but not for old revisions.

Example of affected file with a single revision: Schachboxen1.jpg
Example of affected file with multiple revisions: Toolse_linnuse_varemed.jpg

Patches

  • add image.img_deleted, populate it from ipblocks, update oldimage.oi_deleted/filearchive.fa_deleted based on ipblocks to account for past image -> oldimage/filearchive transitions
  • support revdel flags on current versions of images
  • invalidate image cache when needed
  • trigger cache invalidation from CentralAuth global locks
  • make image table dump private (patch made against the ariel branch of operations/dumps)
  • hide username in Labs replicas (patch made against operations/software)

One-time script files for WMF cluster update:

T90300-add-img_deleted.sql:

ALTER TABLE image ADD img_deleted tinyint unsigned NOT NULL default 0;

T90300-populate-img_deleted.sql:

UPDATE image JOIN ipblocks ON img_user_text = ipb_address
SET img_deleted = img_deleted | 12
WHERE ipb_deleted;

UPDATE oldimage JOIN ipblocks ON oi_user_text = ipb_address
SET oi_deleted = oi_deleted | 12
WHERE ipb_deleted;

UPDATE filearchive JOIN ipblocks ON fa_user_text = ipb_address
SET fa_deleted = fa_deleted | 12
WHERE ipb_deleted;
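
For reference, the constant 12 in these updates is the revision-deletion suppression bitmask: File::DELETED_USER (4) combined with File::DELETED_RESTRICTED (8). A minimal PHP sketch of the flag test (illustrative only, not part of any patch here):

// File's deletion bitfield mirrors Revision's:
// DELETED_FILE = 1, DELETED_COMMENT = 2, DELETED_USER = 4, DELETED_RESTRICTED = 8.
$mask = File::DELETED_USER | File::DELETED_RESTRICTED; // 4 | 8 == 12

// The uploader's name is hidden even from admins when both bits are set:
$suppressed = ( (int)$row->img_deleted & $mask ) === $mask;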

WMF cluster update plan

  1. run foreachdb sql.php /home/tgr/T90300-add-img_deleted.sql
  2. apply all patches
  3. run foreachdb sql.php /home/tgr/T90300-populate-img_deleted.sql
  4. run script via foreachwiki eval.php to invalidate cache (if we care enough):
$dbr = wfGetDB( DB_SLAVE );
$res = $dbr->select(
	'image',
	LocalFile::selectFields(),
	array( $dbr->bitAnd( 'img_deleted', 12 ) . ' = 12' ),
	'T90300'
);
foreach ( $res as $row ) {
	RepoGroup::singleton()->getLocalRepo()->newFileFromRow( $row )->invalidateCache();
}

Notices

  • removes public image table dumps (image table will still be available via Tool Labs)

Event Timeline

There are a very large number of changes, so older changes are hidden.

The patch to maintain-replicas has my +1. Please poke me when you want it applied, or, alternatively, anyone can do so from root on labstore1001 (the operations/software tree is cloned into root's home):

# ./maintain-replicas.pl -u image

(-u image tells the tool to forcibly recreate the view in all databases where it exists).

@Tgr, sorry for the slow review!

It didn't account for table prefixes for a few of the tables, so I made a new version and rebased it on master.

In testing this, I'm seeing something odd with the File: page and caching. When the same user uploads multiple versions of a file and I then block + hide that user, the old versions of the file on the File: page immediately show the correct "(username removed)" for the username, but the current revision still has the username visible. I can sometimes add purge=1 to the URL to get it to correctly hide the username, but sometimes I have to flush memcache.

So it seems like the archived versions of the file are flushing the File object out of cache, but the current version isn't. Any idea how to fix that?

csteipp triaged this task as Medium priority.
csteipp added a project: Security-Team.
csteipp moved this task from Incoming to In Progress on the Security-Team board.

I don't think old versions are cached at all (OldLocalFile::getCacheKey() should disable that). I don't see an easy fix - RevisionDeleteUser uses mass DB updates to set the flags, and there is no equivalent for the cache, so the only thing I can think of is a job that iterates through all images of the user and invalidates them. (Which means that for global suppression, a job will iterate through the wikis and fire off a thousand other jobs which iterate through the images... not sure if that would cause any complications. If it's problematic, we could check if the DB update touched any rows and only fire the job when needed.)

@aaron, any ideas on how to deal with this caching issue?

Why not just have RevisionDeleteUser iterate over the image rows to be changed and call invalidateCache()? That should be able to handle a fair number of edits (and it's not likely such users will have many files). AFAIK, the doLocalSuppression call can already run in jobs. If we just removed $wgCentralAuthWikisPerSuppressJob and always used the queue for simplicity, and pushed the jobs to the local wikis rather than the central queue, the cache purges would work. CentralAuthSuppressUserJob looks like it would work on local wikis.
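
A minimal sketch of that suggestion, iterating the affected image rows and purging each cached File object, using the 2015-era MediaWiki API; the helper name and its placement are assumptions, not part of any patch attached here:

// Hypothetical helper: purge the cached File objects of a suppressed user.
function purgeUserImageCache( $userName ) {
	$dbr = wfGetDB( DB_SLAVE );
	$res = $dbr->select(
		'image',
		LocalFile::selectFields(),
		array( 'img_user_text' => $userName ),
		__METHOD__
	);
	$repo = RepoGroup::singleton()->getLocalRepo();
	foreach ( $res as $row ) {
		// Drops the cache entry so the File: page re-reads img_deleted.
		$repo->newFileFromRow( $row )->invalidateCache();
	}
}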

I made https://gerrit.wikimedia.org/r/212579 to speed up such purges a bit.

Did that, don't really like the result:


@aaron, couldn't we expose the shared repo cache key to other wikis instead?

Fixed a stupid mistake in the Labs replication patch.

@Tgr, I applied the patch, which doesn't look like it fixed the issue for local suppression blocks. Is there a local patch similar to the CentralAuth patch, so that the cache is invalidated? Wasn't sure if I missed something or if you're still working on it.

How do you trigger a local suppression block? RevisionDeleteUser::invalidateCache() needs to be called, and right now that's the responsibility of the code that calls the other RevisionDeleteUser methods (and I only did it for CentralAuth). Maybe it should be called by those methods instead; I didn't want to do that because those work when called cross-wiki, invalidateCache does not, and that would have been confusing.
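
To illustrate the calling pattern described above, a sketch of a local suppression caller: RevisionDeleteUser::suppressUserName() is the existing core helper, while the invalidateCache() call and its signature are assumptions based on the patch under discussion:

$dbw = wfGetDB( DB_MASTER );
// Mass DB update that sets the suppression bits on the user's rows.
RevisionDeleteUser::suppressUserName( $userName, $userId, $dbw );
// Cache invalidation is left to the caller, since it only works on the
// local wiki, unlike the DB updates above (assumed signature).
RevisionDeleteUser::invalidateCache( $userName );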

Just do a normal block on a username, and there's an option to also suppress the username, if you have the right (iirc).

@csteipp: updated the patch (it's in the task summary). Looks like undoing a suppress block via Special:Block is currently broken, the patch fixes that as well.

In T90300#1332550, @Tgr wrote:

> Did that, don't really like the result:
>
> @aaron, couldn't we expose the shared repo cache key to other wikis instead?

Yes, though I think we should be careful about cross-wiki code, and the existing pattern seems a bit like a mistake. It's already there, though, and building on it is not intolerable. In any case it should either fully work cross-wiki or not pretend to.

There was a stray comma in the sql patchfile.

@Tgr, thanks for the new patches, I confirmed the issue is fixed now!

I'm a little worried about the move to only using jobs in CentralAuth -- @aaron, do those run in their own queue, so we can be sure that they're going to get run soon after the user is hidden?

If so, I'm happy to go ahead and deploy these.

> @Tgr, thanks for the new patches, I confirmed the issue is fixed now!
>
> I'm a little worried about the move to only using jobs in CentralAuth -- @aaron, do those run in their own queue, so we can be sure that they're going to get run soon after the user is hidden?
>
> If so, I'm happy to go ahead and deploy these.

They don't; they just use the regular high-priority queue. It might make sense to leave it that way unless it's too slow. If we have slow jobs in the high-priority queue that back up others, then they should be moved to the low-priority queue rather than making new queue loops, generally speaking.

@jcrespo / @Springle,

This all looks good from the security side, so I'd like to start getting the fixes rolled out. Tgr outlined the plan in the task description, but basically:

> WMF cluster update plan
>
>   1. run foreachdb sql.php /home/tgr/T90300-add-img_deleted.sql
>   2. apply all patches
>   3. run foreachdb sql.php /home/tgr/T90300-populate-img_deleted.sql
>   4. run script via foreachwiki eval.php to invalidate cache (if we care enough)

Can we start working on #1?

Unless @Springle is against it, I am ok with the intended work, but I would be against the method, especially for commons, which has 36 million rows in image -- that would disable file uploads for over 1 hour, plus create a 1-hour lag on the slaves. For other wikis with no images or a small number of them, it is ok (<1000 rows).

I would recommend using pt-online-schema-change for the larger tables (this table has a PRIMARY KEY). We can take care of this, with the usual questions (where and when?).

For the updates, make sure they are done in steps that each take <1 sec; split them into several smaller updates otherwise.

> Re: query optimization, there is a big difference between the IN () and the JOIN version (apparently MySQL is really stupid about handling uncorrelated IN subqueries), so the patches use joins, specifically the (both faster and more correct) username-based joins.

This is fixed in 5.6/10. We recently dropped 5.5 on the slaves, and will do so on the masters soon.

> Unless @Springle is against it, I am ok with the intended work, but I would be against the method, especially for commons, which has 36 million rows in image -- that would disable file uploads for over 1 hour, plus create a 1-hour lag on the slaves. For other wikis with no images or a small number of them, it is ok (<1000 rows).
>
> I would recommend using pt-online-schema-change for the larger tables (this table has a PRIMARY KEY). We can take care of this, with the usual questions (where and when?).

@jcrespo, I'd be happy for you or Sean to do this. It would need to be applied to all wikis, whenever you're able to schedule it.

> For the updates, make sure they are done in steps that each take <1 sec; split them into several smaller updates otherwise.

I'm not exactly clear on what you're suggesting here -- are you saying to set up a limit and repeatedly run this, where each run finishes in under 1 second? Something like running,

UPDATE image JOIN ipblocks ON img_user_text = ipb_address
SET img_deleted = img_deleted | 12
WHERE ipb_deleted AND img_deleted = 0
LIMIT <something sane that finishes under 1 second>;

Will apply the change, report back when finished.

> I'm not exactly clear on what you're suggesting here -- are you saying to set up a limit and repeatedly run this, where each run finishes in under 1 second? Something like running,
>
> UPDATE image JOIN ipblocks ON img_user_text = ipb_address
> SET img_deleted = img_deleted | 12
> WHERE ipb_deleted AND img_deleted = 0
> LIMIT <something sane that finishes under 1 second>;

That would probably work. If you want to be 100% sure of changes being applied in less than 1 second, page using the primary key (LIMIT could sometimes take too much time if a full table scan or a large range is done). Take a 1-second pause between updates, or monitor the lag during application. If that is too difficult, ask the DBAs for help after the code is applied.
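
Worth noting: MySQL rejects LIMIT on a multi-table UPDATE, so the LIMIT variant quoted above would not run as written; paging by the primary key (img_name on image), as suggested, sidesteps that. A sketch of such a batched run in 2015-era MediaWiki PHP (batch size and loop structure are assumptions, not from any patch here):

$dbw = wfGetDB( DB_MASTER );
$lastName = '';
do {
	// Fetch the next ~1000 image names, ordered by the primary key.
	$res = $dbw->select(
		'image',
		'img_name',
		array( 'img_name > ' . $dbw->addQuotes( $lastName ) ),
		__METHOD__,
		array( 'ORDER BY' => 'img_name', 'LIMIT' => 1000 )
	);
	$names = array();
	foreach ( $res as $row ) {
		$names[] = $row->img_name;
		$lastName = $row->img_name;
	}
	if ( $names ) {
		// Apply the JOIN update to this batch only.
		$dbw->query(
			'UPDATE image JOIN ipblocks ON img_user_text = ipb_address'
			. ' SET img_deleted = img_deleted | 12'
			. ' WHERE ipb_deleted AND img_name IN (' . $dbw->makeList( $names ) . ')',
			__METHOD__
		);
		wfWaitForSlaves(); // keep replication lag in check between batches
	}
} while ( count( $names ) === 1000 );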

@csteipp I've deployed it to testwiki. Sincerely, this seems like a very straightforward change (except for large tables), but if you want to do a final check or test the suggestions from my previous comment there (for example, 1000 rows updated at a time is a good starting point), I will wait a day before applying it to the rest of the wikis.

Deployment update:

The schema change was applied to all wikis in an online fashion (pt-online-schema-change), with the exception of commonswiki. The reason is that, due to the high SELECT rate on that table on commons, the change created high contention due to metadata locking, so I stopped and rolled back the change. Please note that the issue here is the schema change, not the resulting schema.

I am now applying the change in a rolling-update fashion, depooling and repooling one server at a time; hopefully that will avoid the load issues. The downside of this method is that the change takes 2-4 hours per server and has to be mostly serialized, so it will probably take me over a day to complete (but it is ongoing right now).

ALTER TABLE image ADD img_deleted tinyint unsigned NOT NULL default 0; (or more precisely, a complex list of operations with the same result but without affecting production) has been run on ALL WMF wikis.

I suppose that concludes the first step.

Be careful with Commons: there, image is a very busy 30-million row table, and even simple operations can generate lag.

> ALTER TABLE image ADD img_deleted tinyint unsigned NOT NULL default 0; (or more precisely, a complex list of operations with the same result but without affecting production) has been run on ALL WMF wikis.

You missed labswiki, which I just did (as there are only 260 rows there).

In my search to find out why this column had appeared on some wikis but not labswiki, I came across T38497 and T22742 which also suggested adding this column years ago.

@Tgr Just to clarify, are you still working on this? Does anyone know the status?

I think this was just waiting for the patches to be deployed (but that was 3 years ago, so they need a rebase / re-review now); not sure where / why exactly that got stuck. Are you still interested in getting it out? I'll be mostly on vacation until the end of next week, but can look at it after that.

Aklapper removed Tgr as the assignee of this task.Jun 19 2020, 4:25 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)

@sbassett, is the Security team interested in seeing this through? If not, can it be made public? (It's an infoleak, but a very minor one, and not triggered by user action, so public knowledge of the bug wouldn't make much difference.)
It was a nontrivial amount of work, and the patch clears up a significant amount of tech debt in the image table, so I'd like to see it through one way or another (now that I remember it exists... thanks for the reminder, @Aklapper).

@Tgr - moving this to our clinic meeting today. If the work can go through gerrit sooner rather than later, that sounds like the easier path to get it deployed, especially for what I'd personally consider a fairly low-risk vulnerability.

@Tgr - Security-Team is fine with this being public, given what we perceive to be a fairly minor information disclosure, especially if the work can proceed sooner rather than later within gerrit. I'll make this public now.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".Jul 6 2020, 3:24 PM
sbassett moved this task from Incoming to Watching on the Security-Team board.

Change 609798 had a related patch set uploaded (by Reedy; owner: CSteipp):
[mediawiki/core@master] Add img_deleted column

https://gerrit.wikimedia.org/r/609798

Seems the img_deleted column was only recently removed from WMF wikis... T250055: Remove image.img_deleted column from production

> @Tgr - Security-Team is fine with this being public, given what we perceive to be a fairly minor information disclosure, especially if the work can proceed sooner rather than later within gerrit. I'll make this public now.

Thanks! Will try to re-test things in the next week or two.

Aklapper changed the edit policy from "Custom Policy" to "All Users".Jan 23 2023, 8:06 PM

Change #609798 abandoned by Reedy:

[mediawiki/core@master] Add img_deleted column

https://gerrit.wikimedia.org/r/609798