Final Plan:
We'll need to modify our workflow. Right now in RT, maint-announcements come in multiple times for a single event. We typically get an initial notification of maintenance, then any modifications made between the initial notification and the event, and then reminders. Some vendors send all three of these, some send only one.
RT allows tickets to be merged, so we merge the later tickets into the earliest ticket for the maintenance window. Merging is allowed in Phab as well, but it works slightly differently.
- Add the maint-announce project and have emails sent to it trigger creation of a new task tagged with the SRE and maint-announce projects, in the S6 space.
- This includes adding the new #maint-announce project to the Herald rule so that SRE always travels with it.
- Vendors/Carriers/Datacenters/Peers email in notices.
- Ops Clinic Duty person triages incoming notices.
- If the notice is new they do the following:
- Add to Operations tracking google calendar.
- Include the circuit IDs and Task # in the entry.
- Move notice from backlog/new to 'on calendar' on workboard; this move is shown on the task details. (This workboard doesn't yet exist.)
- Stall task until end of the maint-window; then resolve task.
- If the notice is a followup to an existing task:
- The original task is merged into the new task as a duplicate.
- If the date/time changed, the google calendar entry is updated.
- Include the circuit IDs and Task # in the entry.
- Move notice from backlog/new to 'on calendar' on workboard; this move is shown on the task details. (This workboard doesn't yet exist.)
In RT we merged the newer tickets into the original, because RT merges in content. Since Phabricator does not, keeping the new task open and merging the older task into it gives us the most up-to-date information immediately, without requiring anyone to manually copy data from one task to another.
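The triage steps and merge direction above can be sketched as a small in-memory model. This is only an illustration, not Phabricator or Google Calendar API code: `Task`, `Calendar`, `calendar_entry`, and `triage` are hypothetical names, the entry format is made up, and the sketch assumes the surviving task is stalled and its calendar entry rewritten in both the new-notice and follow-up cases.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for a maint-announce task."""
    task_id: int
    subject: str
    circuit_ids: List[str]
    column: str = "backlog/new"   # workboard column
    status: str = "open"
    merged_into: Optional[int] = None


@dataclass
class Calendar:
    """Hypothetical stand-in for the ops tracking calendar.

    Maps the task number tracking an event to the entry text.
    """
    entries: Dict[int, str] = field(default_factory=dict)


def calendar_entry(task: Task) -> str:
    # The entry carries the circuit IDs and the task number.
    return f"T{task.task_id} [{', '.join(task.circuit_ids)}] {task.subject}"


def triage(notice: Task, calendar: Calendar,
           original: Optional[Task] = None) -> None:
    """Triage one incoming notice.

    Pass `original` when the notice is a follow-up to an existing task.
    """
    if original is not None:
        # Follow-up: the older task is merged into the newer one as a
        # duplicate, so the newest information stays on the open task.
        original.status = "duplicate"
        original.merged_into = notice.task_id
        calendar.entries.pop(original.task_id, None)

    # Assumption: the entry is always rewritten, which also covers a
    # changed date/time and keeps the task number current.
    calendar.entries[notice.task_id] = calendar_entry(notice)
    notice.column = "on calendar"
    # Stalled until the end of the maintenance window, then resolved.
    notice.status = "stalled"
```

For a follow-up, calling `triage(new_task, calendar, original=old_task)` leaves `old_task` closed as a duplicate pointing at `new_task`, which is the reverse of the merge direction used in RT.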
Considerations:
- we need to forward maint-announce@rt to maint-announce@phab
- This does not include whitelisting the domains of our maint-announce vendors in advance. We are using a designated space for maint-announce items to contain them for search and security, and Phabricator natively provides the ability to specify a creation address per space. We could add whitelisting in the future with custom pre-Phabricator handling, but that would be prone to far more breakage. We also do not currently try to whitelist senders in RT, and it seems prudent to migrate the existing restrictions rather than change everything at once. The per-space creation address did not exist when the initial conversations happened many months ago, so this is a slight rethinking.
- we could later move this to phab's internal calendar
- #acl*operations-team is being used to secure the space and also is applied to tasks incoming to the space by default
This task will detail the overall migration of the maint-announce queue in RT into phabricator.
The maint-announces come in via an alias, and then are piped into RT. Once in RT, the ops clinic person triages announcements and ensures they are placed on the operations tracking gcal.
We'll need to relocate this queue/project into phabricator, as it is the last remaining use of RT.
Currently, maintenance notifications are triaged by our ops clinic duty person for the week. Their steps are detailed on https://wikitech.wikimedia.org/wiki/Ops_Clinic_Duty#Responsibilities
They include:
- Maintain the 'maint-announce' queue and calendar:
- This is the ONLY RT queue left for Ops Clinic Duty coverage.
- Modify ticket Subject to prepend dates of effect in big-endian order (ex: 2014-11-06 to 2014-11-09: Equinix chiller maintenance)
- Merge follow-up tickets as needed so that there is one per maintenance event
- There is [https://office.wikimedia.org/wiki/Office_IT/Calendars#Human_calendars a gcal shared with all WMF named 'Ops maintenance & contracts']. All maint-announce queue tickets should be entered into this calendar.
- Include the circuit IDs and RT#s in the entry. (See entries on 2014-10-07 for examples.)
- Update the ticket in RT from 'new' to 'open' and comment that it has been added to the ops tracking gcal.
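The subject convention in the list above (dates of effect prepended in big-endian order) can be illustrated with a short helper. This is a sketch only: `prefix_subject` is a hypothetical name, and the single-day form (one date with no "to") is an assumption, since the example in the list only shows a date range.

```python
from datetime import date
from typing import Optional


def prefix_subject(subject: str, start: date,
                   end: Optional[date] = None) -> str:
    """Prepend the dates of effect in big-endian (YYYY-MM-DD) order."""
    if end is None or end == start:
        # Assumption: single-day maintenance gets one date only.
        window = start.isoformat()
    else:
        window = f"{start.isoformat()} to {end.isoformat()}"
    return f"{window}: {subject}"


print(prefix_subject("Equinix chiller maintenance",
                     date(2014, 11, 6), date(2014, 11, 9)))
# prints "2014-11-06 to 2014-11-09: Equinix chiller maintenance"
```

Big-endian dates sort chronologically as plain strings, so a queue sorted by subject is also sorted by maintenance date.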
Please note that when this is done, RT can be made read-only and all mail relays from RT killed (NOT FORWARDED).
The maint-requests go to aliases, not to RT, so they can be redirected to Phabricator.