This task tracks the removal of Grid Engine-specific bits from the Toolforge infrastructure. It is effectively blocked on migrating tools off the grid, which is tracked separately as T313405: Migrate remaining tools off Gridengine.
Before the Thursday shutdown window:
- Stop alerting
Plan for Thursday (C10):
- Run disable script for remaining tools to archive crontabs and add README disabled files
- Shut down VMs (P58776: SGE nodes to stop)
  - cat list-of-vms | xargs sudo OS_PROJECT_ID=tools wmcs-openstack server stop
- Merge dynamicproxy patch https://gerrit.wikimedia.org/r/c/operations/puppet/+/1010503
- Merge disable-tool patch https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/18
- Celebrate!
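For the VM shutdown step above, a more cautious variant of the one-liner could stop the instances one at a time and default to a dry run. This is only a sketch: the stop_vms function name and the DRY_RUN variable are made up for illustration, and it assumes list-of-vms holds one instance name per line (blank lines and lines starting with # are skipped). The wmcs-openstack invocation itself is the one from the plan.

```shell
#!/bin/bash
# Hypothetical helper, not part of any existing tooling.
# Reads instance names from a file (one per line) and stops each one.
# DRY_RUN defaults to 1, in which case the commands are printed, not run.
stop_vms() {
    local list_file="$1"
    local vm
    # "|| [ -n "$vm" ]" also handles a last line without a trailing newline.
    while IFS= read -r vm || [ -n "$vm" ]; do
        # Skip blank lines and comment lines.
        case "$vm" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo OS_PROJECT_ID=tools wmcs-openstack server stop $vm"
        else
            # </dev/null keeps sudo from consuming the rest of the list.
            sudo OS_PROJECT_ID=tools wmcs-openstack server stop "$vm" </dev/null
        fi
    done < "$list_file"
}
```

Running stop_vms list-of-vms prints the commands for review; DRY_RUN=0 stop_vms list-of-vms would actually execute them.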
Then afterwards:
- Remove remaining SGE mentions from Wikitech
- Rebuild bastions T314665: Toolforge: Replace all bastion with grid-less bookworm based bastion hosts
- Rebuild or otherwise replace checker hosts T313030: [toolforge.infra] Replace Toolschecker alerts with Prometheus based ones
- Update webservice to drop grid support https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/31
- Update src:toollabs, deploy updated misctools https://gerrit.wikimedia.org/r/c/labs/toollabs/+/1010517 https://gerrit.wikimedia.org/r/c/labs/toollabs/+/1010518
- Remove remaining Puppet classes
- Clean up obsolete hiera keys
- Remove cookbooks https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1010520/
- Archive infra tools T359934
- Archive gerrit repos T359935
- Close remaining Grid-Engine-to-K8s-Migration tasks, archive project
- Delete the shut-down instances
- Clean up server groups and security groups
- Clean up system directories from NFS
- Delete sgeadmin from LDAP