Server Admin Log

From Wikitech

2024-12-02

  • 15:47 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 15:46 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 15:42 volans: uploaded spicerack_9.0.0 to apt.wikimedia.org bullseye-wikimedia
  • 15:42 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 15:42 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 15:32 taavi@deploy2002: Finished scap sync-world: Backport for wikitech: Drop contentadmin group (T375950) (duration: 09m 42s)
  • 15:29 sukhe: sudo cumin -b1 -s10 "A:cp" 'run-puppet-agent --enable "merging CR 1091748"'
  • 15:26 taavi@deploy2002: taavi: Continuing with sync
  • 15:26 taavi@deploy2002: taavi: Backport for wikitech: Drop contentadmin group (T375950) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:24 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: [done] testing CR 1091748]
  • 15:22 taavi@deploy2002: Started scap sync-world: Backport for wikitech: Drop contentadmin group (T375950)
  • 15:17 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: testing CR 1091748]
  • 15:14 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1091748"' [trafficserver: remove inbound TLS and related settings]
  • 15:08 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1006.eqiad.wmnet with OS bookworm
  • 15:03 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1018.eqiad.wmnet wikikube-worker1006.eqiad.wmnet on all recursors
  • 15:03 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1018.eqiad.wmnet wikikube-worker1006.eqiad.wmnet on all recursors
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1018 to wikikube-worker1006
  • 15:01 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1006
  • 14:59 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1006
  • 14:59 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:58 marostegui: Deploy schema change on db1167 dbmaint eqiad - s8 sanitarium master, there will be days of lag in wikireplicas in s8 T367856
  • 14:57 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:53 sukhe@cumin1002: START - Cookbook sre.dns.netbox
  • 14:50 sukhe: running authdns-update for CR 1099713
  • 14:44 jelto@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:43 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976) (duration: 19m 08s)
  • 14:36 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
  • 14:34 moritzm: installing curl security updates
  • 14:29 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 14:29 jiji@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc-gp[1001-1003].eqiad.wmnet
  • 14:29 jiji@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:28 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1018 to wikikube-worker1006
  • 14:27 urbanecm@deploy2002: migr, urbanecm: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:27 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:23 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] testwiki: Enable Surfacing structured tasks (T379976), Prepare for surfacing structured tasks (squashed) (T379976)
  • 14:17 urbanecm@deploy2002: Finished scap sync-world: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075) (duration: 14m 37s)
  • 14:11 urbanecm@deploy2002: urbanecm, daimona: Continuing with sync
  • 14:08 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp[1001-1003].eqiad.wmnet
  • 14:07 urbanecm@deploy2002: urbanecm, daimona: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:03 urbanecm@deploy2002: Started scap sync-world: Backport for Drop $wgWikimediaCampaignEventsEnableCommunityList (T380075)
  • 14:00 moritzm: removing ganeti1020 from active Ganeti nodes T378921
  • 13:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main1007.eqiad.wmnet
  • 13:57 jiji@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main1007.eqiad.wmnet
  • 13:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1003,1008].eqiad.wmnet with reason: Hardware refresh
  • 13:51 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1003,1008].eqiad.wmnet with reason: Hardware refresh
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P71471 and previous config saved to /var/cache/conftool/dbconfig/20241202-134648-root.json
  • 13:46 isaranto@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:41 isaranto@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[1002,1007].eqiad.wmnet with reason: Hardware refresh
  • 13:37 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[1002,1007].eqiad.wmnet with reason: Hardware refresh
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P71470 and previous config saved to /var/cache/conftool/dbconfig/20241202-133143-root.json
  • 13:31 effie: replacing kafka-main1003 in production with kafka-main1008 - T363214
  • 13:30 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1018.eqiad.wmnet
  • 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:29 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1018.eqiad.wmnet
  • 13:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc-gp[2002-2003].codfw.wmnet
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:24 jiji@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp[2002-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 13:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:21 jiji@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp[2002-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 13:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:18 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1005.eqiad.wmnet
  • 13:18 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1005.eqiad.wmnet
  • 13:17 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P71469 and previous config saved to /var/cache/conftool/dbconfig/20241202-131638-root.json
  • 13:06 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:01 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp[2002-2003].codfw.wmnet
  • 13:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P71467 and previous config saved to /var/cache/conftool/dbconfig/20241202-130132-root.json
  • 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1020.eqiad.wmnet
  • 12:22 topranks: re-routing traffic from Drmrs towards TECHLIB-TCZ - AS2852 - National Library of Technology, Prague, to avoid path via GEANT
  • 12:18 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006].codfw.wmnet
  • 12:18 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006].codfw.wmnet
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc-gp2001.codfw.wmnet
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:13 jiji@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 12:06 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2005.codfw.wmnet with OS bookworm
  • 12:05 jiji@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc-gp2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002"
  • 12:04 jelto: homer 'cr*eqiad*' commit 'T377876'
  • 12:02 jiji@cumin1002: START - Cookbook sre.dns.netbox
  • 12:01 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1005.eqiad.wmnet with OS bookworm
  • 11:57 moritzm: upload mapnik 4.0.3+ds-2~wmf12u2 (adding a forward ported mapnik-config script to be consumed by node-mapnik even with the switch of mapnik 4 towards pkg-config) T327396
  • 11:56 jiji@cumin1002: START - Cookbook sre.hosts.decommission for hosts mc-gp2001.codfw.wmnet
  • 11:56 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2006.codfw.wmnet with OS bookworm
  • 11:55 marostegui: Stop mariadb on es2020 to clone es2041 T381259
  • 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1070.eqiad.wmnet
  • 11:52 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1070.eqiad.wmnet
  • 11:46 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2005.codfw.wmnet with reason: host reimage
  • 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum two overlarge container dbs
  • 11:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1070.eqiad.wmnet with reason: vacuum two overlarge container dbs
  • 11:42 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1005.eqiad.wmnet with reason: host reimage
  • 11:42 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2005.codfw.wmnet with reason: host reimage
  • 11:38 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1005.eqiad.wmnet with reason: host reimage
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es2020.codfw.wmnet with reason: cloning
  • 11:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on es2020.codfw.wmnet with reason: cloning
  • 11:36 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2006.codfw.wmnet with reason: host reimage
  • 11:33 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2006.codfw.wmnet with reason: host reimage
  • 11:26 ladsgroup@deploy2002: Finished scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250) (duration: 11m 21s)
  • 11:23 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2005
  • 11:23 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2005
  • 11:23 topranks: rollback OSPF metric change on cr4-ulsfo to place all codfw to eqsin traffic back on primary transport link
  • 11:22 jelto@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker1005.eqiad.wmnet with OS bookworm
  • 11:21 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2020 T381259', diff saved to https://phabricator.wikimedia.org/P71463 and previous config saved to /var/cache/conftool/dbconfig/20241202-112105-marostegui.json
  • 11:19 ladsgroup@deploy2002: abi, ladsgroup: Continuing with sync
  • 11:19 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2005
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2005.codfw.wmnet 40.32.192.10.in-addr.arpa 0.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2005.codfw.wmnet 40.32.192.10.in-addr.arpa 0.4.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:19 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2005 - jayme@cumin2002"
  • 11:19 ladsgroup@deploy2002: abi, ladsgroup: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:19 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2005 - jayme@cumin2002"
  • 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:15 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2088.codfw.wmnet with OS bullseye
  • 11:15 jelto@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubernetes1017.eqiad.wmnet wikikube-worker1005.eqiad.wmnet on all recursors
  • 11:15 ladsgroup@deploy2002: Started scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250)
  • 11:14 jelto@cumin1002: START - Cookbook sre.dns.wipe-cache kubernetes1017.eqiad.wmnet wikikube-worker1005.eqiad.wmnet on all recursors
  • 11:14 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:14 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2005
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2006
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2006
  • 11:13 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2006
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2006.codfw.wmnet 141.32.192.10.in-addr.arpa 1.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:13 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2006.codfw.wmnet 141.32.192.10.in-addr.arpa 1.4.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:13 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2006 - jayme@cumin2002"
  • 11:13 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2006 - jayme@cumin2002"
  • 11:09 ladsgroup@deploy2002: Started scap sync-world: Backport for Translate: Disable message group subscription feature for legalteamwiki (T372386 T381250)
  • 11:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:07 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes1017 to wikikube-worker1005
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1005
  • 11:05 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 11:05 jelto@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1005
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 jelto@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1017 to wikikube-worker1005 - jelto@cumin1002"
  • 11:04 jelto@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes1017 to wikikube-worker1005 - jelto@cumin1002"
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2005.codfw.wmnet with OS bookworm
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2006
  • 11:02 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2006.codfw.wmnet with OS bookworm
  • 11:01 jayme@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2005.codfw.wmnet wikikube-worker2006.codfw.wmnet on all recursors
  • 11:01 jayme@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2005.codfw.wmnet wikikube-worker2006.codfw.wmnet on all recursors
  • 11:00 jelto@cumin1002: START - Cookbook sre.dns.netbox
  • 11:00 jelto@cumin1002: START - Cookbook sre.hosts.rename from kubernetes1017 to wikikube-worker1005
  • 10:55 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2437 to wikikube-worker2006
  • 10:55 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2006
  • 10:54 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2006
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:54 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2437 to wikikube-worker2006 - jayme@cumin2002"
  • 10:54 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2437 to wikikube-worker2006 - jayme@cumin2002"
  • 10:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2436 to wikikube-worker2005
  • 10:52 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2005
  • 10:51 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:51 jayme@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2005
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 jayme@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2436 to wikikube-worker2005 - jayme@cumin2002"
  • 10:50 jayme@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2436 to wikikube-worker2005 - jayme@cumin2002"
  • 10:48 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 10:46 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2437 to wikikube-worker2006
  • 10:46 jayme@cumin2002: START - Cookbook sre.dns.netbox
  • 10:46 jayme@cumin2002: START - Cookbook sre.hosts.rename from mw2436 to wikikube-worker2005
  • 10:45 jelto@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes1017.eqiad.wmnet
  • 10:45 jelto@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes1017.eqiad.wmnet
  • 10:44 ladsgroup@deploy2002: Finished scap sync-world: Backport for Enable new ParserCache key schema on every page (T373037) (duration: 17m 25s)
  • 10:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1017.eqiad.wmnet with OS bookworm
  • 10:37 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 10:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2088.codfw.wmnet with OS bullseye
  • 10:33 ladsgroup@deploy2002: ladsgroup: Backport for Enable new ParserCache key schema on every page (T373037) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:31 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2087.codfw.wmnet with OS bullseye
  • 10:26 ladsgroup@deploy2002: Started scap sync-world: Backport for Enable new ParserCache key schema on every page (T373037)
  • 10:16 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2437.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:16 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2436.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 10:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 10:12 marostegui: Deploy schema change on db1167 - s8 sanitarium master, there will be days of lag in wikireplicas in s8 T367856
  • 10:12 marostegui@cumin2002: dbctl commit (dc=all): 'Depool db1167 for an alter table', diff saved to https://phabricator.wikimedia.org/P71461 and previous config saved to /var/cache/conftool/dbconfig/20241202-101225-marostegui.json
  • 10:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: alter
  • 10:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: alter
  • 10:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 8:00:00 on db1167.eqiad.wmnet with reason: alter
  • 10:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 8:00:00 on db1167.eqiad.wmnet with reason: alter
  • 10:09 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
  • 10:05 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2087.codfw.wmnet with reason: host reimage
  • 10:04 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2437.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 10:03 jayme@cumin2002: START - Cookbook sre.hosts.provision for host mw2436.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:56 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host pc1017.eqiad.wmnet with OS bookworm
  • 09:52 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[2436-2437].codfw.wmnet with reason: rename/reimage
  • 09:52 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[2436-2437].codfw.wmnet with reason: rename/reimage
  • 09:52 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2087.codfw.wmnet with OS bullseye
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:48 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw[2436-2437].codfw.wmnet
  • 09:47 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw[2436-2437].codfw.wmnet
  • 09:45 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2086.codfw.wmnet with OS bullseye
  • 09:45 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 09:45 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 09:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: optimizing
  • 09:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: optimizing
  • 09:42 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 09:41 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:39 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:35 marostegui: Installing mariadb 10.6.20 on db1198 T378940
  • 09:28 marostegui@cumin2002: dbctl commit (dc=all): 'Depoll db1198 to install 10.6.20', diff saved to https://phabricator.wikimedia.org/P71460 and previous config saved to /var/cache/conftool/dbconfig/20241202-092854-marostegui.json
  • 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1198.eqiad.wmnet with reason: testing
  • 09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1198.eqiad.wmnet with reason: testing
  • 09:24 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
  • 09:20 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2086.codfw.wmnet with reason: host reimage
  • 09:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1020.eqiad.wmnet
  • 09:09 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:59 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:52 dcausse: restarting blazegraph on wdqs1019 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 08:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:36 kartik@deploy2002: Finished scap sync-world: Backport for Translate: Enable message group subscription feature for some wikis (T372386) (duration: 23m 39s)
  • 08:35 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ms-be2086.codfw.wmnet with OS bullseye
  • 08:29 kartik@deploy2002: abi, kartik: Continuing with sync
  • 08:25 kartik@deploy2002: abi, kartik: Backport for Translate: Enable message group subscription feature for some wikis (T372386) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:19 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1020.eqiad.wmnet
  • 08:12 kartik@deploy2002: Started scap sync-world: Backport for Translate: Enable message group subscription feature for some wikis (T372386)
  • 08:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1020.eqiad.wmnet
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1009.eqiad.wmnet
  • 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1009.eqiad.wmnet
  • 08:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1017.eqiad.wmnet with reason: T378068, host is not pooled
  • 08:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1017.eqiad.wmnet with reason: T378068, host is not pooled
  • 08:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled
  • 08:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled
  • 05:20 TimStarling: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol=https
  • 05:14 TimStarling: on mwmaint2002: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=idwikivoyage --force-protocol=https
  • 04:41 TimStarling: installed id.wikivoyage.org
  • 04:39 TimStarling: on db2123: grant alter ON `%wik%`.* TO `wikiadmin2023`@`10.%`
  • 04:26 tstarling@deploy2002: Finished scap sync-world: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726) (duration: 31m 05s)
  • 04:13 tstarling@deploy2002: tstarling: Continuing with sync
  • 04:12 tstarling@deploy2002: tstarling: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 03:55 tstarling@deploy2002: Started scap sync-world: Backport for Create id.wikivoyage.org (T380726 T352113), Add messages for Indonesian Wikivoyage (idwikivoyage) (T380726)

2024-12-01

  • 23:53 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 23:52 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 23:50 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1156 gradually with 4 steps - Maint over (T381213)
  • 13:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1233 gradually with 4 steps - Maint over (T381213)
  • 12:31 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db1156 gradually with 4 steps - Maint over (T381213)
  • 12:16 ladsgroup@cumin1002: START - Cookbook sre.mysql.pool db1233 gradually with 4 steps - Maint over (T381213)
  • 12:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1156.eqiad.wmnet onto db1233.eqiad.wmnet
  • 10:45 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db1156.eqiad.wmnet onto db1233.eqiad.wmnet
  • 10:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool to reclone (T381213)', diff saved to https://phabricator.wikimedia.org/P71451 and previous config saved to /var/cache/conftool/dbconfig/20241201-104441-ladsgroup.json
  • 06:18 marostegui@cumin2002: dbctl commit (dc=all): 'Depoll db1233', diff saved to https://phabricator.wikimedia.org/P71450 and previous config saved to /var/cache/conftool/dbconfig/20241201-061841-marostegui.json

Archives

See Server Admin Log/Archives.