gmodena (GModena (WMF))
User

User Details

User Since
Nov 2 2020, 1:15 PM (203 w, 17 h)
Availability
Available
IRC Nick
gmodena
LDAP User
Gmodena
MediaWiki User
GModena (WMF)

Recent Activity

Thu, Sep 19

gmodena updated the task description for T345195: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake?.
Thu, Sep 19, 1:56 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena added a comment to T345195: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake?.

I resumed working on this now that we have instrumentation for EventBus.

Thu, Sep 19, 1:55 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena created T375197: [Event Platform] We should alert on EventBus performance degradation..
Thu, Sep 19, 1:50 PM · Data-Engineering, Event-Platform
gmodena added a comment to T347484: [Event Platform] Can we import metrics from logstash to promethues?.

This task has been superseded by {T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib}.

Thu, Sep 19, 1:05 PM · Data-Engineering, Event-Platform
gmodena closed T347484: [Event Platform] Can we import metrics from logstash to promethues?, a subtask of T345195: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake?, as Declined.
Thu, Sep 19, 1:03 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena closed T347484: [Event Platform] Can we import metrics from logstash to promethues? as Declined.
Thu, Sep 19, 1:03 PM · Data-Engineering, Event-Platform
gmodena moved T345195: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake? from Next Up to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Thu, Sep 19, 1:00 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena moved T345195: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake? from SDS3.3 - Data Quality to Q1 2024 July 1st - September 30th on the Data-Engineering board.
Thu, Sep 19, 12:59 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena moved T368787: Flink job to enrich reconciliation events from In progress to In Review on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Thu, Sep 19, 12:01 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena moved T368787: Flink job to enrich reconciliation events from In Process to Code Review / Tech Input on the Dumps 2.0 (Kanban Board) board.
Thu, Sep 19, 11:38 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena created T375176: Enable HA for the mw-dump-rev-content-reconcile-enrich flink application.
Thu, Sep 19, 11:35 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board)
gmodena updated the task description for T368787: Flink job to enrich reconciliation events.
Thu, Sep 19, 11:27 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena added a comment to T366836: Migrate Event Platform Schema Respositories to Gitlab.

@gmodena , We have the mediawiki group and wmde group (recently added by developer-experience) available in gitlab. So I have added them to the members of the Secondary event schema repo in Gitlab and given them the developer role.

So everyone in that group should be able to clone the repo and create an MR directly and merge once approved. No need to fork + MR.

Let me know if this is okay.

Thu, Sep 19, 8:35 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena updated the task description for T368787: Flink job to enrich reconciliation events.
Thu, Sep 19, 8:31 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)

Mon, Sep 16

gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

Change #1070252 merged by jenkins-bot:

[mediawiki/extensions/EventBus@master] Fix buggy events_accepted_total metric

https://gerrit.wikimedia.org/r/1070252

Mon, Sep 16, 6:30 PM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena added a project to T374811: Prepare EventBus for temp accounts: Event-Platform.
Mon, Sep 16, 10:01 AM · Event-Platform, Data-Engineering, Temporary accounts

Fri, Sep 13

gmodena added a comment to T366836: Migrate Event Platform Schema Respositories to Gitlab.

We mirror repos (read-only) from Gerrit to GitHub. Is there a way we can do the same to GitLab? That could help soft-land this rollout.

Fri, Sep 13, 11:39 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena added a comment to T366836: Migrate Event Platform Schema Respositories to Gitlab.

@Snwachukwu if you need inspiration for drafting a contribution policy for the repos you are migrating, these might help: https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python/-/blob/main/CONTRIBUTING.md and https://www.mediawiki.org/wiki/Platform_Engineering_Team/Event_Platform_Value_Stream/Contribution

Fri, Sep 13, 10:42 AM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena added a comment to T374473: Prepare puppet configuration to send haproxy logs to haproxykafka socket.

@Fabfur thanks for the heads up. We'll need to coordinate a bit on roll out, because the previous ingestion and processing infra has been decommissioned as part of T372456: Rollback haproxy feed automated ingestion.

Fri, Sep 13, 8:01 AM · Data Products, Data-Engineering, Traffic

Wed, Sep 11

gmodena added a comment to T374473: Prepare puppet configuration to send haproxy logs to haproxykafka socket.

Are you planning to start shipping to Kafka anytime soon? If so, it would be great to touch base and iron out the timelines, just to stay on the same page.

Wed, Sep 11, 9:46 AM · Data Products, Data-Engineering, Traffic

Tue, Sep 10

gmodena updated subscribers of T374341: [SPIKE] how can we support Spark producer/consumers in Event Platform.
Tue, Sep 10, 10:58 AM · Dumps 2.0, Data-Engineering, Event-Platform
gmodena updated the task description for T374341: [SPIKE] how can we support Spark producer/consumers in Event Platform.
Tue, Sep 10, 10:39 AM · Dumps 2.0, Data-Engineering, Event-Platform
gmodena added a comment to T374341: [SPIKE] how can we support Spark producer/consumers in Event Platform.

@dcausse, @pfischer and I had a chat about this task today. Here are some notes from our conversation:

Tue, Sep 10, 10:37 AM · Dumps 2.0, Data-Engineering, Event-Platform

Mon, Sep 9

gmodena added a project to T326875: Update Data Engineering-owned products that may be affected by IP Masking: Event-Platform.
Mon, Sep 9, 8:03 PM · Event-Platform, Data-Engineering, Temporary accounts
gmodena moved T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code from Next Up to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Mon, Sep 9, 2:17 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena moved T368782: MediaWiki Reconciliation API from In progress to Blocked/Paused on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Mon, Sep 9, 2:11 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board)
gmodena created T374359: Update eventutilities_python wrappers to support Flink 1.20.
Mon, Sep 9, 1:45 PM · Data-Engineering, Event-Platform
gmodena created T374341: [SPIKE] how can we support Spark producer/consumers in Event Platform.
Mon, Sep 9, 9:56 AM · Dumps 2.0, Data-Engineering, Event-Platform

Fri, Sep 6

gmodena added a comment to T374055: Spike: Figure out if we have everything we need in Spark to emit page_change_late events.

nit: this said, I would prefer not to use page_change to describe the Dumps 2.0 streams.

Generally +1. But, do you think we can use approximately the same data model, perhaps even using some of the MW state entity schema fragments? Where it is easy to do so anyway?

Fri, Sep 6, 6:33 AM · Dumps 2.0 (Kanban Board)

Thu, Sep 5

gmodena added a comment to T374055: Spike: Figure out if we have everything we need in Spark to emit page_change_late events.

And page_change_kind? and maybe changelog_kind?

Thu, Sep 5, 11:56 AM · Dumps 2.0 (Kanban Board)

Wed, Sep 4

gmodena created P68647 Page Change Reconciliation Event (delete action).
Wed, Sep 4, 2:16 PM
gmodena updated the task description for T368787: Flink job to enrich reconciliation events.
Wed, Sep 4, 2:04 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena updated the task description for T368787: Flink job to enrich reconciliation events.
Wed, Sep 4, 2:02 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena moved T345195: [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake? from Sprint Backlog to In Process on the Dumps 2.0 (Kanban Board) board.
Wed, Sep 4, 12:59 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena moved T368782: MediaWiki Reconciliation API from In Process to Paused on the Dumps 2.0 (Kanban Board) board.
Wed, Sep 4, 12:58 PM · Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board)
gmodena moved T368787: Flink job to enrich reconciliation events from Sprint Backlog to In Process on the Dumps 2.0 (Kanban Board) board.
Wed, Sep 4, 12:58 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena added a comment to T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value..

Looks like the spike is only related to CirrusSearch?

Yes, this is my understanding as well. It is quite noisy, so it might hide others. I'm shipping a fix, so hopefully we'll get some more clarity soon.

Wed, Sep 4, 9:41 AM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error
gmodena updated subscribers of T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value..

Ack. Investigating.
My original patch only addressed EventBus. I can't seem to find any new warn entry for that service in logstash.

Wed, Sep 4, 8:37 AM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error

Tue, Sep 3

gmodena updated Other Assignee for T368787: Flink job to enrich reconciliation events, added: xcollazo.
Tue, Sep 3, 3:26 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena updated the task description for T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Tue, Sep 3, 1:25 PM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena updated the task description for T368787: Flink job to enrich reconciliation events.
Tue, Sep 3, 10:36 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)

Mon, Sep 2

gmodena added a comment to T368787: Flink job to enrich reconciliation events.

I suppose we should target running the Flink reconciliation enrichment job in dse-k8s at first anyway, even if we decide we want to run the job multi-DC in wikikube eventually. So let's go for it!

Mon, Sep 2, 11:06 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)

Wed, Aug 28

gmodena claimed T368787: Flink job to enrich reconciliation events.
Wed, Aug 28, 7:05 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena updated Other Assignee for T368787: Flink job to enrich reconciliation events, added: gmodena.
Wed, Aug 28, 7:04 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena moved T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value. from In progress to Done on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Wed, Aug 28, 2:34 PM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error
gmodena moved T372456: Rollback haproxy feed automated ingestion from In Review to Done on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Wed, Aug 28, 9:44 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena updated the task description for T372456: Rollback haproxy feed automated ingestion.
Wed, Aug 28, 9:42 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T372456: Rollback haproxy feed automated ingestion.

I don't know why matching on the job label alone did not work, but a more extensive match (all labels except instance, which is empty) did the trick:

Wed, Aug 28, 9:42 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value..

@colewhite ack. Thanks for the heads up.

Wed, Aug 28, 9:33 AM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error
gmodena moved T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value. from Incoming (new tickets) to Q1 2024 July 1st - September 30th on the Data-Engineering board.
Wed, Aug 28, 8:53 AM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error
gmodena moved T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value. from Next Up to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Wed, Aug 28, 8:52 AM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error
gmodena claimed T373086: PHP Warning: Stats: Cannot associate label keys with label values: Not all initialized labels have an assigned value..
Wed, Aug 28, 8:49 AM · MW-1.43-notes (1.43.0-wmf.21; 2024-09-03), Discovery-Search, CirrusSearch, Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform, Wikimedia-production-error

Tue, Aug 27

gmodena moved T367403: Validate CI integration so that Ci can release Maven artifacts on user's demand from In Review to Done on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Tue, Aug 27, 2:20 PM · Release-Engineering-Team (Radar), Data-Engineering (Q1 2024 July 1st - September 30th), Java-Scala-Standardization, Discovery-Search, Data-Platform-SRE
gmodena updated the task description for T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.
Tue, Aug 27, 9:51 AM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena updated subscribers of T372456: Rollback haproxy feed automated ingestion.

Follow-up from a conversation with @fgiunchedi on IRC.

Tue, Aug 27, 9:48 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena updated the task description for T372456: Rollback haproxy feed automated ingestion.
Tue, Aug 27, 9:36 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Mon, Aug 26

gmodena updated the task description for T372456: Rollback haproxy feed automated ingestion.
Mon, Aug 26, 7:18 PM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Aug 22 2024

gmodena created T373112: [NEEDS INVESTIGATION][BUG] eventutilities_python operator metrics.
Aug 22 2024, 1:59 PM · Data-Platform, Data-Engineering, Event-Platform
gmodena moved T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing from In progress to In Review on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 22 2024, 1:44 PM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Aug 20 2024

gmodena added a comment to T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing.

The alert compares the ratio of flink_taskmanager_job_task_operator_event_process_function_events_out_total over flink_taskmanager_job_task_operator_event_process_function_events_in_total.
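
As an illustration of the ratio described above, here is a minimal sketch in Python. It assumes the two counters have already been sampled over the same time window; the function names and the 0.99 threshold are hypothetical and are not taken from the actual alert definition.

```python
from typing import Optional


def availability_ratio(events_out: float, events_in: float) -> Optional[float]:
    """Return events_out / events_in, or None when nothing was consumed.

    events_out and events_in stand for increases of the
    ..._events_out_total and ..._events_in_total counters over one window.
    """
    if events_in <= 0:
        # No input events: the ratio is undefined, so report no value.
        return None
    return events_out / events_in


def should_fire(events_out: float, events_in: float, threshold: float = 0.99) -> bool:
    """Hypothetical alert condition: fire when the ratio drops below threshold."""
    ratio = availability_ratio(events_out, events_in)
    return ratio is not None and ratio < threshold


if __name__ == "__main__":
    # 95 events emitted for 100 consumed gives a ratio below the assumed threshold.
    print(availability_ratio(95, 100), should_fire(95, 100))
```

Returning None for an empty window keeps a quiet stream from being reported as an outage; whether the real alert treats that case the same way is not stated here.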

Aug 20 2024, 11:04 AM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena updated subscribers of T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing.
Aug 20 2024, 10:34 AM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing.

I investigated both the affected Kafka topics and the related Hive event tables. Here is a quick update with my findings so far.

Aug 20 2024, 10:34 AM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena created P67396 page_content_change events out / in ratio.
Aug 20 2024, 9:57 AM
gmodena updated subscribers of T372456: Rollback haproxy feed automated ingestion.

Remove the Gobblin MapReduce job that loads Kafka topics.

Aug 20 2024, 9:43 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Aug 19 2024

gmodena moved T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing from Sprint Backlog to In Process on the Dumps 2.0 (Kanban Board) board.
Aug 19 2024, 6:53 PM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena moved T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing from Incoming to Kanban Board on the Dumps 2.0 board.
Aug 19 2024, 6:52 PM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena created T372768: [BUG] MediawikiPageContentChangeEnrichAvailability is firing.
Aug 19 2024, 1:50 PM · Patch-For-Review, Dumps 2.0 (Kanban Board), Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena moved T372456: Rollback haproxy feed automated ingestion from In progress to In Review on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 19 2024, 1:45 PM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Aug 16 2024

gmodena updated the task description for T372456: Rollback haproxy feed automated ingestion.
Aug 16 2024, 9:17 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Aug 15 2024

gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

I'd also like to do 2 more things before we resolve this ticket:

Aug 15 2024, 12:48 PM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Aug 14 2024

gmodena added a comment to T372456: Rollback haproxy feed automated ingestion.

@gmodena I just pushed https://gerrit.wikimedia.org/r/c/operations/puppet/+/1062707, it prob has to go before the .pull file is removed.

Aug 14 2024, 2:06 PM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T354255: Alert in need of triage: AlertLintProblem (instance localhost:9123).

Ack. Thanks for the heads up and CRs @fgiunchedi

Aug 14 2024, 11:11 AM · SRE Observability (FY2024/2025-Q1), sre-alert-triage
gmodena awarded T354255: Alert in need of triage: AlertLintProblem (instance localhost:9123) a Stroopwafel token.
Aug 14 2024, 11:09 AM · SRE Observability (FY2024/2025-Q1), sre-alert-triage
gmodena updated the task description for T372456: Rollback haproxy feed automated ingestion.
Aug 14 2024, 8:59 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena set the point value for T372456: Rollback haproxy feed automated ingestion to 1.
Aug 14 2024, 8:16 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena moved T372456: Rollback haproxy feed automated ingestion from Next Up to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Aug 14 2024, 8:15 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena created T372456: Rollback haproxy feed automated ingestion.
Aug 14 2024, 8:11 AM · Patch-For-Review, Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)

Jul 19 2024

gmodena added a comment to T346046: [Search Update Pipeline] Source streams for private wikis.

Couple of WIP patches up for discussion.

The second depends on the first.

@gmodena I like the way this is headed, but I'm not sure if following through (and fully deprecating EventBusFactory) is worth it. Let's get together and discuss.

Jul 19 2024, 10:52 AM · Data-Engineering (Q1 2024 July 1st - September 30th), MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Discovery-Search (Current work), CirrusSearch

Jul 18 2024

gmodena added a comment to T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.

When gobblin moves to airflow, we can use Artifact sync to deploy the jar, instead of relying on analytics/refinery.

Jul 18 2024, 11:14 AM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena updated the task description for T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.
Jul 18 2024, 11:14 AM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

A dashboard for EventBus is available in Grafana

Jul 18 2024, 9:38 AM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena updated the task description for T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Jul 18 2024, 9:37 AM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Jul 17 2024

gmodena created T370368: [NEEDS GROOMING] We should improve the code health of gobblin-wmf.
Jul 17 2024, 9:22 PM · Event-Platform, Data-Engineering (Q1 2024 July 1st - September 30th)
gmodena added a comment to T365005: Evaluate ESC and explore an alternative design..

After re-scoping both Config Store and MPIC, we decided to not move forward with refactoring ESC at this stage. A summary of this task is available on wikitech. A design doc is available at Stream Registry.

Jul 17 2024, 12:15 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena added a comment to T362785: Add host level instrumentation on webrequest.

DQ job and airflow dag have been updated. Deployment requires a new release of refinery-source and a dag deployment on analytics.

Jul 17 2024, 11:16 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena updated the task description for T362785: Add host level instrumentation on webrequest.
Jul 17 2024, 11:14 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena added a comment to T362783: Add instrumentation for actor signatures.

DQ job and airflow dag have been implemented. Deployment requires a new release of refinery-source and a dag deployment on analytics.

Jul 17 2024, 11:13 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena updated the task description for T362783: Add instrumentation for actor signatures.
Jul 17 2024, 11:12 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena moved T362785: Add host level instrumentation on webrequest from In Review to Ready to Deploy on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Jul 17 2024, 11:11 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review
gmodena moved T362783: Add instrumentation for actor signatures from In Review to Ready to Deploy on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Jul 17 2024, 11:11 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review

Jul 16 2024

gmodena moved T365005: Evaluate ESC and explore an alternative design. from In progress to Done on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Jul 16 2024, 2:13 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Event-Platform
gmodena moved T360968: [Developer Experience] [SPIKE] Investigate process to automate deployment of folders and artifacts to HDFS from In Review to In progress on the Data-Engineering (Q1 2024 July 1st - September 30th) board.
Jul 16 2024, 2:11 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Release-Engineering-Team, Spike
gmodena created P66612 Page Change Reconciliation Event (create action).
Jul 16 2024, 11:50 AM
gmodena created P66609 Page Change Reconciliation Event (move action).
Jul 16 2024, 11:40 AM
gmodena created P66596 Page Change Reconciliation Event (edit action).
Jul 16 2024, 10:00 AM

Jul 11 2024

gmodena added a comment to T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.

Instrumentation has been enabled in beta. You can test it by modifying a page on https://simple.wikipedia.beta.wmflabs.org.

Jul 11 2024, 9:25 AM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform
gmodena created P66281 eventbus metrics (beta).
Jul 11 2024, 9:23 AM
gmodena updated the task description for T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Jul 11 2024, 9:19 AM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Patch-For-Review, Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board), Event-Platform

Jul 3 2024

gmodena added a comment to T368787: Flink job to enrich reconciliation events.

But can we assume the stream will be produced directly into jumbo, and won't have multi dc / replication requirements?

I think we decided we want to do reconciliation of page_change and page_content_change in general, so that it can be used for Search and others.

Jul 3 2024, 2:00 PM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)
gmodena added a comment to T368787: Flink job to enrich reconciliation events.

Consume this new event stream

Jul 3 2024, 9:09 AM · Data-Engineering (Q1 2024 July 1st - September 30th), Patch-For-Review, Dumps 2.0 (Kanban Board)