User Details
- User Since
- Nov 2 2020, 1:15 PM (203 w, 17 h)
- Availability
- Available
- IRC Nick
- gmodena
- LDAP User
- Gmodena
- MediaWiki User
- GModena (WMF)
Thu, Sep 19
I resumed working on this now that we have instrumentation for EventBus.
This task has been superseded by T363587: [Event Platform] Instrument EventBus with prometheus MW Statslib.
Mon, Sep 16
Fri, Sep 13
We mirror repos (read-only) from Gerrit to GitHub; is there a way we can do the same to GitLab? That could help soft-land this rollout.
@Snwachukwu in case you need inspiration for drafting a contribution policy for the repos you are migrating, these might help: https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python/-/blob/main/CONTRIBUTING.md and https://www.mediawiki.org/wiki/Platform_Engineering_Team/Event_Platform_Value_Stream/Contribution
@Fabfur thanks for the heads up. We'll need to coordinate a bit on the rollout, because the previous ingestion and processing infra has been decommissioned as part of T372456: Rollback haproxy feed automated ingestion.
Wed, Sep 11
Are you planning to start shipping to Kafka anytime soon? If so, it would be great to touch base and iron out the timelines, just to stay on the same page.
Tue, Sep 10
Mon, Sep 9
Fri, Sep 6
Thu, Sep 5
And page_change_kind? And maybe changelog_kind?
Wed, Sep 4
Ack. Investigating.
My original patch only addressed EventBus. I can't seem to find any new warn entry for that service in logstash.
Tue, Sep 3
Mon, Sep 2
I suppose we should target running the Flink reconciled enrichment job in dse-k8s at first anyway, even if we eventually decide to run the job multi-DC in wikikube. So let's go for it!
Wed, Aug 28
I don't know why matching on the job label alone did not work, but a more extensive match (all labels except instance, which is empty) did the trick:
@colewhite ack. Thanks for the heads up.
Tue, Aug 27
f/up from a convo with @fgiunchedi in IRC.
Mon, Aug 26
Aug 22 2024
Aug 20 2024
The alert compares the ratio of flink_taskmanager_job_task_operator_event_process_function_events_out_total over flink_taskmanager_job_task_operator_event_process_function_events_in_total.
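To make the comparison concrete, here is a minimal Python sketch of the arithmetic only. It is not the alert's actual rule: the function names, the 0.9 threshold, and the choice to stay silent when no events came in are all assumptions made for illustration. The inputs are assumed to be the increase of each counter over the same time window.

```python
from typing import Optional


def out_in_ratio(events_out: float, events_in: float) -> Optional[float]:
    """Ratio of events emitted to events consumed over one window.

    Returns None when nothing was consumed, since the ratio is undefined.
    """
    if events_in <= 0:
        return None
    return events_out / events_in


def should_alert(events_out: float, events_in: float, min_ratio: float = 0.9) -> bool:
    """True when the out/in ratio drops below min_ratio.

    min_ratio=0.9 is a hypothetical threshold, not the deployed value.
    """
    ratio = out_in_ratio(events_out, events_in)
    # Assumption: an idle window (no input) is not treated as a failure.
    return ratio is not None and ratio < min_ratio
```

With these assumptions, a window with 100 events in and 50 events out would fire, while a window with matching counts, or with no input at all, would not.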
I investigated both the affected Kafka topics and the related Hive event tables. Quick update with my findings so far.
Remove the Gobblin MapReduce job that loads Kafka topics.
Aug 19 2024
Aug 16 2024
Aug 15 2024
I'd also like to do 2 more things before we resolve this ticket:
Aug 14 2024
Ack. Thanks for the heads up and CRs, @fgiunchedi.
Jul 19 2024
Jul 18 2024
When Gobblin moves to Airflow, we can use Artifact sync to deploy the jar instead of relying on analytics/refinery.
A dashboard for EventBus is available in Grafana.
Jul 17 2024
After re-scoping both Config Store and MPIC, we decided to not move forward with refactoring ESC at this stage. A summary of this task is available on wikitech. A design doc is available at Stream Registry.
The DQ job and Airflow DAG have been updated. Deployment requires a new release of refinery-source and a DAG deployment on analytics.
The DQ job and Airflow DAG have been implemented. Deployment requires a new release of refinery-source and a DAG deployment on analytics.
Jul 16 2024
Jul 11 2024
Instrumentation has been enabled in beta. You can test it by modifying a page on https://simple.wikipedia.beta.wmflabs.org.
Jul 3 2024
Consume this new event stream