Investigate a PHP segmentation fault
Summary
- Install debugging packages: apt-get -y install php7.4-common-dbgsym php7.4-cli-dbgsym
- curl -o ~/php-gdbinit https://raw.githubusercontent.com/php/php-src/php-7.4.30/.gdbinit
- gdb <your php command>
- Enter run; once the command has failed, enter bt and zbacktrace
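The steps above can also be sketched as a single non-interactive gdb invocation. This is a minimal sketch, not the post's own method: it assumes the macros file was saved as ~/php-gdbinit (as in the curl step) and that it is loaded with gdb's -x option so that zbacktrace is defined. The helper name php_gdb_command and the example script.php are hypothetical. The helper only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: prints a gdb command line instead of running it.
# Assumption: the PHP gdb macros were downloaded to ~/php-gdbinit.
GDBINIT="${GDBINIT:-$HOME/php-gdbinit}"

# php_gdb_command is a hypothetical helper name, not part of PHP or gdb.
php_gdb_command() {
    # -batch : exit gdb once all commands have been processed
    # -x     : load the PHP macros file, which defines zbacktrace
    # -ex    : run the program, then print the C and PHP backtraces
    # --args : everything after it is the program and its arguments
    printf 'gdb -batch -x %s -ex run -ex bt -ex zbacktrace --args' "$GDBINIT"
    # Arguments are appended as-is; ones containing spaces would need quoting.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Example: show the command for a hypothetical failing script.
php_gdb_command php script.php
```

Copy the printed line into a shell on the affected host once the debug symbol packages are installed. In batch mode gdb prints both backtraces and exits on its own.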
CI: Get notified immediately when a job fails
If you've submitted patches for MediaWiki core, skins or extensions, you've seen this output in Gerrit:
Shrinking H2 database files
Our code review system, Gerrit, has several caches, the largest of which are backed by disk. The disk caches offload memory usage and persist the data between restarts. Because Gerrit is a Java application, the caches are stored in H2 database files, and I recently had to find out how to connect to them in order to inspect their content and reduce their size.
scap backport Makes Deployments Easy
MediaWiki developers, have you ever thought, “I wish I could deploy my own code for MediaWiki”? Now you can! More deploys! More fun!
Production Excellence #46: July & August 2022
How are we doing in our strive for operational excellence? Read on to find out!
Production Excellence #45: June 2022
How are we doing in our strive for operational excellence? Read on to find out!
Production Excellence #44: May 2022
How’d we do in our strive for operational excellence last month? Read on to find out!
GitLab-a-thon!
Release Engineering's "GitLab-a-thon" sprint for May 10th-24th (roughly) focused on the mechanics of migrating a Wikimedia service to GitLab, setting up a CI pipeline, building container images from that service, and publishing images to the Wikimedia registry. We selected the Blubber project as a good candidate for experimentation:
Production Excellence #43: April 2022
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #42: March 2022
How’d we do in our strive for operational excellence last month? Read on to find out!
What We Learned from Trainsperiment Week
Developers should own the process of putting their code into production. They should decide when to deploy, monitor their deployment, and make decisions about rollback.
A Trainsperiments Week Reflection
Over here in the Release-Engineering-Team, Train Deployment is usually a rotating duty. We've written about it before, so I won't go into the exact process, but I want to tell you something new about it.
Production Excellence #41: February 2022
How’d we do in our strive for operational excellence last month? Read on to find out!
GitLab: Rethinking how we handle access control
I'll start with a bit of general administrivia. First, our migration of Wikimedia code review & CI to GitLab continues, and we're mindful that people could use regular updates on progress. Second, I need to think through some stuff about the project, and doing that in writing is helpful for all involved. I'm going to try writing occasional blog entries here for both purposes.
Diving Into Our Deployment Data
If you’ve ever experienced the pride of seeing your name on MediaWiki's contributor list, you've been involved in our deployment process (whether you knew it or not).
Production Excellence #40: January 2022
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #39: December 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #38: November 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #37: October 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Benchmarking MediaWiki with PHPBench
This post gives a quick introduction to a benchmarking tool, phpbench, ready for you to experiment with in core and skins/extensions.[1]
Production Excellence #36: September 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
How we deploy code
Last week I spoke to a few of my Wikimedia Foundation (WMF) colleagues about how we deploy code—I completely botched it. I got too complex too fast. It only hit me later—to explain deployments, I need to start with a lie.
Production Excellence #35: August 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #34: July 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #33: June 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Shrinking the tasks backlog
The release engineering team triages tasks flagged Release-Engineering-Team on a weekly basis. It is an all-hands-on-deck, one-hour meeting in which we pick tasks one by one and find out what to do with them. We started with more than a hundred of them and are now down to just a dozen or so, most filed since the last meeting.
Production Excellence #32: May 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #31: April 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #30: March 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Tracking memory issue in a Java application
One of the critical pieces of our infrastructure is Gerrit. It hosts most of our git repositories and is the primary code review interface. Gerrit is written in the Java programming language, which runs in the Java Virtual Machine (JVM). For a couple of years we have been struggling with memory issues which eventually led to an unresponsive service and unattended restarts. The symptoms were the usual ones: application responses become slower and degrade until server-side errors render the service unusable. Eventually the JVM terminates with:
Production Excellence #29: February 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #28: January 2021
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #27: December 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #26: November 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Runnable runbooks
Recently there has been a small effort on the Release-Engineering-Team to encode some of our institutional knowledge as runbooks linked from a page in the team's wiki space.
Production Excellence #25: October 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #24: September 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
CI now updates your deployment-charts
If you're making changes to a service that is deployed to Kubernetes, it sure is annoying to have to update the helm deployment-chart values with the newest image version before you deploy. At least, that's how I felt when developing on our Dockerfile-generating service, Blubber.
Production Excellence #23: July & August 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #22: June 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Faster source code fetches thanks to git protocol version 2
In 2015 I noticed that git fetches from our most active repositories were unreasonably slow, sometimes taking up to a minute, which hindered fast development and collaboration. You can read some of the debugging I did at the time on T103990. Gerrit upstream was aware of the issue and a workaround was presented, though we never implemented it.
Production Excellence #21: May 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Celebrating 600,000 commits for Wikimedia
Earlier today, the 600,000th commit was pushed to Wikimedia's Gerrit server. We thought we'd take this moment to reflect on the developer services we offer and our community of developers, be they Wikimedia staff, third party workers, or volunteers.
Production Excellence #20: April 2020
How are we doing on that strive for operational excellence during these unprecedented times?
Production Excellence #19: February 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #18: January 2020
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #17: December 2019
How’d we do in our strive for operational excellence in November and December? Read on to find out!
Production Excellence #16: October 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #15: September 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Integrating code coverage metrics with your development workflow
In Changes and improvements to PHPUnit testing in MediaWiki, I wrote about efforts to help speed up PHPUnit code coverage generation for local development.[0] While this improves code coverage generation time for local development, it could be better.
Introducing Phatality
This past week marks the release of a little tool that I've been working on for a while. In fact, it's something I've wanted to build for more than a year. But before I tell you about the solution, I need to describe the problem that I set out to solve.
Production Excellence #14: August 2019
How’d we do in our strive for operational excellence in August? Read on to find out!
Production Excellence #13: July 2019
How’re we doing on that strive for operational excellence? Read this first anniversary edition to find out!
Production Excellence #12: June 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Changes and improvements to PHPUnit testing in MediaWiki
Building off the work done at the Prague Hackathon (T216260), we're happy to announce some significant changes and improvements to the PHP testing tools included with MediaWiki.
Production Excellence #11: May 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #10: April 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Introducing the codehealth pipeline beta
After many months of discussion, work and consultation across teams and departments[0], and with much gratitude and appreciation to the hard work and patience of @thcipriani and @hashar, the Code-Health-Metrics group is pleased to announce the introduction of the code health pipeline. The pipeline is currently in beta and enabled for GrowthExperiments, soon to be followed by Notifications, PageTriage, and StructuredDiscussions. (If you'd like to enable the pipeline for an extension you maintain or contribute to, please reach out to us via the comments on this post.)
Production Excellence #9: March 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Quibble hibernated, it is time to flourish
Writing blog posts is neither my job nor something that I enjoy, so I am late with the Quibble updates. The last one, Quibble in summer, was written in September 2018 and I forgot to publish it until now. You might want to read it first to get a glimpse of some nice changes that were implemented last summer.
Quibble in summer
Note: this post was published on 03/28 but was originally written in September 2018, after Quibble 0.0.26, and was never published at the time.
CI working group report, with recommendations of new tools to try
The working group to consider future CI tooling for Wikimedia has finished and produced a report. The report is at https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/CI_Futures_WG/Report and the short summary is that the release engineering team should do prototype implementations of Argo, GitLab CI/CD, and Zuul v3.
Production Excellence #8: February 2019
How’d we do in our strive for operational excellence? Read on to find out!
Help my CI job fails with exit status -11
For a few weeks, a CI job had PHPUnit tests abruptly ending with:
Work progresses on CI tool evaluation
The working group to consider future tooling for continuous integration is making progress (see previous blog post J148 for more information). We're looking at and evaluating alternatives and learning of new needs within WMF.
Choosing tools for continuous integration
The Release Engineering team has started a working group to discuss and consider our future continuous integration tooling. Please help!
Production Excellence #7: January 2019
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #6: December 2018
How’d we do in our strive for operational excellence last month? Read on to find out!
Gerrit now automatically adds reviewers
Code Health Metrics and SonarQube
- Code Health
Production Excellence #5: November 2018
How’d we do in our strive for operational excellence last month? Read on to find out!
Production Excellence #4: October 2018
How’d we do in our strive for operational excellence last month? Read on to find out!
Incident Documentation: An Unexpected Journey
The Release Engineering team wants to continually improve the quality of our software over time. One of the ways in which we hoped to do that this year is by creating more useful Selenium smoke tests. (From now on, test will be used instead of Selenium test.) This blog post is about how we determined where the tests should focus and the relative priority.
Bring in 'da noise, bring in defunct. It's a zombie party!
Halloween is a full two weeks behind us here in the United States, but it's still on my mind. It happens to be my favorite holiday, and I receive it both gleefully and somberly.
Wikimedia Release Engineering's 1st Annual Developer Satisfaction Survey
Production Excellence #3: September 2018
How’d we do in our strive for operational excellence last month? Read on to find out!
An introduction to Task Types in Phabricator
This blog post will describe a bit about how we are utilizing the "Task Types" feature in Phabricator to facilitate better tracking of work and to streamline workflows with custom fields. Additionally, I will be soliciting feedback about use cases which could take further advantage of this feature.
mediawiki_selenium 1.8.1 Ruby Gem Released
It has been a while since the last mediawiki_selenium release! 💎
Quibble in May
Quibble is the new test runner for MediaWiki (see the intro blog post, Introducing Quibble). This post gives an update on what happened during May 2018.
Technical Debt - The Contagion Effect
One particularly interesting topic discussed during the Hackathon Technical Debt session (T194934) was that of the contagious aspect of technical debt. Although this makes sense in hindsight, it's not something that I had really given much thought to previously.
Run Selenium tests using Quibble and Docker
Introducing Quibble
Running all tests for MediaWiki and matching what CI/Jenkins is running has been a constant challenge for everyone, myself included. Today I am introducing Quibble, a Python script that clones MediaWiki, sets it up and runs test commands.
Selenium tests in Node.js project retrospective
I have been working on the project with more or less focus on it since 2015. Maybe the easiest way to follow the project is by taking a look at a few epic tasks:
Phabricator Updates for February 2018
This is a digest of the updates from several weeks of changelogs which are published upstream. This is an incomplete list, as I've cherry-picked just the changes which I think will be of significant interest to end-users of Wikimedia's Phabricator. Please see the upstream changelogs for a detailed overview of everything that's changed recently.
Tech talk: Selenium tests in Node.js
Željko Filipin, Engineer (Contractor) from Release Engineering team. That's me! 👋
Selenium Ruby framework deprecation (September)
Originally an email sent on September 25 2017 to qa, engineering and wikitech-l mailing lists.
Selenium Ruby framework deprecation
Originally an email sent on August 23 2017 to qa, engineering and wikitech-l mailing lists.
Selenium tests in Node.js
Originally an email sent on April 3 2017 to qa, engineering and wikitech-l mailing lists.
New feature: Embed videos from Commons into Phabricator markup
I just finished deploying an update to Phabricator which includes a simple but rather useful feature:
Sponsored Phabricator Improvements
In T135327, the WMF Technical Collaboration team collected a list of Phabricator bugs and feature requests from the Wikimedia Developer Community. After identifying the most promising requests from the community, these were presented to Phacility (the organization that builds and maintains Phabricator) for sponsored prioritization.
Code Review Office Hours
Starting Thursday May 12th, 13:00 PDT (20:00 GMT), we will be having the first weekly Code Review office hours on freenode IRC in the #wikimedia-codereview channel.
What's new: Lots of improvements on phabricator.wikimedia.org
Not a lot has changed for Wikimedia's instance of Phabricator over the past few months. That's because a lot has been happening behind the scenes, as well as upstream at Phacility. Members of the Release-Engineering-Team and Team-Practices group have been working since December 2015 to integrate various upstream changes. However, nothing was released to our production instance because there were so many important features that were in progress and not yet fully usable. Additionally, we had to figure out exactly how these features would fit with the specific needs of our project and test a lot of functionality to be sure that we would not break anyone's workflows.
Occasional updates from the Release-Engineering-Team