I do a call to the API like this one:
And I had two problems with it:
From the logfile of my script (times are UTC):
* 2019-09-25 09:42:05 Process Page Id:365531 Rev Id:192573058 Timestamp:20190925093700 Title:Liste der konsularischen Vertretungen in Hamburg
This call returned the content of the previous revision together with the revision id and timestamp of the newest revision. At least, that's what it looked like when I analyzed the data created by my script.
On that day the replag was very high. I wrote the following line in the chat:
[09:40:35] <Wurgl> https://tools.wmflabs.org/replag/ <-- 5 hours lag on enwiki? 7 hours on wikidata, others have 2 and 3 hours? What's going on here?
So I changed my code to retrieve the sha1 checksum too, compute the checksum of the returned content in my script, and compare the two. There was no problem for about a week, then suddenly my logfile showed another, similar case:
2019-10-03 11:18:30 Process Page Id:7645820 Rev Id:192815153 Timestamp:20191003111824 Title:MTV Eintracht Celle
2019-10-03 11:18:30 *** SHA1 does not match computed: da39a3ee5e6b4b0d3255bfef95601890afd80709 API: 628e76e06f3d101ee48121a03cfd10195f9dd784
So here the API returned, in one single call, content and a sha1 checksum which did not match. Even worse, the revision table does not hold either of these two checksums, so something is odd here.
Something is mixed up here. The returned sha1 checksum should always match the content, and the content should always be that of the reported timestamp and revision id.
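For reference, here is a minimal sketch in Python of the kind of check described above. My actual script is not shown here, so the function names and the example query parameters are illustrative assumptions, not the real code. One thing the sketch makes visible: the "computed" value in the log, da39a3ee5e6b4b0d3255bfef95601890afd80709, is the SHA-1 of empty input, which would be consistent with the call having delivered empty content for that revision.

```python
import hashlib

# Assumed shape of the request (hypothetical; the real call is not shown above).
# rvprop asks for ids, timestamp, content and sha1 in the same response.
EXAMPLE_PARAMS = {
    "action": "query",
    "prop": "revisions",
    "pageids": "7645820",
    "rvprop": "ids|timestamp|content|sha1",
    "rvslots": "main",
    "format": "json",
}


def sha1_hex(text: str) -> str:
    """Return the hex SHA-1 of the UTF-8 encoded text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def content_matches(content: str, api_sha1: str) -> bool:
    """True if the locally computed checksum equals the one the API reported."""
    return sha1_hex(content) == api_sha1.strip().lower()


# Values taken from the log line of 2019-10-03.
LOGGED_COMPUTED = "da39a3ee5e6b4b0d3255bfef95601890afd80709"
LOGGED_API = "628e76e06f3d101ee48121a03cfd10195f9dd784"

if __name__ == "__main__":
    # The locally computed checksum from the log equals SHA-1 of the empty string.
    print("computed == sha1(''):", sha1_hex("") == LOGGED_COMPUTED)
    # Empty content cannot match the checksum the API reported.
    print("empty content matches API sha1:", content_matches("", LOGGED_API))
```

Comparing against the checksum of the empty string separately may be worth doing in the log output, since it distinguishes "content was missing" from "content belonged to another revision".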