Happening since 2016-11-13:
https://logstash.wikimedia.org/goto/1accd04515820244e3afc443d7fdeed0
{ "_index": "logstash-2016.12.01", "_type": "mediawiki", "_id": "AVi5awFtNPFiRo8stLUd", "_score": null, "_source": { "message": "PageAssessmentsBody::insertRecord\t10.64.32.27\t1062\tDuplicate entry '42117544-47' for key 'PRIMARY' (10.64.32.27)\tINSERT INTO `page_assessments` (pa_page_id,pa_project_id,pa_class,pa_importance,pa_page_revision) VALUES ('42117544','47','C','Unknown','751323561')", "@version": 1, "@timestamp": "2016-12-01T08:05:46.000Z", "type": "mediawiki", "host": "mw1165", "level": "ERROR", "tags": [ "syslog", "es", "es" ], "channel": "DBQuery", "normalized_message": "{fname}\t{db_server}\t{errno}\t{error}\t{sql1line}", "url": "/rpc/RunJobs.php?wiki=enwiki&type=refreshLinks&maxtime=30&maxmem=300M", "ip": "127.0.0.1", "http_method": "POST", "server": "127.0.0.1", "referrer": null, "wiki": "enwiki", "mwversion": "1.29.0-wmf.3", "reqId": "d111c21a32b3ab1725b3c415", "db_server": "10.64.32.27", "db_name": "enwiki", "db_user": "wikiuser", "method": "Database::reportQueryError", "errno": 1062, "error": "Duplicate entry '42117544-47' for key 'PRIMARY' (10.64.32.27)", "sql1line": "INSERT INTO `page_assessments` (pa_page_id,pa_project_id,pa_class,pa_importance,pa_page_revision) VALUES ('42117544','47','C','Unknown','751323561')", "fname": "PageAssessmentsBody::insertRecord" }, "fields": { "@timestamp": [ 1480579546000 ] }, "sort": [ 1480579546000 ] }
Well-written code should check for these duplicates in advance. If records can be deleted concurrently, run SELECT ... FOR UPDATE within a transaction to ensure consistency. If these are not real errors, we could use INSERT IGNORE or REPLACE, but an explicit check, inside or outside a transaction, would be preferred. A rough sketch of both options follows.
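For illustration only, here is a minimal SQL sketch of the two approaches against the table from the log entry above. The column values are copied from that example record, and it assumes the PRIMARY key is (pa_page_id, pa_project_id), as the duplicate entry '42117544-47' suggests; this is not the actual PageAssessments code path.

  -- Option 1: check (and lock) any existing row inside a transaction before writing.
  BEGIN;
  SELECT pa_class, pa_importance
      FROM page_assessments
      WHERE pa_page_id = 42117544 AND pa_project_id = 47
      FOR UPDATE;
  -- If no row came back, INSERT; otherwise UPDATE the locked row instead.
  COMMIT;

  -- Option 2a: silently skip the insert when the row already exists.
  INSERT IGNORE INTO page_assessments
      (pa_page_id, pa_project_id, pa_class, pa_importance, pa_page_revision)
      VALUES (42117544, 47, 'C', 'Unknown', 751323561);

  -- Option 2b: keep the newest values when the row already exists.
  INSERT INTO page_assessments
      (pa_page_id, pa_project_id, pa_class, pa_importance, pa_page_revision)
      VALUES (42117544, 47, 'C', 'Unknown', 751323561)
      ON DUPLICATE KEY UPDATE
          pa_class = VALUES(pa_class),
          pa_importance = VALUES(pa_importance),
          pa_page_revision = VALUES(pa_page_revision);

Option 1 is the most explicit and keeps the decision in application code; options 2a/2b avoid the error entirely but hide whether a conflict actually happened.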
This is causing log spam and making it harder to debug real problems.