Deepcat SPARQL Exception: HTTP error: * HTTP request timed out.* There was a problem during the HTTP request: 0 Error
Closed, Resolved | Public | 2 Estimated Story Points | PRODUCTION ERROR

Description

Error
normalized_message
Deepcat SPARQL Exception: HTTP error: * HTTP request timed out.
* There was a problem during the HTTP request: 0 Error

Details

Request URL
https://en.wikipedia.org/w/index.php?advancedSearch-current=*&fulltext=*&ns0=*&profile=*&search=*&title=*

Event Timeline

This doesn't look like a generic problem, but rather one very specific to Category:People. This category has 303,848 sub-categories (out of ~2.2M total categories in enwiki) within a depth of 5. Notably, this means any results from deepcat:People would be highly incomplete even if the SPARQL query didn't time out. deepcat is limited to searching 256 actual categories at a time, so even if it worked, deepcat:People would search less than 0.1% of the related categories.
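As a quick check of the "less than 0.1%" figure, here is a minimal Python sketch of the arithmetic. The two numbers are taken from the comment above; the constant and function names are only illustrative and do not come from CirrusSearch.

```python
# Figures quoted in the comment above.
SUBCATEGORIES_WITHIN_DEPTH_5 = 303_848  # sub-categories of Category:People
DEEPCAT_CATEGORY_LIMIT = 256            # categories deepcat searches at a time


def coverage_fraction(limit: int, total: int) -> float:
    """Upper bound on the share of categories one deepcat query can cover."""
    return min(limit, total) / total


fraction = coverage_fraction(DEEPCAT_CATEGORY_LIMIT, SUBCATEGORIES_WITHIN_DEPTH_5)
# Roughly 0.084%, i.e. below the 0.1% mentioned above.
print(f"deepcat:People would cover at most {fraction:.3%} of the sub-categories")
```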

Perhaps we should recognize the timeout and give a slightly better message. Mostly it would say that the category is incompatible with deepcat.

Change 833462 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] deepcat: Improve error message on timeout

https://gerrit.wikimedia.org/r/833462

The patch gives a better error message. I spent some time digging into whether we could have a better method of detecting these errors, but it ends up being a bit of a rabbit hole. SparqlClient sets the HTTP client timeout and the backend SPARQL timeout to the same value, almost ensuring we get an HTTP timeout. This is relatively easy to adjust so that we get a timeout from the backend instead, but that doesn't play nicely with the HTTP client used. In particular, when the backend timeout is hit, the SPARQL endpoint issues a 500 error with a text/plain response that includes the source query along with the exception and a stack trace. The HTTP client library throws the response content away and gives SparqlClient only the 500 error, which isn't specifically actionable.

For now the best course of action seems to be to leave the existing timeout handling, which gets the HTTP client to time out first, and then adjust our warnings based on detecting the client timeout.
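To illustrate the approach described above, here is a rough Python sketch (not the actual CirrusSearch PHP code). It assumes the HTTP client can report whether its own timeout fired; `HttpOutcome`, `classify`, `warning_for`, and the message wording are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Failure(Enum):
    NONE = "none"
    CLIENT_TIMEOUT = "client_timeout"  # the HTTP client gave up first
    BACKEND_ERROR = "backend_error"    # e.g. a bare 500 with the body discarded


@dataclass
class HttpOutcome:
    """Hypothetical stand-in for what an HTTP client reports back."""
    status: int      # 0 when no response arrived, as in the "0 Error" above
    timed_out: bool  # True when the client-side timeout fired


def classify(outcome: HttpOutcome) -> Failure:
    # Check the client timeout first: it is the only specifically
    # actionable signal, since a 500 arrives without its response body.
    if outcome.timed_out:
        return Failure.CLIENT_TIMEOUT
    if outcome.status == 0 or outcome.status >= 400:
        return Failure.BACKEND_ERROR
    return Failure.NONE


def warning_for(outcome: HttpOutcome, category: str) -> Optional[str]:
    failure = classify(outcome)
    if failure is Failure.CLIENT_TIMEOUT:
        # Illustrative wording only, not the text used by the patch.
        return f"deepcat: category '{category}' is too large to search with deepcat."
    if failure is Failure.BACKEND_ERROR:
        return f"deepcat: SPARQL request failed (HTTP status {outcome.status})."
    return None
```

Because the client timeout is checked before the status code, a request that times out with status 0 is reported as a category-size problem rather than as a generic HTTP error.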

Change 833462 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] deepcat: Improve error message on timeout

https://gerrit.wikimedia.org/r/833462