PHPUnit data providers should ideally return plain data, for the reasons explained in T332865. On top of that, they run so early that accessing MediaWikiServices is potentially unsafe, and the same goes for global variables. Therefore, we should disallow the use of MediaWikiServices::getInstance() and global variables in data providers, just as we already do in unit tests. The restriction could be enabled at the end of bootstrap.php, and access could then be re-enabled in a @beforeClass method in MediaWikiIntegrationTestCase.
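To illustrate the idea, here is a rough, self-contained sketch of the lock/unlock pattern. It deliberately does not use MediaWiki or PHPUnit: GuardedServices, provideTitles(), provideFromServices() and beforeClassHook() are all hypothetical names made up for this example, and the real implementation would presumably live in MediaWikiServices, bootstrap.php and MediaWikiIntegrationTestCase instead.

```php
<?php
// Hypothetical stand-in for a service container; NOT MediaWiki's actual API.
final class GuardedServices {
	private static bool $allowed = true;
	private static ?GuardedServices $instance = null;

	// Would be called at the end of the test bootstrap.
	public static function disallowAccess(): void {
		self::$allowed = false;
	}

	// Would be called from the integration test base class.
	public static function allowAccess(): void {
		self::$allowed = true;
	}

	public static function getInstance(): self {
		if ( !self::$allowed ) {
			throw new LogicException(
				'Service container accessed while access is disallowed (e.g. in a data provider)'
			);
		}
		return self::$instance ??= new self();
	}
}

// End of bootstrap: lock the container before any data provider runs.
GuardedServices::disallowAccess();

// A data provider that returns plain data is unaffected by the lock.
function provideTitles(): array {
	return [
		'simple' => [ 'Main Page' ],
		'namespaced' => [ 'Talk:Foo' ],
	];
}

// A data provider that touches the container now fails loudly.
function provideFromServices(): array {
	return [ [ GuardedServices::getInstance() ] ];
}

// Stand-in for the @beforeClass hook of the integration test base class.
function beforeClassHook(): void {
	GuardedServices::allowAccess();
}
```

With this in place, calling provideFromServices() before beforeClassHook() throws a LogicException, while provideTitles() keeps working. A test class that never goes through the integration base class would never unlock the container, so it would hit the same exception.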
As an additional benefit, this would allow us to catch (some) tests that depend on MediaWiki but do not extend MediaWikiIntegrationTestCase. Such tests are also unsafe because, among other things, they might access the real wiki database.
Note that it may be possible to implement this more elegantly in PHPUnit 10, thanks to the event system introduced in that version. However, I would suggest resolving this task before the upgrade and improving the implementation later. As noted in T332865, non-static data providers are deprecated in PHPUnit 10; additionally, returning complex object structures from a data provider massively slows the test runner down because it tries to stringify them (see T328919 for details). Since there is a correlation between non-static data providers and MediaWikiServices access, resolving this task would most likely help us with the PHPUnit 10 upgrade.