pip_parse does not respect requirement markers #1105

arrdem · 2023-03-01T00:48:09Z

🐞 bug report

Affected Rule

The issue is caused by the rule: pip_parse (all versions)

Is this a regression?

It appears to be. Previously, rules_python did evaluate platform markers see nikhaldi@323e84e

In 0.16, python/pip_install/extract_wheels/parse_requirements_to_bzl.py will gladly parse a requirements file which uses markers, but does nothing with them. Nor does the new Starlark parser introduced in ba69aec.

It appears this functionality was simply never implemented in the new parser, and was deleted in 37e7e68 when the legacy pip_import implementation was removed.

Description

rules_python makes no attempt to evaluate requirement marker expressions, which leads to false build failures due to attempting to install packages marked as not viable because pip wheel succeeds with a no-op and an error message when presented with a requirement which will be ignored due to markers.

🔬 Minimal Reproduction

When handed a requirements.txt file which uses platform markers to select a different version of an artifact depending on markers everything appears okay. Consider

matplotlib==3.4.3 ; sys_platform != "darwin" or platform_machine != "arm64"
matplotlib==3.4.3.post1+abnormal ; sys_platform == "darwin" and platform_machine == "arm64"

The correct behavior for this requirements file should be to choose between these two versions. Note that the boolean selector condition is disjoint. However the actual behavior is to unconditionally attempt to install the first requirement clause listed in the file. This breaks both bazel fetch //... and any rules referencing the requirement("matplotlib").

The behavior should be to evaluate all the marker expressions and select matching versions/args.

It may be valid for no versions of a package to be selected. Consider tensorflow-macos==2.7.0 ; sys_platform == "darwin". Arguably this should be achieved with platform-specific lockfiles and requirements_darwin, but it shouldn't be an error.

If multiple requirements are selected, that should likely be an error.

🔥 Exception or Error

ERROR: $REPO/BUILD:1:11: $TARGET depends on @pypi_py38_darwin_arm64_matplotlib//:pkg in repository @pypi_py38_darwin_arm64_matplotlib which failed to fetch. no such package '@pypi_py38_darwin_arm64_matplotlib//': whl_library pypi_py38_darwin_arm64_matplotlib failed:
Ignoring matplotlib: markers 'sys_platform != "darwin" or platform_machine != "arm64"' don't match your environment
 (Traceback (most recent call last):
  File "/Users/arrdem/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/arrdem/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/private/var/tmp/_bazel_arrdem/64fd9f51655edad1a031ed416b537465/external/rules_python/python/pip_install/extract_wheels/extract_single_wheel.py", line 105, in <module>
    main()
  File "/private/var/tmp/_bazel_arrdem/64fd9f51655edad1a031ed416b537465/external/rules_python/python/pip_install/extract_wheels/extract_single_wheel.py", line 92, in main
    whl = next(iter(glob.glob("*.whl")))
StopIteration
)

The text was updated successfully, but these errors were encountered:

arrdem · 2023-03-03T18:14:25Z

While implementing support for marker evaluation back on 0.18.1 or before looks quite doable using the underlying pip machinery as before, I don't see a good way to add support for it into the new starlark parser as doing so would require teaching starlark how to both parse and evaluate the marker grammar or putting a Python interpreter back in the parsing flow which would defeat the apparent purpose of the starlark parser.

aignas · 2023-03-13T05:00:50Z

@arrdem, does your usecase work better if you have the os specific requirements files as implemented in #1123 and you keep the platform specific includes in the requirements.in as opposed to requirements.txt file?

mvgijssel · 2023-03-14T12:41:09Z

Just ran into this issue as well trying to install different versions of torch for different architectures (aarch64 vs x86_64)

A requirements lock file that looks like

torch @ https://download.pytorch.org/whl/cpu/torch-1.13.1%2Bcpu-cp39-cp39-linux_x86_64.whl ; platform_machine == "x86_64" and python_full_version == "3.9.10" \
    --hash=sha256:71636a5c21927236f4974d2355fb3f66a0b707c28219b0135ff65ed0f0e61287
torch @ https://download.pytorch.org/whl/torch-1.13.1-cp39-cp39-manylinux2014_aarch64.whl ; platform_machine == "aarch64" and python_full_version == "3.9.10" \
    --hash=sha256:2c3581a3fd81eb1f0f22997cddffea569fea53bafa372b2c0471db373b26aafc

works locally on a M1 Macbook pro with an arm architecture, but fails in the GitLab CI which is using x86_64 with the error

00:01:12 ERROR: /builds/oVEuX7FM/0/predict/BUILD.bazel:201:11: //predict:lib depends on @predict_torch//:pkg in repository @predict_torch which failed to fetch. no such package '@predict_torch//': whl_library predict_torch failed: Ignoring torch: markers 'platform_machine == "aarch64" and python_full_version == "3.9.10"' don't match your environment

Using the latest rules_python

http_archive(
    name = "rules_python",
    sha256 = "ffc7b877c95413c82bfd5482c017edcf759a6250d8b24e82f41f3c8b8d9e287e",
    strip_prefix = "rules_python-0.19.0",
    url = "https://github.com/bazelbuild/rules_python/releases/download/0.19.0/rules_python-0.19.0.tar.gz",
)

arrdem · 2023-03-14T16:31:54Z

does your usecase work better if you have the os specific requirements files as implemented in #1123 and you keep the platform specific includes in the requirements.in as opposed to requirements.txt file?

@aignas No, we have architecture specific backport builds which need to be selected in support of Apple ARM hardware and Amazon Graviton. In working through options for how to fit this into the existing rules_python infrastructure the #1123 per-OS requirements selectors helps, but it's not immediately obvious to me that teaching the pip_parse macro to do os+arch based selection would be a good pattern. It seems better to use markers for that, rather than adding full select() style support for arbitrary (kernel, arch) tuples which is the real need here.

arrdem · 2023-03-15T18:42:31Z

Slept on this a bit - I could see an argument that we should deploy consistent builds of Python artifacts across architectures so we don't have version differences. That'd get us close to being able to just leverage the current requirements_linux= etc. features.

The kicker is that we do have architecture-specific requirements within a given OS. @mvgijssel's example is legitimate, tensorflow for example also has architecture-dependent requirement within a given host platform/kernel. So I don't think we can get away from something markers-equivalent.

arrdem · 2023-05-23T19:06:50Z

Hacked around this by writing a python_config repository rule which can use the repository_ctx with access to the host arch and OS and starlark conditionals to generate a config.bzl template file with appropriate contents, simplifying all the multiple-python/pip setup malarky down to

load("//tools/rules:python_config.bzl", "python_config")

python_config(
    name = "pycfg",
    version = python_version,
)

load("@pycfg//:config.bzl", "environment", "extra_pip_args", "requirement_clusters", "requirements_lock")

pip_parse(
    name = "pypi",
    environment = environment,
    extra_pip_args = extra_pip_args,
    requirement_clusters = requirement_clusters,
    requirements_lock = requirements_lock,
)

load("@pypi//:requirements.bzl", pypi_install_deps = "install_deps")

pypi_install_deps()

OliverFM · 2023-09-26T09:44:50Z

Any progress on this issue @arrdem – I couldn't see if you had merged this in, since it looks like, if it is merged, the name has changed.

If not could you share an example of this kind of rule?
In particular it would be great to change these outputs dependent on whether or not CUDA is present.

github-actions · 2024-03-24T22:50:24Z

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

github-actions · 2024-04-24T22:50:22Z

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

keith · 2024-06-17T22:13:21Z

I hit this case too, looks like I could duplicate the requirements files in order to pass different files to requirements_by_platform, but that's a lot of files to keep up to date

aignas · 2024-07-12T08:22:02Z

Note, that there are more comments/discussion on this in #1868.

This is just a small PR to reduce the scope of bazelbuild#2059. Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.

…around This is extra preparation needed for bazelbuild#2059. Summary: - Create `pypi_repo_utils` for more ergonomic handling of Python in repo context. - Split the resolution of requirements files to platforms into a separate function to make the testing easier. This also allows more validation that was realized that there is a need for in the WIP feature PR. - Make the code more robust about the assumption of the target platform label. Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.

…around (#2069) This is extra preparation needed for #2059. Summary: - Create `pypi_repo_utils` for more ergonomic handling of Python in repo context. - Split the resolution of requirements files to platforms into a separate function to make the testing easier. This also allows more validation that was realized that there is a need for in the WIP feature PR. - Make the code more robust about the assumption of the target platform label. Work towards #260, #1105, #1868.

…2068) This is just a small PR to reduce the scope of #2059. This just moves some code from one python file to a separate one. Work towards #260, #1105, #1868.

This also changes the local_runtime_repo to explicitly check for supported platforms instead of relying on a `None` value returned by the helper method. This makes the behaviour exactly the same to the behaviour before this PR and we can potentially drop the need for the validation in the future if our local_runtime detection is more robust. This also makes the platform detectino in `pypi_repo_utils` not depend on `uname` and only use the `repository_ctx`. Apparently the `module_ctx.watch` throws an error if one attempts to watch files on the system (this is left for `repository_rule` it seems and one can only do `module_ctx.watch` on files within the current workspace. This was surprising, but could have been worked around by just unifying code. This splits out things from bazelbuild#2059 and makes the code more succinct. Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.

This also changes the local_runtime_repo to explicitly check for supported platforms instead of relying on a `None` value returned by the helper method. This makes the behaviour exactly the same to the behaviour before this PR and we can potentially drop the need for the validation in the future if our local_runtime detection is more robust. This also makes the platform detectino in `pypi_repo_utils` not depend on `uname` and only use the `repository_ctx`. Apparently the `module_ctx.watch` throws an error if one attempts to watch files on the system (this is left for `repository_rule` it seems and one can only do `module_ctx.watch` on files within the current workspace. This was surprising, but could have been worked around by just unifying code. This splits out things from #2059 and makes the code more succinct. Work towards #260, #1105, #1868.

Before this change the `all_requirements` and related constants will have packages that need to be installed only on specific platforms and will mean that users relying on those constants (e.g. `gazelle`) will need to do extra work to exclude platform-specific packages. The package managers that that support outputting such files now include `uv` and `pdm`. This might be also useful in cases where we attempt to handle non-requirements lock files. Note, that the best way to handle this would be to move all of the requirements parsing code to Python, but that may cause regressions as it is a much bigger change. This is only changing the code so that we are doing extra processing for the requirement lines that have env markers. The lines that have no markers will not see any change in the code execution paths and the python interpreter will not be downloaded. We also use the `*_ctx.watch` API where available to correctly re-evaluate the markers if the `packaging` Python sources for this change. Extra changes that are included in this PR: - Extend the `repo_utils` to have a method for `arch` getting from the `ctx`. - Change the `local_runtime_repo` to perform the validation not relying on the implementation detail of the `get_platforms_os_name`. - Add `$(UV)` make variable for the `uv:current_toolchain` so that we can generate the requirements for `sphinx` using `uv`. - Swap the requirement generation using `genrule` and `uv` for `sphinx` and co so that we can test the `requirement` marker code. Note, the `requirement` markers are not working well with the `requirement_cycles`. Fixes #1105. Fixes #1868. Work towards #260, #1975. Related #1663. --------- Co-authored-by: Richard Levasseur <rlevasseur@google.com>

…thon Before this PR the lockfile would become platform dependent when the `requirements` file would have env markers. This was not caught because we do not have MODULE.bazel.lock checked into the `rules_python` repository because the CI is running against many versions and the lock file is different, therefor we would not be able to run with `bazel build --lockfile_mode=error`. With this change we use the label to `BUILD.bazel` which is living next to the `python` symlink and since the `BUILD.bazel` is the same on all platforms, the lockfile will remain the same. Work towards bazelbuild#1105, bazelbuild#1868.

…thon (#2135) Before this PR the lockfile would become platform dependent when the `requirements` file would have env markers. This was not caught because we do not have MODULE.bazel.lock checked into the `rules_python` repository because the CI is running against many versions and the lock file is different, therefore we would not be able to run with `bazel build --lockfile_mode=error`. With this change we use the label to `BUILD.bazel` which is living next to the `python` symlink and since the `BUILD.bazel` is the same on all platforms, the lockfile will remain the same. Summary * refactor(uv): create a reusable macro for using uv for locking reqs. * test(bzlmod): enable testing the MODULE.bazel.lock breakage across platforms. * test(bzlmod): use a universal requirements file for 3.9. This breaks the CI, because the python interpreter file hash is added to the lock file. * fix(bzlmod): keep the lockfile platform independent when resolving python Fixes #1105 and #1868 for real this time. Implements an additional helper for #1975.

aignas mentioned this issue Mar 13, 2023

fix: update correct requirements lock file when using os specific lock files #1123

Merged

github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Mar 24, 2024

github-actions bot closed this as completed Apr 24, 2024

aignas reopened this Jul 12, 2024

aignas mentioned this issue Jul 12, 2024

feat(pypi): support env markers in requirements files #2059

Merged

github-actions bot removed the Can Close? Will close in 30 days if there is no new activity label Jul 12, 2024

aignas added a commit to aignas/rules_python that referenced this issue Jul 16, 2024

refactor(pypi): split out code for env marker evaluation for reuse

87fd3f5

This is just a small PR to reduce the scope of bazelbuild#2059. Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.

This was referenced Jul 16, 2024

refactor(pypi): split out code for env marker evaluation for reuse #2068

Merged

refactor(pypi): split out more utils and start passing 'abi_os_arch' around #2069

Merged

aignas mentioned this issue Jul 18, 2024

refactor(internal): move the os/arch detection to repo_utils #2075

Merged

aignas self-assigned this Jul 23, 2024

rickeylev closed this as completed in #2059 Aug 15, 2024

aignas mentioned this issue Aug 20, 2024

fix(bzlmod): keep the lockfile platform independent when resolving python #2135

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pip_parse does not respect requirement markers #1105

pip_parse does not respect requirement markers #1105

pip_parse does not respect requirement markers #1105

pip_parse does not respect requirement markers #1105

Comments

🐞 bug report

Affected Rule

Is this a regression?

Description

🔬 Minimal Reproduction

🔥 Exception or Error