BEAM-11275

Support GCS files for extra_requirements argument in Python Beam portable runners

Details

    • Type: Improvement
    • Status: Triage Needed
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.35.0
    • Component/s: sdk-py-core
    • Labels: None

    Description

      Currently, portable runners only support locally available files when staging dependencies for remote workers. This can be seen in https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/stager.py#L429, where the stager falls back to shutil.copyfile when it detects that the path is remote but not http. shutil.copyfile only reads from the local filesystem, so a GCS path cannot be staged this way.
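The failure mode described above can be reproduced with a minimal sketch. The functions `download_file` and `try_stage` below are hypothetical simplifications written for illustration, not Beam's actual code; they only assume that any non-http path ends up in `shutil.copyfile`, as the description states.

```python
import os
import shutil
import tempfile


def download_file(from_url, to_path):
    """Hypothetical simplification of the staging download step."""
    if from_url.startswith(("http://", "https://")):
        # The HTTP branch is out of scope for this sketch.
        raise NotImplementedError("HTTP download omitted from this sketch")
    # Every other path, including gs://..., is treated as a local file.
    shutil.copyfile(from_url, to_path)


def try_stage(path):
    """Return True if the file could be staged, False if the copy failed."""
    with tempfile.TemporaryDirectory() as tmp:
        try:
            download_file(path, os.path.join(tmp, "package.tar.gz"))
            return True
        except OSError:
            # shutil.copyfile cannot open a gs:// URL as a local file.
            return False
```

A local file stages fine, whereas a call such as `try_stage("gs://some-bucket/pkg.tar.gz")` returns False because no such local file exists.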

      An easy extension would be to extend _is_remote_path in Stager to detect whether the path matches any filesystem and, if it does, skip the download and let the file be copied afterwards.
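One way the proposed check could look is sketched below. This is an illustration under stated assumptions, not the change that was merged: `KNOWN_FILESYSTEM_SCHEMES`, `path_scheme`, and `staging_action` are hypothetical names, and a real implementation would presumably consult Beam's own filesystem registry rather than a hard-coded set.

```python
import re

# Hypothetical set of schemes that some filesystem can handle; a real
# implementation would query the SDK's filesystem registry instead.
KNOWN_FILESYSTEM_SCHEMES = {"gs", "s3", "hdfs"}

_SCHEME_RE = re.compile(r"^([a-zA-Z][a-zA-Z0-9+.-]*)://")


def path_scheme(path):
    """Return the lower-cased URL scheme of path, or None for local paths."""
    match = _SCHEME_RE.match(path)
    return match.group(1).lower() if match else None


def staging_action(path):
    """Decide how a dependency path should be staged (illustrative only)."""
    scheme = path_scheme(path)
    if scheme is None:
        return "stage_local_file"      # plain local path, copy as before
    if scheme in KNOWN_FILESYSTEM_SCHEMES:
        return "copy_via_filesystem"   # skip the download, copy afterwards
    if scheme in ("http", "https"):
        return "download_http"         # existing HTTP download branch
    return "unsupported"
```

Under these assumptions, `staging_action("gs://bucket/pkg.tar.gz")` selects the filesystem copy instead of falling through to a local file copy.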

      Acceptance criteria:

      • `extra_package` can be a GCS path instead of requiring it to be local only.

            People

              Assignee: Calvin Leung (calvinleungyk)
              Reporter: Gerard Casas Saez (gcasassaez)
              Votes: 1
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 18h 10m