
Discussion: Bump minimum needed R version to R 4.0 across the ecosystem? #401

Open
IndrajeetPatil opened this issue May 11, 2024 · 68 comments · Fixed by #408

Comments

@IndrajeetPatil
Member
IndrajeetPatil commented May 11, 2024

I think this will be a good idea for a few reasons:

  • It has been 5 years since R 3.6 came out (2019-04-26); that's enough time for people to have upgraded to a newer R version.

  • An increasing number of dependencies are using the base pipe (introduced in R 4.1), which means the new versions of these dependencies can't even be installed on R 3.6. If you haven't noticed that, it's because I have been carving out exceptions for them in the workflows.

  • Very few of our test suites pass on R 3.6 (e.g. datawizard fails), and some don't even run (e.g. {see}, because the graphics engine of R 3.6 is not supported by vdiffr).

  • It's holding us back from using the base pipe, which both we (as developers) and our users have become used to.

All of this combined makes it quite a hurdle to keep supporting this R version. Even the tidyverse no longer supports it:

image
@bwiernik
Contributor

Let's do it!

@vincentarelbundock

Frankly, I don't see the benefit. Only Suggests packages require it, and carve outs for those are easy.

Over the past few months, I've heard from several users still on 3.6 because of (bad) institutional policies, and it would be a shame to leave them behind.

If there was a compelling computational or language reason, sure. But we can use base pipe in the website vignettes, without enforcing version 4.

That said, my view on this is not deeply held. It's not a super big deal, and I will of course support the majority.

@mattansb
Member

Generally I am pro native pipe (:

But

  • The tidyverse still does support 3.6 (dplyr even supports 3.5).

  • How would reverse depends/suggests/imports be affected? We want to allow devs to continuously rely on us...

@vincentarelbundock
> • How would reverse depends/suggests/imports be affected? We want to allow devs to continuously rely on us...

This would break marginaleffects and at least 3 or 4 packages that rely on marginaleffects, forcing several developers to update their packages.

Many of the packages that depend on insight allow earlier versions, and they would all need to be updated by their maintainers.

I would hate to ask people to work on this without giving them any tangible benefit.

Yeah, I think this is not a great move.

@IndrajeetPatil
Member Author

> The tidyverse still does support 3.6 (dplyr even supports 3.5).

Well, they don't check on those R versions in their CI. So, for all intents and purposes, they don't support it. They are just waiting for someone to complain (cf. tidyverse/purrr#1045) before the version is bumped.

That's not the user experience I want our users to have. This is why I have still retained R 3.6 checks in our CI. Without checking that the package works and tests pass in CI, we might as well support R 2.0 🤷

> We want to allow devs to continuously rely on us

That's not a compelling enough reason for me. There is too much lethargy among developers to bump R version because some cluster on some university's server might still be using it; guess what, they might still be using R 3.0.

Those users can continue to use the older versions of our packages from the archive. And waiting for 5 years is not being too aggressive. Matrix already has bumped to R > 4.4, emmeans to R > 4.1, etc. We have been quite conservative in this regard.


That said, if you want to continue to support R 3.6, then you should also make sure that all tests pass on this R version in our CI. Otherwise, we are just making empty promises.
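
One way to make that gap visible is to compare the minimum R version a package declares with the oldest version that CI actually exercises. A minimal sketch in base R, assuming a hypothetical package `mypkg` and an assumed oldest tested version of 4.0.0 (neither value is taken from the easystats repositories):

```r
# Hypothetical DESCRIPTION content, written to a temp file for illustration.
desc_file <- tempfile()
writeLines(
  c("Package: mypkg", "Version: 0.1.0", "Depends: R (>= 3.6)"),
  desc_file
)

# Pull the declared minimum R version out of the Depends field.
# The pattern is deliberately simple and assumes an "R (>= x.y)" entry exists.
depends <- read.dcf(desc_file, fields = "Depends")[[1]]
declared_min <- package_version(sub(".*R \\(>= *([0-9.]+)\\).*", "\\1", depends))

# Oldest R version run in CI (an assumed value for this sketch).
oldest_tested <- package_version("4.0.0")

# TRUE means the package declares support for versions that are never checked.
declared_but_untested <- declared_min < oldest_tested
```

With these placeholder values the flag comes out `TRUE`, which is the "empty promise" situation described above.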

@IndrajeetPatil
Member Author

> This would break marginaleffects and at least 3 or 4 packages that rely on marginaleffects, forcing several developers to update their packages.

@vincentarelbundock Can you clarify what you mean by "break" here? Because CRAN doesn't check packages on R 3.6.

@vincentarelbundock

Matrix is not bumped to 4.4. See the thread on R-devel. The number indicated on the CRAN website is the version of R in which Matrix was compiled.

@vincentarelbundock

> @vincentarelbundock Can you clarify what you mean by "break" here? Because CRAN doesn't check packages on R 3.6.

I mean that developers who import an easystats package will no longer be able to support the users they want to support, and will have to release a new package version indicating the new version requirements.

@IndrajeetPatil
Member Author

> Matrix is not bumped to 4.4. See the thread on R-devel. The number indicated on the CRAN website is the version of R in which Matrix was compiled.

Are you sure? Because the entire CI of easystats is currently in turmoil because of this issue:

  ! Could not solve package dependencies:
  * deps::.: Can't install dependency BayesFactor
  * BayesFactor: Can't install dependency MatrixModels
  * MatrixModels: Can't install dependency Matrix (>= 1.6-0)
  * Matrix: Needs R >= 4.5
  * Matrix: Needs R >= 4.4.0

@vincentarelbundock

See this thread and in particular MM's post (but others too): https://stat.ethz.ch/pipermail/r-devel/2024-April/083377.html

@IndrajeetPatil
Member Author

> I mean that developers who import an easystats package will no longer be able to support the users they want to support, and will have to release a new package version indicating the new version requirements.

Yes, but what if these developers wish to continue to support this user base for the next 5 years? How long are we supposed to indulge them? What about the maintenance cost and burden involved in supporting such legacy versions that probably not even 1% of the entire user base is still on?

We are basically condemning ourselves to not benefit from any improvements in R for at least 5–7 years.

@IndrajeetPatil
Member Author

> I would hate to ask people to work on this without giving them any tangible benefit.

The tangible benefits are improvements in the syntax of the language itself, no longer having to backport newer functions (and therefore reduced maintenance overhead), improvements in performance, etc.

@IndrajeetPatil
Member Author

This is exactly why I wanted us as an organization to come up with a policy for R version support, which we still haven't done. This means I always have to be the bad cop and bring up this topic, and I need to because I am the one primarily maintaining the CI infrastructure.

#295

If we come up with a policy and give developers using easystats some guarantees about how long we will support a given R version, we need not go through this routine every six months.

@vincentarelbundock

My argument is that there has been no "must have" feature added to R in a loooong time.

My view is that supporting old versions is essentially costless, and that we should keep supporting them "forever", or until a really compelling new feature is added to R that truly makes developers' lives considerably easier. I don't see any such new feature in the last 10 years or so.

(Again, we can use base pipe on the website.)

You ask "why shouldn't we support even older versions?" I answer "no" because I don't want to do that work. But status quo is free.

I really didn't mean for this to devolve into a long debate, so will let others discuss and decide.

@IndrajeetPatil
Member Author

> My view is that supporting old versions is essentially costless

No, it's not.

The longer we go on like this, the more R versions we need to check:

          - { os: ubuntu-latest, r: "devel" }
          - { os: ubuntu-latest, r: "release" }
          - { os: ubuntu-latest, r: "oldrel-1" }
          - { os: ubuntu-latest, r: "oldrel-2" }
          - { os: ubuntu-latest, r: "oldrel-3" }
          - ... # other R versions
          - { os: ubuntu-latest, r: "3.6" }

It is an enormous effort to make sure ten different R packages continue to work on legacy R versions. This is because of the breaking changes that are introduced in every major or minor release (e.g., change in serialization format, change in random number generation, stringsAsFactors behaviour, etc.), and tests need to be adjusted for the before and after behaviour, or they at least need to be skipped appropriately. This is a ton of work.
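
To give a sense of what adjusting a test for the before and after behaviour involves, here is a minimal base R sketch for the `stringsAsFactors` change, whose default switched from `TRUE` to `FALSE` in R 4.0.0. The helper name `expected_column_class()` is made up for this illustration and is not part of any easystats package:

```r
# Expected class of a character column built by data.frame() with default
# settings: "factor" before R 4.0.0, "character" from R 4.0.0 onwards.
expected_column_class <- function(r_version = getRversion()) {
  if (as.package_version(r_version) < "4.0.0") "factor" else "character"
}

df <- data.frame(x = c("a", "b"))

# A version-aware check that holds on either side of the change.
class_matches <- identical(class(df$x), expected_column_class())
```

Every expectation that touches changed behaviour needs a branch like this (or a skip), multiplied across packages and R versions.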

@IndrajeetPatil
Member Author
IndrajeetPatil commented May 11, 2024

Yeah, I am convinced it's a bad idea to continue to support R < 4.0, and manually maintain this freakshow for years to come:

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          pak-version: devel
          upgrade: "TRUE"
          cache-version: 8
          extra-packages: |
            any::rcmdcheck
            any::BH
            any::RcppEigen
            BayesFactor=?ignore-before-r=100.0.0
            car=?ignore-before-r=100.0.0
            Matrix=?ignore-before-r=100.0.0
            MatrixModels=?ignore-before-r=100.0.0
            lme4=?ignore-before-r=100.0.0
            quantreg=?ignore-before-r=100.0.0
            TMB=?ignore-before-r=100.0.0
            ivprobit=?ignore-before-r=100.0.0
            mhurdle=?ignore-before-r=100.0.0
            brms=?ignore-before-r=4.3.0
            estimability=?ignore-before-r=4.3.0
            effects=?ignore-before-r=4.3.0
            nestedLogit=?ignore-before-r=4.3.0
            FactoMineR=?ignore-before-r=4.3.0
            factoextra=?ignore-before-r=4.3.0
            emmeans=?ignore-before-r=4.3.0
            bayesQR=?ignore-before-r=4.2.0
            MuMIn=?ignore-before-r=4.2.0
            ape=?ignore-before-r=4.1.0
            car=?ignore-before-r=4.1.0
            drc=?ignore-before-r=4.1.0
            EGAnet=?ignore-before-r=4.1.0
            ggpubr=?ignore-before-r=4.1.0
            Hmisc=?ignore-before-r=4.1.0
            mediation=?ignore-before-r=4.1.0
            rstatix=?ignore-before-r=4.1.0
            PROreg=?ignore-before-r=4.1.0
            rmcorr=?ignore-before-r=4.1.0
            rms=?ignore-before-r=4.1.0
            randomForest=?ignore-before-r=4.1.0
            pbkrtest=?ignore-before-r=4.1.0
            afex=?ignore-before-r=4.1.0
            car=?ignore-before-r=4.1.0
            ICS=?ignore-before-r=4.1.0
            ivreg=?ignore-before-r=4.1.0
            AER=?ignore-before-r=4.1.0
            WRS2=?ignore-before-r=4.1.0
            tinytable=?ignore-before-r=4.1.0
            survey=?ignore-before-r=4.1.0
            epiR=?ignore-before-r=4.0.0
            energy=?ignore-before-r=4.0.0
            gam=?ignore-before-r=4.0.0
            gdtools=?ignore-before-r=4.0.0
            flextable=?ignore-before-r=4.0.0
            ftExtra=?ignore-before-r=4.0.0
            gsl=?ignore-before-r=4.0.0
            fastICA=?ignore-before-r=4.0.0
            metafor=?ignore-before-r=4.0.0
            metadat=?ignore-before-r=4.0.0
            metaplus=?ignore-before-r=4.0.0
            multgee=?ignore-before-r=4.0.0
            panelr=?ignore-before-r=4.0.0
            ICSOutlier=?ignore-before-r=4.0.0
            multimode=?ignore-before-r=4.0.0
            jtools=?ignore-before-r=4.0.0
            mmrm=?ignore-before-r=4.0.0
            rsvd=?ignore-before-r=4.0.0
            sparsepca=?ignore-before-r=4.0.0
            qqconf=?ignore-before-r=4.0.0
            qqplotr=?ignore-before-r=4.0.0
            rtdists=?ignore-before-r=4.0.0
            VGAM=?ignore-before-r=4.0.0
            ggside=?ignore-before-r=4.0.0
          needs: check

@mattansb
Member

> My view is that supporting old versions is essentially costless
>
> No, it's not.

I agree with this.

I still think we should give a proper heads-up to maintainers of packages that use easystats about this change.

(FWIW @IndrajeetPatil I don't think you're the bad cop - you're the righteous cop!)

@IndrajeetPatil
Member Author

> I still think we should give a proper heads-up to maintainers of packages that use easystats about this change.

Having a stated policy about supported R versions on the website is the heads-up. This is also what tidyverse does: they don't email every maintainer before they bump R versions; they just need to respect their policy.

@mattansb
Member

🤷‍♂️

Let's do it then!

@IndrajeetPatil
Member Author

We've been trying to do it since Sept '22 😭

#295

@bwiernik
Contributor
bwiernik commented May 12, 2024

I agree. For precedent, tidyverse's declared policy is also to support the current version and 4 previous versions https://www.tidyverse.org/blog/2019/04/r-version-support/

So that would be currently back to R 4.0. While earlier versions may be supported on individual packages, they make no attempt to maintain support there and will bump the version up if an older version breaks anything.
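
Spelled out, the "current version plus 4 previous versions" rule is a simple window over the R minor release series. A minimal sketch, assuming the series list as of this comment (with R 4.4 as the current release); `supported_series()` is a hypothetical helper, not part of any package:

```r
# R minor release series, newest first, as of May 2024.
r_series <- c("4.4", "4.3", "4.2", "4.1", "4.0", "3.6", "3.5")

# Keep the current series plus `n_previous` older ones.
supported_series <- function(series_newest_first, n_previous = 4L) {
  head(series_newest_first, n_previous + 1L)
}

supported <- supported_series(r_series)
oldest_supported <- supported[length(supported)]
```

Under these assumptions the window ends at the 4.0 series, and it moves forward by one series with each new annual R release.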

@IndrajeetPatil
Member Author

The new version of {evaluate} requires R >= 4.0: https://cran.rstudio.com/web/packages/evaluate/index.html
This means that {knitr}, {rmarkdown}, and {testthat} all require R >= 4.0.

Since neither the vignette builder nor the testing framework is now available for R 3.6, there is no point in including it in CI.

I am removing R 3.6 from CI, and also bumping minimum required version in {see} to R 4.0. Other maintainers can decide for their repos.

@strengejacke
Member

What will the tidyverse packages do? All increase to 4.0?

IndrajeetPatil added a commit to easystats/workflows that referenced this issue Jun 11, 2024
@IndrajeetPatil
Member Author

They don't test (via CI) the oldest R version they claim to support (they test until oldrel-4, whichever that might be).

So nothing will fail for their CI; they might bump the minimum required R version when a user points it out or complains.

@strengejacke
Member

ok, so they haven't changed anything according to their support policies.

@IndrajeetPatil
Member Author

No, they don't immediately change their DESCRIPTION files to reflect the de facto minimum required R version.

@etiennebacher
Member

So basically our policy is now outdated, right?

image

We should support 3.6 but can't anymore due to this new requirement

@etiennebacher etiennebacher reopened this Jun 21, 2024
@etiennebacher
Member

This bump in version requirement in evaluate is actually quite disruptive, it forces everyone using testthat to use >= 4.0

@vincentarelbundock

> This bump in version requirement in evaluate is actually quite disruptive, it forces everyone using testthat to use >= 4.0

Time to switch to tinytest? (Half-joking)

@IndrajeetPatil
Member Author

Not just testthat, also knitr and rmarkdown.

I am not sure we need to abandon our policy yet. We could treat this as an exceptional event.

But this does reveal the fragility of our policy, since our development tooling is inherently tied to what the tidyverse and r-lib organizations support under their own policy.

@IndrajeetPatil
Member Author

> Time to switch to tinytest? (Half-joking)

I am open to this switch, but I would like to first have a rough estimate of how much work would be needed to transfer from the current to the proposed testing framework.

@vincentarelbundock

> I am open to this switch, but I would like to first have a rough estimate of how much work would be needed to transfer from the current to the proposed testing framework.

I guess I was more than "half" joking. It would almost certainly be way too much work.

I might be able to fund a student to do part of that work, but only if there's significant interest among easystats core devs. Otherwise, it's not worth it.

@strengejacke
Member

> This bump in version requirement in evaluate is actually quite disruptive, it forces everyone using testthat to use >= 4.0

It means that a package can only be tested down to R 4.0, but packages (may) still work on older versions. I'd say we still make our packages depend on R3.6, but just test from R 4.x onwards.

@bwiernik
Contributor

> It means that a package can only be tested down to R 4.0, but packages (may) still work on older versions. I'd say we still make our packages depend on R3.6, but just test from R 4.x onwards.

Agreed

@IndrajeetPatil
Member Author

> It means that a package can only be tested down to R 4.0, but packages (may) still work on older versions. I'd say we still make our packages depend on R3.6, but just test from R 4.x onwards.

If we are fine with not checking that tests and examples run on older R versions and that vignettes can be built, and only checking that the package can be installed, I don't think having an R-version support policy document makes sense.

We can drop the minimum required R version field from our DESCRIPTION files. I am fairly certain our packages can be installed on R versions older than R 3.6, like R 3.0 (e.g.). This way, we don't gatekeep anyone from using our packages, we just hope that it works as expected for them.

@bwiernik
Contributor

I don't necessarily agree with that. I think the policy doc saying "we provide support and guarantees for 5 versions" is useful and informative. When we learn about an error with an older version, we bump the minimum required and don't fix.

@strengejacke
Member

> I am fairly certain our packages can be installed on R versions older than R 3.6, like R 3.0 (e.g.).

Not if we use functions like isFALSE(), sigma() or str2lang(), right?
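
The point about newer base functions can be made concrete with `isFALSE()`, which was only added to base R in version 3.5.0. Keeping code working on older versions means carrying a fallback along these lines. This is a minimal sketch; the name `is_false` is a placeholder and does not refer to an existing easystats helper:

```r
# Use base::isFALSE() when the running R provides it; otherwise fall back to
# an equivalent local definition.
is_false <- if (exists("isFALSE", envir = baseenv(), inherits = FALSE)) {
  get("isFALSE", envir = baseenv())
} else {
  function(x) is.logical(x) && length(x) == 1L && !is.na(x) && !x
}
```

Without such a shim, a call to `isFALSE()` simply fails on R versions that predate it, whatever the DESCRIPTION file claims.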

@etiennebacher
Member

Some discussion going on here if you're interested: r-lib/evaluate#173

@IndrajeetPatil
Member Author

@strengejacke and @bwiernik I was being sardonic in my comment.

In this entire thread, I have kept arguing that it's a bad idea to keep supporting R versions that we don't test in any way. And yet, somehow, this exact thing has been suggested or proposed multiple times since this issue was created, regardless of how many ways I tried to underline the problems with this approach. I was showing the absurdity of this proposition by taking it to its logical conclusion (support every R version just for the heck of it).

At this point, I feel like a dog chasing its own tail, and therefore I am unfollowing this issue. Clearly, I have failed to make a convincing argument, and any further engagement here is just a waste of time.

I will be fine with whatever you guys decide, w.r.t. R version support policy.

P.S. This issue has also revealed how poor our governance structure is: we tried consensus-seeking, but clearly we need a benevolent dictator model if anything is to be decided.

@yihui
yihui commented Jun 25, 2024

I'm not sure I should chime in as a bystander... Anyway, I think at least everyone agrees that it sounds like a bad idea to support R versions that are not tested. Is it a truly bad idea? The answer is not 100% yes, because it really depends on which features you use from base R. Base R is known to be quite stable across versions (I'd even say it's stubbornly stable). Unless you must use new features that are available only in higher versions of R that are very hard to backport to lower versions, I tend to just let users decide the R version they want to use (also remember that not all users actually have such a choice), instead of forcing everyone to upgrade. As a developer, I do wish that everyone can upgrade everything, but I'd at most recommend so instead of forcing so.

I think it suffices to clarify in documentation that only a certain number of versions of R are actually tested. If users choose or have to use a lower version of R, they must understand that there is no warranty (well, open source software generally doesn't come with any warranty anyway). If it happens to work well, high five and congratulations. Both users and developers are happy. If not, users always have the freedom to file an issue for supporting a lower version of R, and on the other hand, developers have the freedom to say no if the support appears to be difficult. We just respect each other's freedom.

I don't use easystats myself, but I think it's a great design choice to keep it lightweight, so it won't be forced to change due to certain changes in dependencies.

murmur

BTW, I use an iPhone that is more than 9 years old. I can't express how annoying it is to see that few apps could be installed on my phone today. The vast majority of apps require at least iOS 13+, which is impossible for me to upgrade to. Essentially, Apple is forcing me to buy a new phone, which makes me feel very unpleasant, even though I'm not a heavy phone user. When did fast deprecation become the norm of our society? Mobile devices, kitchen appliances, software packages, none can last a few years.

P.S. Perhaps we can feel better because Python is worse: https://github.com/activepapers/activepapers-python

@gaborcsardi

One more thing to consider here is that while currently r-lib/actions (pak) still supports R 3.5.x, most likely this is going to change before R 4.5.0 is out and it'll only support R 4.0.x and later.

@DominiqueMakowski
Member
DominiqueMakowski commented Jun 25, 2024

Thanks @IndrajeetPatil for your tenacious attempts at improving the infrastructure

I still believe we should only support > 4.0 rather than wasting resources needlessly. I would further argue that there are 2 categories of users when it comes to this issue:

  • People that actively use and update easystats packages: they are used to updating various packages and likely R as well given that other packages (for which easystats provides the postprocessing) require it. For them, it won't be a problem
  • Systems that are not easily upgradable that had everything installed at some point in time and since then "it works" and it would be a pain to reinstall everything. For these users, if they already have an (old) easystats it works, and probably these systems are not interested in specifically updating only easystats anyway, so it won't be a problem either.

While we can only support officially > 4.0 (i.e., with tests etc.), we can still encourage users of older versions to eventually report bugs or issues as it might still be informative and positive to fix them

@hadley
hadley commented Jun 25, 2024

I don't fully understand the easystats ecosystem dependency structure, but currently easystats itself has 39 dependencies. I'd consider that 16 of them are likely covered by the tidyverse versioning policy (R6, farver, ggplot2, glue, gtable, isoband, lifecycle, magrittr, pillar, pkgconfig, scales, tibble, vctrs, withr, cli, rlang). That means the next release of any of those packages could potentially change the R dependency to 4.0.0 (in line with our stated policy), making easystats implicitly depend on R 4.0.0.
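
The "implicit dependency" point can be sketched as taking the maximum over the minimum R versions required by a package's dependencies. The dependency names and version numbers below are placeholders chosen for illustration, not the actual requirements of any real package:

```r
# Hypothetical minimum R versions declared by three dependencies.
dep_names <- c("depA", "depB", "depC")
dep_min_r <- package_version(c("3.5.0", "3.6.0", "4.0.0"))

# Minimum R version the package itself declares (also a placeholder).
declared_min <- package_version("3.6.0")

# The version users really need is the largest requirement in the tree.
effective_min <- max(c(declared_min, dep_min_r))

# Dependencies that silently raise the floor above the declared minimum.
raising_deps <- dep_names[dep_min_r > declared_min]
```

As soon as one dependency bumps its requirement, the effective minimum moves with it, whether or not the package's own DESCRIPTION changes.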

Additionally, because the tidyverse version policy is a written policy that has been successful (at least in the sense of not generating complaints that we don't support even older versions of R), it has become the de facto standard whenever folks at Posit have needed a version policy. This means that, as Gabor stated, r-lib/actions might not support older versions of R, our commercial products might not support older versions of R, we don't provide support for older versions of R etc etc.

So I think you'll find trying to support older versions of R to be a lot of extra effort. And personally, I think you'd get more benefit from spending that time fixing bugs and adding new features, rather than continuing to support versions of R used by very few people.

@strengejacke
Member
strengejacke commented Jun 25, 2024

You can roughly split our package ecosystem into two categories. First, the "core" packages, which are used internally, but also by an increasing number of external packages. For these packages, we follow a "light-weight" policy, meaning that only R core packages are required to run them. Second, we have a few packages (especially see) that depend on more packages, especially ggplot2 (see also https://easystats.github.io/easystats/#dependencies). That's why most of the easystats packages should run on R 3.6 (or even older?), independent of other R packages.

There are two reasons why we have this discussion:

  1. Some external packages that probably have a larger userbase (effects, marginaleffects, modelsummary) import our packages, in particular insight, and those packages depend on R 3.6. Therefore, it makes sense to allow our packages to continue to work on R 3.6 as well.

  2. Two polls (one on Twitter, one on LinkedIn, both from different accounts) yielded a share of more than 15% (up to 20%, I think) of respondents who are forced to use R 3.6. Given that these polls are not representative, and being optimistic, we could still have a potential userbase of 5-10% who need to use R 3.6, which we think is too many to simply ignore.

> So I think you'll find trying to support older versions of R to be a lot of extra effort.

Only if we really "maintain" and support it. The idea is to "just let it work" for R 3.6, but maintain/support from R 4+. Since we rely on R core packages anyway, we wouldn't expect too much trouble for users who use R 3.6 (unless we start using native pipes in our codebase, which we don't). People who use R 3.6 can use our packages "at their own risk", but support is only provided for R 4+. I don't think this adds any relevant extra effort.

@hadley
hadley commented Jun 25, 2024

I guess the main difference in our philosophy comes down to this "just let it work". I'd prefer to provide an explicit, enforced, guarantee that tidyverse code works for the stated versions of R, and if you want to use older versions of R you need to use older versions of packages. (I also feel that there is some duty of the tidyverse to encourage people to keep relatively current with R versions especially since R core doesn't release patches for older versions of R.)

(BTW are the arrows in that dependencies diagram the wrong way around? e.g. see imports ggplot2, not the other way around.)

@strengejacke
Member

> (BTW are the arrows in that dependencies diagram the wrong way around? e.g. see imports ggplot2, not the other way around.)

We could file an issue at the related repo ;-) I guess the idea was that the arrows point in the direction of the imports? (i.e. ggplot "goes to" see, i.e. "is imported by")
