Printed from https://ideas.repec.org/p/arx/papers/2403.14385.html

Estimating Causal Effects with Double Machine Learning -- A Method Evaluation

Authors
  • Jonathan Fuhr
  • Philipp Berens
  • Dominik Papies
Abstract
The estimation of causal effects with observational data continues to be a very active research area. In recent years, researchers have developed new frameworks which use machine learning to relax classical assumptions necessary for the estimation of causal effects. In this paper, we review one of the most prominent methods - "double/debiased machine learning" (DML) - and empirically evaluate it by comparing its performance on simulated data relative to more traditional statistical methods, before applying it to real-world data. Our findings indicate that the application of a suitably flexible machine learning algorithm within DML improves the adjustment for various nonlinear confounding relationships. This advantage enables a departure from traditional functional form assumptions typically necessary in causal effect estimation. However, we demonstrate that the method continues to critically depend on standard assumptions about causal structure and identification. When estimating the effects of air pollution on housing prices in our application, we find that DML estimates are consistently larger than estimates of less flexible methods. From our overall results, we provide actionable recommendations for specific choices researchers must make when applying DML in practice.
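
The partialling-out idea behind DML that the abstract refers to can be sketched in a few lines. The example below is an illustrative sketch only, not the authors' simulation design or implementation: the data-generating process, the bin-mean learner standing in for a "suitably flexible machine learning algorithm", and all function names (simulate, fit_bin_means, dml_plr, ols_slope) are assumptions made for this illustration. It assumes a partially linear model Y = theta*D + g(X) + e with D = m(X) + v, a single observed confounder X, and no unobserved confounding.

```python
import random


def simulate(n, theta, seed):
    """Hypothetical data: X confounds D and Y through a nonlinear (quadratic) term."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    d = [xi ** 2 + rng.gauss(0.0, 1.0) for xi in x]
    y = [theta * di + 2.0 * xi ** 2 + rng.gauss(0.0, 1.0) for xi, di in zip(x, d)]
    return x, d, y


def fit_bin_means(x, t, lo, hi, n_bins):
    """Piecewise-constant regression of t on x (a simple stand-in for an ML learner)."""
    sums, counts = [0.0] * n_bins, [0] * n_bins

    def bin_of(xi):
        return max(0, min(int((xi - lo) / (hi - lo) * n_bins), n_bins - 1))

    for xi, ti in zip(x, t):
        sums[bin_of(xi)] += ti
        counts[bin_of(xi)] += 1
    overall = sum(t) / len(t)  # fallback prediction for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda xi: means[bin_of(xi)]


def ols_slope(d, y):
    """Slope of a simple regression of y on d with an intercept."""
    md, my = sum(d) / len(d), sum(y) / len(y)
    num = sum((di - md) * (yi - my) for di, yi in zip(d, y))
    return num / sum((di - md) ** 2 for di in d)


def dml_plr(x, d, y, n_folds=5, n_bins=40, lo=-2.0, hi=2.0):
    """Cross-fitted partialling-out estimate of theta in the partially linear model."""
    n = len(y)
    y_res, d_res = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        # Observations are i.i.d. here, so index modulo n_folds gives valid folds.
        train = [i for i in range(n) if i % n_folds != k]
        x_tr = [x[i] for i in train]
        pred_y = fit_bin_means(x_tr, [y[i] for i in train], lo, hi, n_bins)
        pred_d = fit_bin_means(x_tr, [d[i] for i in train], lo, hi, n_bins)
        # Residualize only the held-out fold, using models fit on the other folds.
        for i in range(k, n, n_folds):
            y_res[i] = y[i] - pred_y(x[i])
            d_res[i] = d[i] - pred_d(x[i])
    # Final stage: regress outcome residuals on treatment residuals (no intercept).
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)


THETA = 0.5
x, d, y = simulate(4000, THETA, seed=0)
theta_unadjusted = ols_slope(d, y)  # ignores X entirely
theta_dml = dml_plr(x, d, y)
print(f"true effect:         {THETA:.3f}")
print(f"unadjusted estimate: {theta_unadjusted:.3f}")
print(f"DML estimate:        {theta_dml:.3f}")
```

In this sketch the unadjusted regression absorbs the quadratic confounding into the treatment coefficient, while the cross-fitted residual-on-residual regression should land close to the assumed true effect. The result still rests on X being the only confounder, which mirrors the abstract's caveat that DML depends on standard assumptions about causal structure and identification.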

Suggested Citation

  • Jonathan Fuhr & Philipp Berens & Dominik Papies, 2024. "Estimating Causal Effects with Double Machine Learning -- A Method Evaluation," Papers 2403.14385, arXiv.org, revised Apr 2024.
  • Handle: RePEc:arx:papers:2403.14385

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2403.14385
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
    2. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    3. Bilancini, Ennio & Boncinelli, Leonardo & Di Paolo, Roberto & Menicagli, Dario & Pizziol, Veronica & Ricciardi, Emiliano & Serti, Francesco, 2022. "Prosocial behavior in emergencies: Evidence from blood donors recruitment and retention during the COVID-19 pandemic," Social Science & Medicine, Elsevier, vol. 314(C).
    4. Harold D. Chiang & Kengo Kato & Yukun Ma & Yuya Sasaki, 2022. "Multiway Cluster Robust Double/Debiased Machine Learning," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1046-1056, June.
    5. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    6. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    7. Harrison, David Jr. & Rubinfeld, Daniel L., 1978. "Hedonic housing prices and the demand for clean air," Journal of Environmental Economics and Management, Elsevier, vol. 5(1), pages 81-102, March.
    8. Robinson, Peter M, 1988. "Root-N-Consistent Semiparametric Regression," Econometrica, Econometric Society, vol. 56(4), pages 931-954, July.
    9. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    10. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    11. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    12. Helmut Farbmacher & Martin Huber & Lukáš Lafférs & Henrika Langen & Martin Spindler, 2022. "Causal mediation analysis with double machine learning [Mediation analysis via potential outcomes models]," The Econometrics Journal, Royal Economic Society, vol. 25(2), pages 277-300.
    13. Jau-er Chen & Chien-Hsun Huang & Jia-Jyun Tien, 2021. "Debiased/Double Machine Learning for Instrumental Variable Quantile Regressions," Econometrics, MDPI, vol. 9(2), pages 1-18, April.
    14. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    15. Neng-Chieh Chang, 2020. "Double/debiased machine learning for difference-in-differences models," The Econometrics Journal, Royal Economic Society, vol. 23(2), pages 177-191.
    16. E. F. Beach, 1949. "The Use of Polynomials to Represent Cost Functions," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 16(3), pages 158-169.
    17. Hugo Bodory & Martin Huber & Lukáš Lafférs, 2022. "Evaluating (weighted) dynamic treatment effects by double machine learning [Identification of causal effects using instrumental variables]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 628-648.
    18. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    19. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, April.
    20. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    21. van der Laan Mark J. & Rubin Daniel, 2006. "Targeted Maximum Likelihood Learning," The International Journal of Biostatistics, De Gruyter, vol. 2(1), pages 1-40, December.
    22. O. Ashenfelter & D. Card (ed.), 1999. "Handbook of Labor Economics," Handbook of Labor Economics, Elsevier, edition 1, volume 3, number 3.
    23. Simon, Noah & Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2011. "Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 39(i05).
    24. Marra, Giampiero & Wood, Simon N., 2011. "Practical variable selection for generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2372-2387, July.
    25. Joshua D. Angrist & Jörn-Steffen Pischke, 2009. "Mostly Harmless Econometrics: An Empiricist's Companion," Economics Books, Princeton University Press, edition 1, number 8769.
    26. Molei Liu & Yi Zhang & Doudou Zhou, 2021. "Double/debiased machine learning for logistic partially linear model," The Econometrics Journal, Royal Economic Society, vol. 24(3), pages 559-588.
    27. Imbens, Guido W & Angrist, Joshua D, 1994. "Identification and Estimation of Local Average Treatment Effects," Econometrica, Econometric Society, vol. 62(2), pages 467-475, March.
    28. Yang, Jui-Chung & Chuang, Hui-Ching & Kuan, Chung-Ming, 2020. "Double machine learning with gradient boosting and its application to the Big N audit quality effect," Journal of Econometrics, Elsevier, vol. 216(1), pages 268-283.
    29. Azoulay, Pierre & Greenblatt, Wesley H. & Heggeness, Misty L., 2021. "Long-term effects from early exposure to research: Evidence from the NIH “Yellow Berets”," Research Policy, Elsevier, vol. 50(9).
    30. Nelson, Jon P., 1978. "Residential choice, hedonic prices, and the demand for urban air quality," Journal of Urban Economics, Elsevier, vol. 5(3), pages 357-369, July.
    31. Guido W. Imbens, 2004. "Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 4-29, February.
    32. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, September.
    33. Vira Semenova & Victor Chernozhukov, 2021. "Debiased machine learning of conditional average treatment effects and other causal functions," The Econometrics Journal, Royal Economic Society, vol. 24(2), pages 264-289.
    34. Brett R. Gordon & Robert Moakler & Florian Zettelmeyer, 2022. "Close Enough? A Large-Scale Exploration of Non-Experimental Approaches to Advertising Measurement," Papers 2201.07055, arXiv.org, revised Oct 2022.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael Lechner, 2023. "Causal Machine Learning and its use for public policy," Swiss Journal of Economics and Statistics, Springer;Swiss Society of Economics and Statistics, vol. 159(1), pages 1-15, December.
    2. Michael Lechner & Jana Mareckova, 2024. "Comprehensive Causal Machine Learning," Papers 2405.10198, arXiv.org.
    3. Jonathan Fuhr & Dominik Papies, 2024. "Double Machine Learning meets Panel Data -- Promises, Pitfalls, and Potential Solutions," Papers 2409.01266, arXiv.org.
    4. Michael C Knaus & Michael Lechner & Anthony Strittmatter, 2021. "Machine learning estimation of heterogeneous causal effects: Empirical Monte Carlo evidence," The Econometrics Journal, Royal Economic Society, vol. 24(1), pages 134-161.
    5. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    6. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    7. Martin Huber & Jannis Kueck, 2022. "Testing the identification of causal effects in observational data," Papers 2203.15890, arXiv.org, revised Jun 2023.
    8. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    9. Markku Maula & Wouter Stam, 2020. "Enhancing Rigor in Quantitative Entrepreneurship Research," Entrepreneurship Theory and Practice, vol. 44(6), pages 1059-1090, November.
    10. Vira Semenova, 2020. "Generalized Lee Bounds," Papers 2008.12720, arXiv.org, revised Feb 2023.
    11. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    12. Nora Bearth & Michael Lechner, 2024. "Causal Machine Learning for Moderation Effects," Papers 2401.08290, arXiv.org, revised Apr 2024.
    13. Michael Pollmann, 2020. "Causal Inference for Spatial Treatments," Papers 2011.00373, arXiv.org, revised Jan 2023.
    14. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692, arXiv.org.
    15. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Dennis Shen & Peng Ding & Jasjeet Sekhon & Bin Yu, 2022. "Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data," Papers 2207.14481, arXiv.org, revised Oct 2022.
    17. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Jun 2024.
    18. Victor Chernozhukov & Carlos Cinelli & Whitney Newey & Amit Sharma & Vasilis Syrgkanis, 2021. "Long Story Short: Omitted Variable Bias in Causal Machine Learning," Papers 2112.13398, arXiv.org, revised May 2024.
    19. Guido W. Imbens, 2020. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," Journal of Economic Literature, American Economic Association, vol. 58(4), pages 1129-1179, December.
    20. Huber, Martin & Meier, Jonas & Wallimann, Hannes, 2022. "Business analytics meets artificial intelligence: Assessing the demand effects of discounts on Swiss train tickets," Transportation Research Part B: Methodological, Elsevier, vol. 163(C), pages 22-39.
