IDEAS home Printed from https://ideas.repec.org/p/azt/cemmap/61-17.html

Generic machine learning inference on heterogenous treatment effects in randomized experiments

Author

Listed:
  • Victor Chernozhukov
  • Mert Demirer
  • Esther Duflo
  • Ivan Fernandez-Val
Abstract
We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of most and least impacted units. The approach is valid in high-dimensional settings, where the effects are proxied by machine learning methods. We post-process these proxies into estimates of the key features. Our approach is generic: it can be used in conjunction with penalized methods, deep and shallow neural networks, canonical and new random forests, boosted trees, and ensemble methods. Our approach is agnostic and does not make unrealistic or hard-to-check assumptions; we don’t require conditions for consistency of the ML methods. Estimation and inference rely on repeated data splitting to avoid overfitting and achieve validity. For inference, we take medians of p-values and medians of confidence intervals, resulting from many different data splits, and then adjust their nominal level to guarantee uniform validity. This variational inference method is shown to be uniformly valid and quantifies the uncertainty coming from both parameter estimation and data splitting. The inference method could be of substantial independent interest in many machine learning applications. An empirical application to the impact of micro-credit on economic development illustrates the use of the approach in randomized experiments. An additional application to the impact of gender discrimination on wages illustrates the potential use of the approach in observational studies, where machine learning methods can be used to condition flexibly on very high-dimensional controls.
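The split-and-aggregate step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The simulated experiment, the difference-in-means estimator, and the function names `split_estimate` and `median_aggregate` are hypothetical stand-ins, and no machine learning proxy is fitted. The specific level adjustment used here (doubling the median p-value, and building each split's interval at level 1 - alpha/2) is an assumption about what "adjust their nominal level" means; the abstract does not state the factor.

```python
import numpy as np
from statistics import NormalDist


def split_estimate(y, d, idx, alpha):
    """Difference in means on subsample idx, with a normal-approximation
    two-sided p-value and a (1 - alpha) confidence interval."""
    y1 = y[idx][d[idx] == 1]
    y0 = y[idx][d[idx] == 0]
    est = y1.mean() - y0.mean()
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = 2 * (1 - NormalDist().cdf(abs(est / se)))
    return p, est - z * se, est + z * se


def median_aggregate(p_values, lowers, uppers):
    """Median over splits; the p-value is doubled (assumed adjustment)."""
    adjusted_p = min(1.0, 2.0 * float(np.median(p_values)))
    return adjusted_p, float(np.median(lowers)), float(np.median(uppers))


# Simulated randomized experiment (hypothetical data, true effect = 1.0).
rng = np.random.default_rng(0)
n, alpha, n_splits = 2000, 0.05, 100
d = rng.integers(0, 2, size=n)
y = 1.0 * d + rng.normal(size=n)

results = []
for _ in range(n_splits):
    main = rng.permutation(n)[: n // 2]  # random half used for inference
    # Each split-specific interval is built at level 1 - alpha/2 so that the
    # median interval targets level 1 - alpha (assumed adjustment).
    results.append(split_estimate(y, d, main, alpha / 2))

p_vals, lowers, uppers = map(np.array, zip(*results))
adj_p, lo, hi = median_aggregate(p_vals, lowers, uppers)
print(f"adjusted p-value: {adj_p:.4f}, interval: [{lo:.3f}, {hi:.3f}]")
```

Taking medians makes the reported p-value and interval insensitive to any single random split, and the more conservative level is the price paid for accounting for the extra randomness that splitting introduces.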

Suggested Citation

  • Victor Chernozhukov & Mert Demirer & Esther Duflo & Ivan Fernandez-Val, 2017. "Generic machine learning inference on heterogenous treatment effects in randomized experiments," CeMMAP working papers 61/17, Institute for Fiscal Studies.
  • Handle: RePEc:azt:cemmap:61/17
    DOI: 10.1920/wp.cem.2017.6117

    Download full text from publisher

    File URL: https://www.cemmap.ac.uk/wp-content/uploads/2020/08/CWP6117.pdf
    Download Restriction: no

    File URL: https://libkey.io/10.1920/wp.cem.2017.6117?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    References listed on IDEAS

    1. Victor Chernozhukov & Iván Fernández‐Val & Ye Luo, 2018. "The Sorted Effects Method: Discovering Heterogeneous Effects Beyond Their Averages," Econometrica, Econometric Society, vol. 86(6), pages 1911-1938, November.
    2. Meinshausen, Nicolai & Meier, Lukas & Bühlmann, Peter, 2009. "p-Values for High-Dimensional Regression," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1671-1681.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Nov 2024.
    4. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    5. Bruno Crépon & Florencia Devoto & Esther Duflo & William Parienté, 2015. "Estimating the Impact of Microcredit on Those Who Take It Up: Evidence from a Randomized Experiment in Morocco," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 123-150, January.
    6. Duflo, Esther & Glennerster, Rachel & Kremer, Michael, 2008. "Using Randomization in Development Economics Research: A Toolkit," Handbook of Development Economics, in: T. Paul Schultz & John A. Strauss (ed.), Handbook of Development Economics, edition 1, volume 4, chapter 61, pages 3895-3962, Elsevier.
    7. Alessandro Tarozzi & Jaikishan Desai & Kristin Johnson, 2015. "The Impacts of Microcredit: Evidence from Ethiopia," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 54-89, January.
    8. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    9. V. Chernozhukov & I. Fernández-Val & A. Galichon, 2009. "Improving point and interval estimators of monotone functions by rearrangement," Biometrika, Biometrika Trust, vol. 96(3), pages 559-575.
    10. Abhijit Banerjee & Esther Duflo & Rachel Glennerster & Cynthia Kinnan, 2015. "The Miracle of Microfinance? Evidence from a Randomized Evaluation," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 22-53, January.
    11. Jonathan M.V. Davis & Sara B. Heller, 2017. "Rethinking the Benefits of Youth Employment Programs: The Heterogeneous Effects of Summer Jobs," NBER Working Papers 23443, National Bureau of Economic Research, Inc.
    12. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    13. Karlan, Dean S. & Zinman, Jonathan, 2009. "Expanding Microenterprise Credit Access: Using Randomized Supply Decisions to Estimate the Impacts in Manila," Center Discussion Papers 52600, Yale University, Economic Growth Center.
    14. Alberto Abadie, 2005. "Semiparametric Difference-in-Differences Estimators," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 72(1), pages 1-19.
    15. James Albrecht & Anders Bjorklund & Susan Vroman, 2003. "Is There a Glass Ceiling in Sweden?," Journal of Labor Economics, University of Chicago Press, vol. 21(1), pages 145-177, January.
    16. Francine D. Blau & Lawrence M. Kahn, 2017. "The Gender Wage Gap: Extent, Trends, and Explanations," Journal of Economic Literature, American Economic Association, vol. 55(3), pages 789-865, September.
    17. Keisuke Hirano & Guido W. Imbens & Geert Ridder, 2003. "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," Econometrica, Econometric Society, vol. 71(4), pages 1161-1189, July.
    18. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression models," CeMMAP working papers CWP24/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    19. Christian Hansen & Damian Kozbur & Sanjog Misra, 2016. "Targeted undersmoothing," ECON - Working Papers 282, Department of Economics - University of Zurich, revised Apr 2018.
    20. V. Chernozhukov & I. Fernández-Val & A. Galichon, 2009. "Improving point and interval estimators of monotone functions by rearrangement," Biometrika, Biometrika Trust, vol. 96(3), pages 559-575.
    21. Abhijit Vinayak Banerjee, 2013. "Microcredit Under the Microscope: What Have We Learned in the Past Two Decades, and What Do We Need to Know?," Annual Review of Economics, Annual Reviews, vol. 5(1), pages 487-519, May.
    22. Britta Augsburg & Ralph De Haas & Heike Harmgart & Costas Meghir, 2012. "Microfinance, Poverty and Education," IFS Working Papers W12/15, Institute for Fiscal Studies.
    23. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, September.
    24. Manuela Angelucci & Dean Karlan & Jonathan Zinman, 2015. "Microcredit Impacts: Evidence from a Randomized Microcredit Program Placement Experiment by Compartamos Banco," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 151-182, January.
    25. Karlan, Dean & Zinman, Jonathan, 2009. "Expanding Microenterprise Credit Access: Randomized Supply Decisions to Estimate the Impacts in Manila," Working Papers 68, Yale University, Department of Economics.
    26. Orazio Attanasio & Britta Augsburg & Ralph De Haas & Emla Fitzsimons & Heike Harmgart, 2015. "The Impacts of Microfinance: Evidence from Joint-Liability Lending in Mongolia," American Economic Journal: Applied Economics, American Economic Association, vol. 7(1), pages 90-122, January.

    Citations



    Cited by:

    1. Stephen Coussens & Jann Spiess, 2021. "Improving Inference from Simple Instruments through Compliance Estimation," Papers 2108.03726, arXiv.org.
    2. Matias D. Cattaneo & Max H. Farrell & Yingjie Feng, 2018. "Large Sample Properties of Partitioning-Based Series Estimators," Papers 1804.04916, arXiv.org, revised Jun 2019.
    3. Michael C Knaus, 2022. "Double machine learning-based programme evaluation under unconfoundedness [Econometric methods for program evaluation]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 602-627.
    4. Daniel J. Lewis & Davide Melcangi & Laura Pilossoph, 2019. "Latent Heterogeneity in the Marginal Propensity to Consume," Staff Reports 902, Federal Reserve Bank of New York.
    5. Bluwstein, Kristina & Buckmann, Marcus & Joseph, Andreas & Kapadia, Sujit & Şimşek, Özgür, 2023. "Credit growth, the yield curve and financial crisis prediction: Evidence from a machine learning approach," Journal of International Economics, Elsevier, vol. 145(C).
    6. O'Neill, E. & Weeks, M., 2018. "Causal Tree Estimation of Heterogeneous Household Response to Time-Of-Use Electricity Pricing Schemes," Cambridge Working Papers in Economics 1865, Faculty of Economics, University of Cambridge.
    7. Vira Semenova, 2020. "Generalized Lee Bounds," Papers 2008.12720, arXiv.org, revised Feb 2023.
    8. Armand, Alex & Augsburg, Britta & Bancalari, Antonella & Ghatak, Maitreesh, 2023. "Public Service Delivery, Exclusion and Externalities: Theory and Experimental Evidence from India," CEPR Discussion Papers 18636, C.E.P.R. Discussion Papers.
    9. Alex Armand & Britta Augsburg & Antonella Bancalari, 2021. "Coordination and the poor maintenance trap: an experiment on public infrastructure in India," IFS Working Papers W21/16, Institute for Fiscal Studies.
    10. Paul B. Ellickson & Wreetabrata Kar & James C. Reeder, 2023. "Estimating Marketing Component Effects: Double Machine Learning from Targeted Digital Promotions," Marketing Science, INFORMS, vol. 42(4), pages 704-728, July.
    11. Riccardo Di Francesco, 2024. "Aggregation Trees," Papers 2410.11408, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Victor Chernozhukov & Mert Demirer & Esther Duflo & Iván Fernández-Val, 2017. "Fisher-Schultz Lecture: Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments, with an Application to Immunization in India," Papers 1712.04802, arXiv.org, revised Oct 2023.
    2. Pedro Carneiro & Sokbae Lee & Daniel Wilhelm, 2020. "Optimal data collection for randomized control trials," The Econometrics Journal, Royal Economic Society, vol. 23(1), pages 1-31.
    3. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Jonathan Fu & Annette Krauss, 2024. "Preparing fertile ground: how does the quality of business environments affect MSE growth?," Small Business Economics, Springer, vol. 63(1), pages 51-103, June.
    5. Nusrat Abedin Jimi & Plamen V. Nikolov & Mohammad Abdul Malek & Subal Kumbhakar, 2019. "The effects of access to credit on productivity: separating technological changes from changes in technical efficiency," Journal of Productivity Analysis, Springer, vol. 52(1), pages 37-55, December.
    6. Fumagalli, Laura & Martin, Thomas, 2023. "Child labor among farm households in Mozambique and the role of reciprocal adult labor," World Development, Elsevier, vol. 161(C).
    7. Lota Tamini & Ibrahima Bocoum & Ghislain Auger & Kotchikpa Gabriel Lawin & Arahama Traoré, 2019. "Enhanced Microfinance Services and Agricultural Best Management Practices: What Benefits for Smallholders Farmers? An Evidence from Burkina Faso," CIRANO Working Papers 2019s-11, CIRANO.
    8. Nusrat Abedin Jimi & Plamen Nikolov & Mohammad Abdul Malek & Subal Kumbhakar, 2020. "The Effects of Access to Credit on Productivity Among Microenterprises: Separating Technological Changes from Changes in Technical Efficiency," Papers 2006.03650, arXiv.org.
    9. Abhijit Banerjee & Emily Breza & Esther Duflo & Cynthia Kinnan, 2019. "Can Microfinance Unlock a Poverty Trap for Some Entrepreneurs?," NBER Working Papers 26346, National Bureau of Economic Research, Inc.
    10. Dammert, Ana C. & de Hoop, Jacobus & Mvukiyehe, Eric & Rosati, Furio C., 2018. "Effects of public policy on child labor: Current knowledge, gaps, and implications for program design," World Development, Elsevier, vol. 110(C), pages 104-123.
    11. Ahlin, Christian & Gulesci, Selim & Madestam, Andreas & Stryjan, Miri, 2020. "Loan contract structure and adverse selection: Survey evidence from Uganda," Journal of Economic Behavior & Organization, Elsevier, vol. 172(C), pages 180-195.
    12. Kersten, Renate & Harms, Job & Liket, Kellie & Maas, Karen, 2017. "Small Firms, large Impact? A systematic review of the SME Finance Literature," World Development, Elsevier, vol. 97(C), pages 330-348.
    13. Meager, Rachael & Sturdy, Jennifer, 2017. "Aggregating Distributional Treatment Effects: A Bayesian Hierarchical Analysis of the Microcredit Literature," MetaArXiv 7tkvm, Center for Open Science.
    14. Rachael Meager, 2015. "Understanding the Impact of Microcredit Expansions: A Bayesian Hierarchical Analysis of 7 Randomised Experiments," Papers 1506.06669, arXiv.org, revised Jul 2016.
    15. Michael Zimmert & Michael Lechner, 2019. "Nonparametric estimation of causal heterogeneity under high-dimensional confounding," Papers 1908.08779, arXiv.org.
    16. Nakano, Yuko & Magezi, Eustadius F., 2020. "The impact of microcredit on agricultural technology adoption and productivity: Evidence from randomized control trial in Tanzania," World Development, Elsevier, vol. 133(C).
    17. Oriana Bandiera & Robin Burgess & Erika Deserranno & Ricardo Morel & Imran Rasul & Munshi Sulaiman & Jack Thiemel, 2022. "Microfinance and Diversification," Economica, London School of Economics and Political Science, vol. 89(S1), pages 239-275, June.
    18. Athey, Susan & Imbens, Guido W. & Metzger, Jonas & Munro, Evan, 2024. "Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations," Journal of Econometrics, Elsevier, vol. 240(2).
    19. Daniel Kandie & Khan Jahirul Islam, 2022. "A new era of microfinance: The digital microcredit and its impact on poverty," Journal of International Development, John Wiley & Sons, Ltd., vol. 34(3), pages 469-492, April.
    20. Aparajithan Venkateswaran & Anirudh Sankar & Arun G. Chandrasekhar & Tyler H. McCormick, 2024. "Robustly estimating heterogeneity in factorial data using Rashomon Partitions," Papers 2404.02141, arXiv.org, revised Aug 2024.

    More about this item

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • D14 - Microeconomics - - Household Behavior - - - Household Saving; Personal Finance
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • O16 - Economic Development, Innovation, Technological Change, and Growth - - Economic Development - - - Financial Markets; Saving and Capital Investment; Corporate Finance and Governance
