Printed from https://ideas.repec.org/p/qed/wpaper/1428.html

Testing for the appropriate level of clustering in linear regression models

Authors

  • James G. MacKinnon (Queen's University)
  • Morten Ørregaard Nielsen (Queen's University and CREATES)
  • Matthew D. Webb (Carleton University)

Abstract
The overwhelming majority of empirical research that uses cluster-robust inference assumes that the clustering structure is known, even though there are often several possible ways in which a dataset could be clustered. We propose two tests for the correct level of clustering in regression models. One test focuses on inference about a single coefficient, and the other on inference about two or more coefficients. We provide both asymptotic and wild bootstrap implementations. The proposed tests work for a null hypothesis of either no clustering or "fine" clustering against alternatives of "coarser" clustering. We also propose a sequential testing procedure to determine the appropriate level of clustering. Simulations suggest that the bootstrap tests perform very well under the null hypothesis and can have excellent power. An empirical example suggests that using the tests leads to sensible inferences.
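To make the distinction between "fine" and "coarser" clustering concrete, here is a minimal sketch in Python with NumPy. It is not the authors' test. It only computes a standard cluster-robust variance estimator for the same OLS regression at two nested levels of clustering. The function name `cluster_robust_se`, the labels `classroom` and `school`, the simulated data, and the small-sample factor G/(G-1) × (n-1)/(n-k) are all assumptions made for illustration.

```python
import numpy as np

def cluster_robust_se(X, y, groups):
    """OLS estimates and cluster-robust standard errors for the given cluster labels.

    Illustrative helper (hypothetical name), using the common small-sample
    factor G/(G-1) * (n-1)/(n-k).
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    labels, idx = np.unique(groups, return_inverse=True)
    G = labels.size
    # Sum the score contributions x_i * u_i within each cluster.
    scores = np.zeros((G, k))
    np.add.at(scores, idx, X * resid[:, None])
    meat = scores.T @ scores
    adj = G / (G - 1) * (n - 1) / (n - k)
    V = adj * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))

# Simulated data (assumption): classrooms nested within schools,
# with a regressor and an error component that vary at the school level.
rng = np.random.default_rng(0)
n_school, rooms_per_school, n_per_room = 20, 5, 10
school = np.repeat(np.arange(n_school), rooms_per_school * n_per_room)
classroom = np.repeat(np.arange(n_school * rooms_per_school), n_per_room)
n = school.size
x = rng.normal(size=n_school)[school]
u = rng.normal(size=n_school)[school] + rng.normal(size=n)
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(n), x])

beta_fine, se_fine = cluster_robust_se(X, y, classroom)   # "fine" clustering
beta_coarse, se_coarse = cluster_robust_se(X, y, school)  # "coarser" clustering

print("slope estimate:        ", beta_fine[1])
print("SE, classroom clusters:", se_fine[1])
print("SE, school clusters:   ", se_coarse[1])
```

The coefficient estimates are identical in both calls; only the standard errors depend on the clustering level. When the errors are correlated within the coarser clusters, as in this simulation, the coarser standard error will typically be the larger of the two. The tests proposed in the paper address whether the finer level is adequate, which this informal comparison does not do.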

Suggested Citation

  • James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2022. "Testing for the appropriate level of clustering in linear regression models," Working Paper 1428, Economics Department, Queen's University.
  • Handle: RePEc:qed:wpaper:1428

    Download full text from publisher

    File URL: https://www.econ.queensu.ca/sites/econ.queensu.ca/files/wpaper/qed_wp_1428.pdf
    File Function: Third version 2022
    Download Restriction: no


    Citations


    Cited by:

    1. MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Cluster-robust inference: A guide to empirical practice," Journal of Econometrics, Elsevier, vol. 232(2), pages 272-299.
    2. Yong Cai, 2021. "Panel Data with Unknown Clusters," Papers 2106.05503, arXiv.org, revised Jan 2022.
    3. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2023. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust," Stata Journal, StataCorp LP, vol. 23(4), pages 942-982, December.
    4. Chaeho Chase Lee & Erdal Atukeren & Hohyun Kim, 2024. "Knowledge Capital and Stock Returns during Crises in the Manufacturing Sector: Moderating Role of Market Share, Tobin’s Q, and Cash Holdings," Risks, MDPI, vol. 12(6), pages 1-23, June.
    5. Paul Hufe, 2024. "The Parental Wage Gap and the Development of Socio-emotional Skills in Children," Working Papers 2024-010, Human Capital and Economic Opportunity Working Group.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Cluster-robust inference: A guide to empirical practice," Journal of Econometrics, Elsevier, vol. 232(2), pages 272-299.
    2. James G. MacKinnon & Matthew D. Webb, 2020. "When and How to Deal with Clustered Errors in Regression Models," Working Paper 1421, Economics Department, Queen's University.
    3. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2023. "Fast and reliable jackknife and bootstrap methods for cluster‐robust inference," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(5), pages 671-694, August.
    4. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2021. "Wild Bootstrap and Asymptotic Inference With Multiway Clustering," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(2), pages 505-519, March.
    5. Matthew D. Webb, 2023. "Reworking wild bootstrap‐based inference for clustered errors," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 56(3), pages 839-858, August.
    6. Hansen, Bruce E. & Lee, Seojeong, 2019. "Asymptotic theory for clustered samples," Journal of Econometrics, Elsevier, vol. 210(2), pages 268-290.
    7. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2023. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust," Stata Journal, StataCorp LP, vol. 23(4), pages 942-982, December.
    8. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2024. "Cluster-robust jackknife and bootstrap inference for binary response models," Papers 2406.00650, arXiv.org.
    9. James G. MacKinnon, 2019. "How cluster‐robust inference is changing applied econometrics," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 52(3), pages 851-881, August.
    10. Djogbenou, Antoine A. & MacKinnon, James G. & Nielsen, Morten Ørregaard, 2019. "Asymptotic theory and wild bootstrap inference with clustered errors," Journal of Econometrics, Elsevier, vol. 212(2), pages 393-412.
    11. Wang, Wenjie & Zhang, Yichong, 2024. "Wild bootstrap inference for instrumental variables regressions with weak and few clusters," Journal of Econometrics, Elsevier, vol. 241(1).
    12. Ivan A. Canay & Andres Santos & Azeem M. Shaikh, 2018. "The wild bootstrap with a "small" number of "large" clusters," CeMMAP working papers CWP27/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    13. MacKinnon, James G., 2023. "Using large samples in econometrics," Journal of Econometrics, Elsevier, vol. 235(2), pages 922-926.
    14. James G. MacKinnon & Matthew D. Webb, 2017. "Pitfalls When Estimating Treatment Effects Using Clustered Data," Working Paper 1387, Economics Department, Queen's University.
    15. MacKinnon, James G., 2023. "Fast cluster bootstrap methods for linear regression models," Econometrics and Statistics, Elsevier, vol. 26(C), pages 52-71.
    16. Antoine A. Djogbenou & James G. MacKinnon & Morten Ø. Nielsen, 2017. "Validity Of Wild Bootstrap Inference With Clustered Errors," Working Paper 1383, Economics Department, Queen's University.
    17. Tom Boot & Gianmaria Niccodemi & Tom Wansbeek, 2023. "Unbiased estimation of the OLS covariance matrix when the errors are clustered," Empirical Economics, Springer, vol. 64(6), pages 2511-2533, June.
    18. Hwang, Jungbin, 2021. "Simple and trustworthy cluster-robust GMM inference," Journal of Econometrics, Elsevier, vol. 222(2), pages 993-1023.
    19. MacKinnon, James G. & Webb, Matthew D., 2020. "Randomization inference for difference-in-differences with few treated clusters," Journal of Econometrics, Elsevier, vol. 218(2), pages 435-450.
    20. Andreas Hagemann, 2019. "Permutation inference with a finite number of heterogeneous clusters," Papers 1907.01049, arXiv.org, revised Feb 2023.

    More about this item

    Keywords

    CRVE; grouped data; clustered data; cluster-robust variance estimator; robust inference; wild bootstrap; wild cluster bootstrap;

    JEL classification:

    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C23 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Models with Panel Data; Spatio-temporal Models

