Influence Analysis with Panel Data using Stata
Annalivia Polselli
Annalivia Polselli: Institute for Analytics and Data Science (IADS), and Centre for Micro-Social Change (MiSoC), University of Essex
UK Stata Conference 2023 from Stata Users Group
Abstract:
Units with extreme values in the dependent variable and/or the independent variables (i.e., vertical outliers and good and bad leverage points) can severely bias least squares (LS) estimates, that is, regression coefficients and/or standard errors. Diagnostic plots (such as leverage-versus-squared-residual plots) and measures of overall influence (e.g., Cook's (1979) distance) are usually used to detect such anomalies, but their use raises two problems. First, the available commands for diagnostic plots are built for cross-sectional data, so some data manipulation is necessary for panel data. Second, Cook-like distances may fail to flag multiple anomalous cases in the data because they do not account for the pair-wise influence of observations (Atkinson, 1993; Chatterjee and Hadi, 1988; Rousseeuw, 1991; Rousseeuw and Van Zomeren, 1990; Lawrance, 1995). I overcome these limitations as follows. First, I formalise statistical measures that quantify the degree of leverage and outlyingness of units in a panel data framework, and use them to produce diagnostic plots suitable for panel data. Second, I build on Lawrance's (1995) pair-wise approach by proposing measures of joint and conditional influence suitable for panel data models with fixed effects. I develop a method to: (i) visually detect anomalous units in a panel data set and identify their type; (ii) investigate the effect of these units on LS estimates and on other units' influence. I propose two user-written Stata commands to implement this method. xtlvr2plot produces a leverage-versus-residual plot suitable for panel data, together with a summary table listing the detected anomalous units and their type. xtinfluence calculates the joint and conditional influence and effects of pairs of units, and generates network-style plots (the command offers a choice between a scatter plot and a heat plot).
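The idea of a unit-level leverage-versus-residual diagnostic can be illustrated with a minimal sketch. It is not the implementation of xtlvr2plot, whose exact measures are not given in the abstract. The sketch assumes a one-regressor fixed-effects model estimated on within-demeaned data, and it assumes, purely for illustration, that a unit's leverage and outlyingness are the time averages of its observations' hat-matrix diagonals and normalised squared residuals. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def within_transform(ids, values):
    """Subtract each unit's time mean from its observations (fixed-effects demeaning)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for i, v in zip(ids, values):
        sums[i] += v
        counts[i] += 1
    return [v - sums[i] / counts[i] for i, v in zip(ids, values)]


def unit_diagnostics(ids, x, y):
    """Return {unit: (average leverage, average normalised squared residual)}.

    Assumed, illustrative definitions for a one-regressor fixed-effects model;
    they are not taken from xtlvr2plot.
    """
    xd = within_transform(ids, x)
    yd = within_transform(ids, y)
    sxx = sum(v * v for v in xd)
    beta = sum(a * b for a, b in zip(xd, yd)) / sxx  # within (LS) slope
    resid = [b - beta * a for a, b in zip(xd, yd)]
    ssr = sum(e * e for e in resid)
    lev, res2, counts = defaultdict(float), defaultdict(float), defaultdict(int)
    for i, a, e in zip(ids, xd, resid):
        lev[i] += a * a / sxx   # hat-matrix diagonal with a single regressor
        res2[i] += e * e / ssr  # observation's share of the residual sum of squares
        counts[i] += 1
    return {i: (lev[i] / counts[i], res2[i] / counts[i]) for i in counts}


# Hypothetical toy panel: 3 units observed for 3 periods each.
ids = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 1.0, 5.0, 9.0]
y = [1.0, 2.0, 3.5, 2.0, 3.0, 4.0, 1.0, 5.0, 8.0]
diag = unit_diagnostics(ids, x, y)
for unit, (leverage, residual2) in sorted(diag.items()):
    print(unit, round(leverage, 3), round(residual2, 3))
```

Plotting the first element of each pair against the second gives a unit-level analogue of a leverage-versus-squared-residual plot. In this toy panel, unit 3 has by far the largest within-unit variation in x and therefore the highest average leverage.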
Date: 2023-09-10
New Economics Papers: this item is included in nep-ger
Downloads:
http://repec.org/lsug2023/Stata_UK23_Polselli.pdf
Persistent link: https://EconPapers.repec.org/RePEc:boc:lsug23:03