Sunday, October 27, 2019

[This post is based on the first slide (below) of a discussion of Helene Rey et al., which I gave a few days ago at a fine NBER IFM meeting (program and clickable papers here). The paper is fascinating and impressive, and I'll blog on it separately next time. But the slide below is more of a side rant on general issues, and I skipped it in the discussion of Rey et al. to be sure to have time to address their particular issues.]
Quite a while ago I blogged here on the ex ante expected loss minimization that underlies traditional econometric/statistical forecast combination, vs. the ex post regret minimization that underlies "online learning" and related "machine learning" methods. Nothing has changed. That is, as regards ex post regret minimization, I'm still intrigued, but I'm still not persuaded.
And there's another thing that bothers me. As implemented, ML-style online learning and traditional econometric-style forecast combination with time-varying parameters (TVPs) are almost identical: just projection (regression) of realizations on forecasts, reading off the combining weights as the regression coefficients. OF COURSE we can generalize to allow for time-varying combining weights, non-linear combinations, regularization in high dimensions, etc., and hundreds of econometrics papers have addressed and explored those issues. Yet the ML types seem to think they invented everything, and too many economists are buying it. Rey et al., for example, don't so much as mention the econometric forecast combination literature, which by now occupies large chapters of leading textbooks, like Elliott and Timmermann at the bottom of the slide below.
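For concreteness, here's a minimal sketch of the shared mechanical core (entirely simulated data and made-up forecasts, my own illustration, not anything from Rey et al.): combining weights obtained by projecting realizations on forecasts, first with constant weights, then re-estimated on a rolling window to allow time variation.

```python
# Minimal sketch (simulated data): forecast combination as projection of
# realizations on forecasts, full-sample and rolling-window (time-varying weights).
import numpy as np

rng = np.random.default_rng(0)
T = 500
y = np.cumsum(0.1 * rng.standard_normal(T)) + rng.standard_normal(T)  # target series
f1 = y + 0.8 * rng.standard_normal(T)                                 # forecast 1
f2 = y + 1.2 * rng.standard_normal(T)                                 # forecast 2
F = np.column_stack([np.ones(T), f1, f2])                             # constant + forecasts

# Constant combining weights: OLS of realizations on forecasts.
w_full, *_ = np.linalg.lstsq(F, y, rcond=None)
print("full-sample weights (const, w1, w2):", w_full)

# Time-varying combining weights: re-estimate over a rolling window.
window = 100
w_roll = np.array([np.linalg.lstsq(F[t - window:t], y[t - window:t], rcond=None)[0]
                   for t in range(window, T)])
print("rolling weights, last window:", w_roll[-1])
```

Regularized, nonlinear, or online-updated versions all sit on top of that same projection.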
Sunday, March 3, 2019
Standard Errors for Things that Matter
Many times in applied / empirical seminars I have seen something like this:
The paper estimates a parameter vector b and dutifully reports asymptotic s.e.'s. But then the ultimate object of interest turns out not to be b, but rather some nonlinear but continuous function of the elements of b, say c = f(b). So the paper calculates and reports an estimate of c as c_hat = f(b_hat). Fine, insofar as c_hat is consistent if b_hat is consistent. But then the paper forgets to calculate an asymptotic s.e. for c_hat.
So c is the object of interest, and hundreds, maybe thousands, of person-hours are devoted to producing a point estimate of c, but then no one remembers (cares?) to assess its estimation uncertainty. Geez. Of course one could do delta method, simulation, etc.
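For concreteness, here's a minimal sketch of both fixes on simulated data, with a hypothetical object of interest c = f(b) = b1/b2 (not from any particular paper): the delta method, and simulation from the estimated asymptotic distribution of b_hat.

```python
# Minimal sketch: standard error for c_hat = f(b_hat), with hypothetical f(b) = b1/b2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
X = sm.add_constant(rng.standard_normal((n, 2)))
y = X @ np.array([1.0, 2.0, 0.5]) + rng.standard_normal(n)

fit = sm.OLS(y, X).fit()
b_hat, V = fit.params, fit.cov_params()      # point estimates and estimated asy. covariance

# Delta method: se(c_hat) = sqrt(grad' V grad), with grad = df/db evaluated at b_hat.
c_hat = b_hat[1] / b_hat[2]
grad = np.array([0.0, 1.0 / b_hat[2], -b_hat[1] / b_hat[2] ** 2])
se_delta = np.sqrt(grad @ V @ grad)

# Simulation alternative: draw b from N(b_hat, V) and look at the spread of f(b).
draws = rng.multivariate_normal(b_hat, V, size=10_000)
se_sim = (draws[:, 1] / draws[:, 2]).std()

print(f"c_hat = {c_hat:.3f}, delta-method s.e. = {se_delta:.3f}, simulation s.e. = {se_sim:.3f}")
```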
Thursday, March 8, 2018
H-Index for Journals
In an earlier rant, I suggested that journals move from tracking inane citation "impact factors" to citation "H indexes" or similar, just as is routinely done when evaluating individual authors. It turns out that RePEc already does it, here. Literally many thousands of journals are ranked; I show the top 25 below. Interestingly, four "field" journals actually make the top 10, effectively making them "super (uber?) field journals" (J. Finance, J. Financial Economics, J. Monetary Economics, and J. Econometrics). For example, J. Econometrics is basically indistinguishable from Review of Economic Studies.
[The rankings: top-25 table of journal h-indexes, from RePEc.]
Sunday, October 29, 2017
What's up With "Fintech"?
It's been a while, so it's time for a rant (in this case gentle, with no names named).
Discussion of financial technology ("fintech", as it's called) seems to be everywhere these days, from business school fintech course offerings to high-end academic fintech research conferences. I definitely get the business school thing -- tech is cool with students now, and finance is cool with students now, and there are lots of high-paying jobs.
But I'm not sure I get the academic research thing. We can talk about "X-tech" for almost unlimited X: shopping, travel, learning, medicine, construction, sailing, ..., and yes, finance. It's all interesting, but is there something extra interesting about X=finance that elevates fintech to a higher level? Or elevates it to a serious and separate new research area? If there is, I don't know what it is, notwithstanding the cute name and all the recent publicity.
(Some earlier rants appear to the right, under Browse by Topic / Rants.)
Monday, September 4, 2017
More on New p-Value Thresholds
I recently blogged on a new proposal heavily backed by elite statisticians to "redefine statistical significance", forthcoming in the elite journal Nature Human Behavior. (A link to the proposal appears at the end of this post.)
I have a bit more to say. It's not just that I find the proposal counterproductive; I have to admit that I also find it annoying, bordering on offensive.
I find it inconceivable that the authors' p<.005 recommendation will affect their own behavior, or that of others like them. They're all skilled statisticians, hardly so naive as to declare a "discovery" simply because a p-value does or doesn't cross a magic threshold, whether .05 or .005. Serious evaluations and interpretations of statistical analyses by serious statisticians are much more nuanced and rich -- witness the extended and often-heated discussion in any good applied statistics seminar.
If the p<.005 threshold won't change the behavior of skilled statisticians, then whose behavior MIGHT it change? That is, reading between the lines, to whom is the proposal REALLY addressed? Evidently to those much less skilled, the proverbial "practitioners", whom the authors hope to keep out of trouble by providing a rule of thumb that can at least be followed mechanically.
How patronizing.
------
Redefine Statistical Significance
Date: 2017
By:
Daniel Benjamin ; James Berger ; Magnus Johannesson ; Brian Nosek ; E. Wagenmakers ; Richard Berk ; Kenneth Bollen ; Bjorn Brembs ; Lawrence Brown ; Colin Camerer ; David Cesarini ; Christopher Chambers ; Merlise Clyde ; Thomas Cook ; Paul De Boeck ; Zoltan Dienes ; Anna Dreber ; Kenny Easwaran ; Charles Efferson ; Ernst Fehr ; Fiona Fidler ; Andy Field ; Malcom Forster ; Edward George ; Tarun Ramadorai ; Richard Gonzalez ; Steven Goodman ; Edwin Green ; Donald Green ; Anthony Greenwald ; Jarrod Hadfield ; Larry Hedges ; Leonhard Held ; Teck Hau Ho ; Herbert Hoijtink ; James Jones ; Daniel Hruschka ; Kosuke Imai ; Guido Imbens ; John Ioannidis ; Minjeong Jeon ; Michael Kirchler ; David Laibson ; John List ; Roderick Little ; Arthur Lupia ; Edouard Machery ; Scott Maxwell; Michael McCarthy ; Don Moore ; Stephen Morgan ; Marcus Munafo ; Shinichi Nakagawa ; Brendan Nyhan ; Timothy Parker ; Luis Pericchi; Marco Perugini ; Jeff Rouder ; Judith Rousseau ; Victoria Savalei ; Felix Schonbrodt ; Thomas Sellke ; Betsy Sinclair ; Dustin Tingley; Trisha Zandt ; Simine Vazire ; Duncan Watts; Christopher Winship ; Robert Wolpert ; Yu Xie; Cristobal Young ; Jonathan Zinman ; Valen Johnson
Abstract: We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.
http://d.repec.org/n?u=RePEc:feb:artefa:00612&r=ecm
Sunday, August 27, 2017
New p-Value Thresholds for Statistical Significance
This is presently among the hottest topics / discussions / developments in statistics. Seriously. Just look at the abstract and dozens of distinguished authors of the paper below, which is forthcoming in one of the world's leading science outlets, Nature Human Behavior.
Of course data mining, or overfitting, or whatever you want to call it, has always been a problem, warranting strong and healthy skepticism regarding alleged "new discoveries". But the whole point of examining p-values is to AVOID anchoring on arbitrary significance thresholds, whether the old magic .05 or the newly-proposed magic .005. Just report the p-value, and let people decide for themselves how they feel. Why obsess over asterisks, and whether/when to put them next to things?
Postscript:
Reading the paper, which I had not done before writing the paragraph above (there's largely no need, as the wonderfully concise abstract says it all), I see that it anticipates my objection at the end of a section entitled "potential objections":
Changing the significance threshold is a distraction from the real solution, which is to replace null hypothesis significance testing (and bright-line thresholds) with more focus on effect sizes and confidence intervals, treating the P-value as a continuous measure, and/or a Bayesian method.

Hear, hear! Marvelously well put.
The paper offers only a feeble refutation of that "potential" objection:
Many of us agree that there are better approaches to statistical analyses than null hypothesis significance testing, but as yet there is no consensus regarding the appropriate choice of replacement. ... Even after the significance threshold is changed, many of us will continue to advocate for alternatives to null hypothesis significance testing.

I'm all for advocating alternatives to significance testing. That's important and helpful. As for continuing to promulgate significance testing with magic significance thresholds, whether .05 or .005, well, you can decide for yourself.
Redefine Statistical Significance
Sunday, February 19, 2017
Econometrics: Angrist and Pischke are at it Again
Check out the new Angrist-Pischke (AP), "Undergraduate Econometrics Instruction: Through Our Classes, Darkly".
I guess I have no choice but to weigh in. The issues are important, and my earlier AP post, "Mostly Harmless Econometrics?", is my all-time most popular.
Basically AP want all econometrics texts to look a lot more like theirs. But their books and their new essay unfortunately miss (read: dismiss) half of econometrics.
Here's what AP get right:
(Goal G1) One of the major goals in econometrics is predicting the effects of exogenous "treatments" or "interventions" or "policies". Phrased in the language of estimation, the question is "If I intervene and give someone a certain treatment \({\partial x}, x \in X\), what is my minimum-MSE estimate of her \(\ \partial y\)?" So we are estimating the partial derivative \({\partial y / \partial x}\).
AP argue the virtues and trumpet the successes of a "design-based" approach to G1. In my view they make many good points as regards G1: discontinuity designs, dif-in-dif designs, and other clever modern approaches for approximating random experiments indeed take us far beyond "Stones'-age" approaches to G1. (AP sure turn a great phrase...). And the econometric simplicity of the design-based approach is intoxicating: it's mostly just linear regression of \(y\) on \(x\) and a few cleverly-chosen control variables -- you don't need a full model -- with White-washed standard errors. Nice work if you can get it. And yes, moving forward, any good text should feature a solid chapter on those methods.
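For concreteness, a minimal sketch of that recipe on simulated data (hypothetical variable names, my own illustration, not AP's code): OLS of y on a treatment and a couple of controls, with White-style robust standard errors.

```python
# Minimal sketch (simulated data): the design-based regression recipe --
# y on a treatment plus controls, with heteroskedasticity-robust (White) s.e.'s.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1_000
controls = rng.standard_normal((n, 2))
x = 0.5 * controls[:, 0] + rng.standard_normal(n)              # "treatment"
u = (1 + 0.5 * np.abs(x)) * rng.standard_normal(n)             # heteroskedastic errors
y = 1.0 + 2.0 * x + controls @ np.array([0.7, -0.3]) + u

X = sm.add_constant(np.column_stack([x, controls]))
fit = sm.OLS(y, X).fit(cov_type="HC1")                         # White-washed covariance
print("coefficients:", fit.params)
print("robust s.e.'s:", fit.bse)
```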
Here's what AP miss/dismiss:
(Goal G2) The other major goal in econometrics is predicting \(y\). In the language of estimation, the question is "If a new person \(i\) arrives with covariates \(X_i\), what is my minimum-MSE estimate of her \(y_i\)?" So we are estimating a conditional mean \(E(y | X) \), which in general is very different from estimating a partial derivative \({\partial y / \partial x}\).

The problem with the AP paradigm is that it doesn't work for goal G2. Modeling nonlinear functional form is important, as the conditional mean function \(E(y | X) \) may be highly nonlinear in \(X\); systematic model selection is important, as it's not clear a priori what subset of \(X\) (i.e., what model) might be most useful for approximating \(E(y | X) \); detecting and modeling heteroskedasticity is important (in both cross sections and time series), as it's the key to accurate interval and density prediction; detecting and modeling serial correlation is crucially important in time-series contexts, as "the past" is the key conditioning information for predicting "the future"; etc., etc.
(Notice how often "model" and "modeling" appear in the above paragraph. That's precisely what AP dismiss, even in their abstract, which very precisely, and incorrectly, declares that "Applied econometrics ...[now prioritizes]... the estimation of specific causal effects and empirical policy analysis over general models of outcome determination".)
The AP approach to goal G2 is to ignore it, in a thinly-veiled attempt to equate econometrics exclusively with G1, which nicely feathers the AP nest. Sorry guys, but no one's buying it. That's why the textbooks continue to feature G2 tools and techniques so prominently, as well they should.
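To see why G2 needs all of that, here's a minimal simulated sketch (my own illustration, not from AP or any text): when \(E(y | X) \) is nonlinear, a short linear regression can be a sensible summary for G1-style questions yet a poor predictor, so functional-form and model-selection work pays off immediately.

```python
# Minimal sketch (simulated data): linear vs. flexible approximations to E(y|x)
# for out-of-sample prediction when the true conditional mean is nonlinear.
import numpy as np

rng = np.random.default_rng(3)
n = 2_000
x = rng.uniform(-2, 2, n)
y = np.sin(2 * x) + 0.5 * x + 0.3 * rng.standard_normal(n)   # nonlinear E(y|x)

x_tr, x_te = x[:1500], x[1500:]                              # train / evaluate split
y_tr, y_te = y[:1500], y[1500:]

def poly_predict(x_in, y_in, x_out, degree):
    """OLS polynomial regression of a given degree, predicted out of sample."""
    return np.polyval(np.polyfit(x_in, y_in, degree), x_out)

# In practice the degree (the "model") would itself be selected, e.g. by AIC or
# cross-validation; here we just compare a linear fit with a flexible one.
for degree in (1, 5):
    mse = np.mean((y_te - poly_predict(x_tr, y_tr, x_te, degree)) ** 2)
    print(f"degree {degree}: out-of-sample MSE = {mse:.3f}")
```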
Thursday, November 3, 2016
StatPrize
Check out this new prize, http://statprize.org/. (Thanks, Dave Giles, for informing me via your tweet.) It should be USD 1 million, ahead of the Nobel, as statistics is a key part (arguably the key part) of the foundation on which every science builds.
And obviously check out David Cox, the first winner. Every time I've given an Oxford econometrics seminar, he has shown up. It's humbling that he evidently thinks he might have something to learn from me. What an amazing scientist, and what an amazing gentleman.
And also obviously, the new StatPrize can't help but remind me of Ted Anderson's recent passing, not to mention the earlier but recent passings, for example, of Herman Wold, Edmond Malinvaud, and Arnold Zellner. Wow -- sometimes the Stockholm gears just grind too slowly. Moving forward, StatPrize will presumably make such econometric recognition failures less likely.
Tuesday, September 20, 2016
On "Shorter Papers"
Journals should not corral shorter papers into sections like "Shorter Papers". Doing so sends a subtle (actually unsubtle) message that shorter papers are basically second-class citizens, somehow less good, or less important, or less something -- not just less long -- than longer papers. If a paper is above the bar, then it's above the bar, and regardless of its length it should then be published simply as a paper, not a "shorter paper", or a "note", or anything else. Many shorter papers are much more important than the vast majority of longer papers.
Tuesday, September 6, 2016
Inane Journal "Impact Factors"
Why are journals so obsessed with "impact factors"? (The five-year impact factor is average citations/article in a five-year window.) They're often calculated to three decimal places, and publishers trumpet victory when they go from (say) 1.225 to 1.311! It's hard to think of a dumber statistic, or dumber over-interpretation. Are the numbers after the decimal point anything more than noise, and for that matter, are the numbers before the decimal much more than noise?
Why don't journals instead use the same citation indexes used for individuals? The leading index seems to be the h-index, which is the largest integer h such that an individual has h papers, each cited at least h times. I don't know who cooked up the h-index, and surely it has issues too, but the gurus love it, and in my experience it tells the truth.
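For concreteness, a minimal sketch of the definition, with hypothetical citation counts:

```python
# Minimal sketch: the h-index is the largest h such that h papers each have >= h cites.
def h_index(citations):
    """h-index from a list of per-paper citation counts."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)

print(h_index([50, 18, 12, 7, 6, 5, 2, 1, 0]))   # hypothetical journal: h-index 5
```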
Even better, why not stop obsessing over clearly-insufficient statistics of any kind? I propose instead looking at what I'll call a "citation signature plot" (CSP), simply plotting the number of cites for the most-cited paper, the number of cites for the second-most-cited paper, and so on. (Use whatever window(s) you want.) The CSP reveals everything, instantly and visually. How high is the CSP for the top papers? How quickly, and with what pattern, does it approach zero? etc., etc. It's all there.
Google-Scholar CSP's are easy to make for individuals, and they're tremendously informative. They'd be only slightly harder to make for journals. I'd love to see some.
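And a minimal sketch of the CSP itself, again with hypothetical citation counts standing in for a scraped Google Scholar profile:

```python
# Minimal sketch: a citation signature plot (CSP) -- cites per paper, most-cited first.
import matplotlib.pyplot as plt

citations = sorted([50, 18, 12, 7, 6, 5, 2, 1, 0], reverse=True)   # hypothetical counts
plt.plot(range(1, len(citations) + 1), citations, marker="o")
plt.xlabel("Paper rank (most-cited first)")
plt.ylabel("Citations")
plt.title("Citation signature plot (hypothetical data)")
plt.show()
```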
Wednesday, October 28, 2015
The HAC Emperor has no Clothes
Well, at least in time-series settings. (I'll save cross sections for a later post.)
Consider a time-series regression with possibly heteroskedastic and/or autocorrelated disturbances,

\( y_t = x_t' \beta + \varepsilon_t \).

A popular approach is to punt on the potentially non-iid disturbance, instead simply running OLS with kernel-based heteroskedasticity and autocorrelation consistent (HAC) standard errors.

Punting via kernel-HAC estimation is a bad idea in time series, for several reasons:
(1) [Kernel-HAC is not likely to produce good \(\beta\) estimates.] It stays with OLS and hence gives up on efficient estimation of \(\beta\). In huge samples the efficiency loss from using OLS rather than GLS/ML is likely negligible, but time-series samples are often smallish. For example, samples like 1960Q1-2014Q4 are typical in macroeconomics -- just a couple hundred observations of highly-serially-correlated data.
(2) [Kernel-HAC is not likely to produce good \(\beta\) inference.] Its standard errors are not tailored to a specific parametric approximation to \(\varepsilon\) dynamics. Proponents will quickly counter that that's a benefit, not a cost, and in some settings the proponents may be correct. But not in time series settings. In time series, \(\varepsilon\) dynamics are almost always accurately and parsimoniously approximated parametrically (ARMA for conditional mean dynamics in \(\varepsilon\), and GARCH for conditional variance dynamics in \(\varepsilon\)). Hence kernel-HAC standard errors may be unnecessarily unreliable in small samples, even if they're accurate asymptotically. And again, time-series sample sizes are often smallish.
(3) [Most crucially, kernel-HAC fails to capture invaluable predictive information.] Time series econometrics is intimately concerned with prediction, and explicit parametric modeling of dynamic heteroskedasticity and autocorrelation in \(\varepsilon\) can be used for improved prediction of \(y\). Autocorrelation can be exploited for improved point prediction, and dynamic conditional heteroskedasticity can be exploited for improved interval and density prediction. Punt on them and you're potentially leaving a huge amount of money on the table.
The clearly preferable approach is traditional parametric disturbance heteroskedasticity / autocorrelation modeling, with GLS/ML estimation. Simply allow for ARMA(p,q)-GARCH(P,Q) disturbances (say), with p, q, P, and Q selected by AIC (say). (In many applications something like AR(3)-GARCH(1,1) or ARMA(1,1)-GARCH(1,1) would be more than adequate.) Note that the traditional approach is actually fully non-parametric when appropriately viewed as a sieve, and moreover it features automatic bandwidth selection.
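Here's a minimal sketch of the two routes on simulated data (my own illustration, using statsmodels; the GARCH(P,Q) layer is omitted, though it could be added with, say, the arch package):

```python
# Minimal sketch (simulated data): OLS with kernel-HAC standard errors vs.
# regression with parametric ARMA(p,q) disturbances by ML, (p,q) selected by AIC.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(4)
T = 220                                    # smallish, macro-style sample
x = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):                      # AR(1) disturbances
    eps[t] = 0.8 * eps[t - 1] + rng.standard_normal()
y = 1.0 + 0.5 * x + eps
X = sm.add_constant(x)

# Route 1: punt -- OLS with kernel-HAC standard errors.
ols_hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print("OLS + kernel-HAC:   b_hat =", ols_hac.params, " s.e. =", ols_hac.bse)

# Route 2: model the disturbances -- ARMA(p,q) errors by ML, (p,q) chosen by AIC.
fits = [(SARIMAX(y, exog=X, order=(p, 0, q)).fit(disp=False), (p, q))
        for p in range(3) for q in range(3)]
best, order = min(fits, key=lambda f: f[0].aic)
print("selected ARMA order:", order)
print("ARMA-error regression:  b_hat =", best.params[:2], " s.e. =", best.bse[:2])
```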
Kernel-HAC people call the traditional strategy "pre-whitening," to be done prior to kernel-HAC estimation. But the real point is that it's all -- or at least mostly all -- in the pre-whitening.
In closing, I might add that the view expressed here is strongly supported by top-flight research. On my point (2) and my general recommendation, for example, see the insightful work of den Haan and Levin (2000). It fell on curiously deaf ears and remains unpublished many years later. (It's on Wouter den Haan's web site in a section called "Sleeping and Hard to Get"!) In the interim much of the world jumped on the kernel-HAC bandwagon. It's time to jump off.
Sunday, January 11, 2015
Mostly Harmless Econometrics?
I've had Angrist-Pischke's Mostly Harmless Econometrics: An Empiricist's Companion (MHE) for a while, but I just got around to reading it. (By the way, a lower-level follow-up was just published.)
There's a lot to like about MHE. It's an insightful and fun treatment of micro-econometric regression-based causal effect estimation -- basically how to (try to) tease causal information from least-squares regressions fit to observational micro data. It's filled with wisdom, exploring many subtleties and nuances. In many ways it's written not for students at age 23, but rather for seasoned researchers at age 53. And it tells its story in a marvelously engaging conversational style.
But there's also a lot not to like about MHE. The problem isn't what it includes, but rather what it excludes. Starting with its title and continuing throughout, MHE promotes its corner of applied econometrics as all of applied econometrics, or at least all of the "mostly harmless" part (whatever that means). Hence it effectively condemns much of the rest as "harmful," and sentences it to death by neglect. It gives the silent treatment, for example, to anything structural -- whether micro-econometric or macro-econometric -- and anything involving time series. And in the rare instances when silence is briefly broken, we're treated to gems like "serial correlation [until recently was] Somebody Else's Problem, specifically the unfortunate souls who make their living out of time series data (macroeconomists, for example)" (pp. 315-316).
[Here's a rough parallel. Consider Hansen and Sargent's Recursive Models of Dynamic Linear Economies. It treats structural analysis and econometric estimation of dynamic macroeconomic models, and it naturally contains large doses of time series, state space, optimal filtering, etc. It's also appropriately titled and appropriately pitched. Now imagine that Hansen and Sargent had instead titled it Mostly Harmless Econometrics, declared its contents to be the central part of (the mostly harmless part of) applied econometrics, and pitched it as a general "empiricist's companion". Voilà!]
All told, Mostly Harmless Econometrics: An Empiricist's Companion is neither "mostly harmless" nor an "empiricist's companion." Rather, it's a companion for a highly-specialized group of applied non-structural micro-econometricians hoping to estimate causal effects using non-experimental data and largely-static, linear, regression-based methods. It's a novel treatment of that sub-sub-sub-area of applied econometrics, but pretending to be anything more is most definitely harmful, particularly to students, who have no way to recognize the charade as a charade.
Tuesday, October 21, 2014
Rant: Academic "Letterhead" Requirements
(All rants, including this one, are here.)
Countless times, from me to Chair/Dean xxx at Some Other University:
I am happy to help with your evaluation of Professor zzz. This email will serve as my letter. [email here]...

Countless times, from Chair/Dean xxx to me:

Thanks very much for your thoughtful evaluation. Can you please put it on your university letterhead and re-send?

Fantasy response from me to Chair/Dean xxx:

Sure, no problem at all. My time is completely worthless, so I'm happy to oblige, despite the fact that email conveys precisely the same information and is every bit as legally binding (whatever that even means in this context) as a "signed" "letter" on "letterhead." So now I’ll copy my email, try to find some dusty old Word doc letterhead on my hard drive, paste the email into the Word doc, try to beat it into submission depending on how poor the formatting / font / color / blocking looks when first pasted, print from Word to pdf, attach the pdf to a new email, and re-send it to you. How 1990’s.

Actually last week I did send something approximating the fantasy email to a dean at a leading institution. I suspect that he didn't find it amusing. (I never heard back.) But as I also said at the end of that email,
"Please don’t be annoyed. I...know that these sorts of 'requirements' have nothing to do with you per se. Instead I’m just trying to push us both forward in our joint battle with red tape."
Monday, August 11, 2014
On Rude and Risky "Calls for Papers"
You have likely seen calls for papers that include this script, or something similar:
You will not hear from the organizers unless they decide to use your paper.

It started with one leading group's calls, which go even farther:

You will not hear from the organizers unless they decide to use your paper. They are not journal editors or program committee chairmen for a society.

Now it's spreading.

Bad form, folks.
(1) It's rude. Submissions are not spam to be acted upon by the organizers if interesting, and deleted otherwise. On the contrary, they're solicited, so the least the organizer can do is acknowledge receipt and outcome with costless "thanks for your submission" and "sorry but we couldn't use your paper" emails (which, by the way, are automatically sent in leading software like Conference Maker). As for gratuitous additions like "They are not journal editors or program committee chairmen...," well, I'll hold my tongue.
(2) It's risky. Consider an author whose fine submission somehow fails to reach the organizer, which happens surprisingly often. The lost opportunity hurts everyone -- the author whose career would have been enhanced, the organizer whose reputation would have been enhanced, and the conference participants whose knowledge would have been enhanced, not to mention the general advancement of science -- and no one is the wiser. That doesn't happen when the announced procedure includes acknowledgement of submissions, in which case the above author would simply email the organizer saying, "Hey, where's my acknowledgement? Didn't you receive my submission?"
(Note the interplay between (1) and (2). Social norms like "courtesy" arise in part to promote efficiency.)
Sunday, March 2, 2014
Double-"Blind" Refereeing is Misguided in Principle and a Charade in Practice
I view the title of this post as almost self-evident. But lots of do-gooders out there disagree, touting double-blind refereeing as somehow promoting "fairness."
(1) Misguided in principle: One never makes more-informed decisions (or predictions, or inferences, or whatever) by shrinking the information on which they're based. It's that simple.
(2) A charade in practice: It now rarely takes more than a few seconds to identify the author of a "blinded" manuscript. And on the rare occasions when Google can't nail it instantly, the author usually helps in various ways, such as by over-citing himself (whether innocently or strategically).
Whenever I'm requested to review a blinded manuscript, I decline immediately. I simply refuse to participate in the charade. You should too.
Saturday, June 1, 2013
Blogging Environments
Speaking of ideal environments, how could I have forgotten blogging environments in my last post?
Not that I really know much. And not that you care. Anyway, a typical recent Diebold family exchange says it all:
Daughter 1: Dad, why are you doing a stupid blog?
Me: I'm not sure. And it's not stupid. And thanks for your support.
Daughter 1: And why are you doing your stupid blog using stupid Google Blogger?
Me: I'm not sure. And thanks for your support.
Daughter 1: No cool bloggers use Blogger. Only old people who don't know any better. You should use WordPress.
Me: What are you talking about? What's WordPress? All my blogging friends use Blogger.
Daughter 1: They're all old people who don't know any better.
Daughter 2: Don't listen to her, Dad -- Blogger is perfectly fine and totally cool.
Wife: What's a blog?
In my experience regarding e-matters, the youngest is generally right. So Daughter 2 (age 14) wins -- for now I'm staying with Blogger. I'm delighted to learn that Blogger is cool, and that by implication I'm cool.