I want to briefly point out what this blog is NOT saying, either explicitly or implicitly. This is important because a lot of people react to my statements about GASSSPP and its production of junk science as if I’m talking about them, not the research. That is most definitely not the case.
So first, I am NOT saying that management researchers are incompetent. “OK then,” goes a frequent response, “how can you make these very strong statements about the errors in the research and then say you think we know what we’re doing? That makes no sense!”
The problem isn’t the researchers but the training most of them have received. The technical machinery—basic probability concepts and the construction of tests and models to evaluate data—is taught correctly in the large majority of doctoral programs and statistics texts. The interpretation of those results, however, especially the p value, is a different matter: it is something many texts simply get wrong, as a number of commentators on GASSSPP research have noted (Carver, 1978; Edwards, 2008; Hubbard & Armstrong, 2006; Huberty, 1993; McCloskey & Ziliak, 1996). The problem is so bad that Hubbard & Armstrong (2006) contend that “most marketing researchers don’t know what p means” and that this is “an educational failure of the first order.”

So the quality of our purely technical statistical training is undone by incorrect training in what statistical outcomes mean; in that very narrow but important respect, most researchers have been trained incorrectly. Because this condition afflicts the majority of contributors, reviewers, and editors in the field, everyone emulates everyone else, and the practice is now self-reinforcing. The irony is that if we could persuade a number of editors, and perhaps several professional organizations, to insist on the correct evaluation of statistical results, the statistics and methods researchers already know would supply all the tools we need to do better research; only the interpretation of p has to change. We don’t have to throw it all out and start again.
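To make concrete what p actually means (and what the texts get wrong), here is a minimal simulation sketch in Python—not from the commentators cited above, just an illustration. It constructs a world where the null hypothesis is true by drawing both samples from the same population; the p value is then the probability of data at least this extreme given the null, so “significant” results at p < .05 should appear about 5% of the time purely by chance. Note that p is not the probability that the null is true, nor the probability that the result will replicate. The z-approximation used here is an assumption for simplicity.

```python
import math
import random

random.seed(1)

def two_sided_p(x, y):
    """Two-sided p value for a difference in means, using a
    normal (z) approximation for simplicity."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    # P(|Z| >= |z|) under the standard normal
    return math.erfc(abs(z) / math.sqrt(2))

# The null hypothesis is TRUE by construction: both samples
# come from the same N(0, 1) population.
trials = 2000
false_alarms = 0
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]
    if two_sided_p(a, b) < 0.05:
        false_alarms += 1

# Roughly 5% of null-true experiments clear the .05 bar by chance
print(false_alarms / trials)
```

The point of the exercise: a p below .05 tells you only that such data would be uncommon if the null were true; it says nothing by itself about the truth of your hypothesis.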
I am also NOT saying that I consider researchers to be careless, lazy, sloppy, unethical, cynical, or otherwise personally responsible for the negative attributes of GASSSPP work. I see how hard people work on their studies and how involved they get, and questions about their motivations are not an issue. This is especially true of junior faculty who, having worked long and hard to get a PhD and a teaching job, now have to produce hits or call it quits—the choice is that stark at all the Tier 1 research b-schools and those aspiring to join them.
But I also have to add a caveat here. There have always been people who game the system (Anonymous, 1987), and the quality of our work has suffered as a result. But Bedeian’s recent (2010) article sent chills down my spine. He diplomatically worded his questions to ask whether respondents “knew of” anyone who had committed, within the past year, the various types of ethical and professional misconduct he surveyed, and I expected to see very few cases where people “knew of” faking or manipulating data. But the proportions of such behavior he reports are very worrisome. In my opinion, the ridiculous pressure to produce hits has inevitably led otherwise ethical people to do things like this, and this further reduces our research to the status of junk science. Any research enterprise ultimately depends on credibility for success, and when you give up your credibility, you’ve given up everything.
Thus, a few months ago I would have said that the overwhelming majority of people doing research would never resort to such behavior, even if I completely disagreed with their GASSSPP-based conclusions. I now have to move that claim onto the list of things I am NOT saying any more.
But at the end of the day, I am NOT saying that I see no hope for change. I’m very disappointed that such change hasn’t happened during my career, but I think there is a sufficient mass of scholars out there that change can be brought about, and from within the profession; history shows that many successful change movements are the product of a minority of a population. Among other things, we can create a community of practice that puts pressure on editors and reviewers to face up to the need for scientific standards in our research, and we can push back when editors and reviewers make mistakes or step too close to the edge of good scientific practice. One of the good things about the Web, which makes blogs like this possible, is that it also enables the kinds of recommendations made here and opens the door to other recommendations from people smarter than me. That’s all good.
References for this page
Anonymous. Beyond quality in the search for a lengthy vitae. Journal of Social Behavior and Personality. 1987; 2:3-12.
Bedeian, Arthur G.; Taylor, Shannon G., & Miller, Alan N. Management science on the credibility bubble: Cardinal sins and various misdemeanors. Academy of Management Learning & Education. 2010 Dec; 9(4):715-725.
Carver, Ronald P. The case against statistical significance testing. Harvard Educational Review. 1978; 48:378-399.
Edwards, Jeffrey R. To prosper, organizational psychology should …overcome methodological barriers to progress. Journal of Organizational Behavior. 2008; 29(4):469-491.
Hubbard, Raymond & Armstrong, J. Scott. Why we don’t really know what “statistical significance” means: a major educational failure. Journal of Marketing Education. 2006 Aug; 28(2):114-120.
Huberty, Carl J. Historical origins of statistical testing practices: the treatment of Fisher versus Neyman-Pearson views in textbooks. Journal of Experimental Education (special issue: Statistical Significance Testing in Contemporary Practice). 1993 Summer; 61(4):317-333.
McCloskey, Deirdre N. & Ziliak, Stephen T. The standard error of regressions. Journal of Economic Literature. 1996 Mar; 34:97-114.