RECKONING WITH THE PAST
和过去做个了结

Sometimes I wonder if I should be fixing myself more to drink.

有时候我辗转反侧,不知是否该借酒消愁。

No, this is not going to be an optimistic post.

没错,这不是一篇鸡汤文。

If you want bubbles and sunshine, please see my friend Simine Vazire’s post on why she is feeling optimistic about things. If you want nuance and balance, see my co-moderator Alison Ledgerwood’s new blog*. Instead, if you will allow me, I want to wallow.

如果你想要泡沫和阳光,我朋友Simine Vazire的文章会告诉你为什么她如此积极乐观。如果你想要情绪间的微妙平衡,看我同僚Alison Ledgerwood的新博客。而我,只想好好吐槽一番。

I have so many feelings about the situation we’re in, and sometimes the weight of it all breaks my heart. I know I’m being intemperate, not thinking clearly, but I feel that it is only when we feel badly, when we acknowledge and, yes, grieve for yesterday, that we can allow for a better tomorrow. I want a better tomorrow, I want social psychology to change. But, the only way we can really change is if we reckon with our past, coming clean that we erred; and erred badly.

我对我们现在的处境有太多的感触,这有时沉重得让我心力交瘁。我知道我失去了自控,头脑不清楚。但我觉得只有当我们直面昨日,为昨日沉痛伤感,才能拥有美好的明天。我渴望美好的明天,我希望社会心理学能改变。但是,唯一能使我们真正改变的是和过去做个了结,坦白过去所犯的严重错误。

To be clear: I am in love with social psychology. I am writing here because I am still in love with social psychology. Yet, I am dismayed that so many of us are dismissing or justifying all those small (and not so small) signs that things are just not right, that things are not what they seem. “Carry on, folks, nothing to see here,” is what some of us seem to be saying.

首先声明:我热爱社会心理学。我在这儿码字就是因为我依然爱它。然而,让我感到泄气的是,尽管很多微小(其实并非如此微小)的迹象表明情况不妙且另有隐情,我们之中许多人却对所有这些迹象视而不见或想出种种理由开脱。“继续,伙计,这儿没啥好看的,”我们中有些人似乎在这么说着。

Our problems are not small and they will not be remedied by small fixes. Our problems are systemic and they are at the core of how we conduct our science. My eyes were first opened to this possibility when I read Simmons, Nelson, and Simonsohn’s paper during what seems like a different, more innocent time.

我们的问题不小,想轻易补救谈何容易。我们的问题是系统性的,而且密切关系到我们如何进行科研。我起初发现有可能出了问题是在我读了 Simmons, Nelson, 和Simonsohn合著的论文之后,那时情况看起来和如今还有所不同,还是一个更纯真的年代。【编注:该论文发表于2011年】

This paper details how small, seemingly innocuous, and previously encouraged data-analysis decisions could allow for anything to be presented as statistically significant. That is, flexibility in data collection and analysis could make even impossible effects seem possible and significant.

这篇论文详细阐述了那些微小、看似无害、过去还受到鼓励的数据分析决策,是如何让任何东西都能呈现出统计显著性的。也就是说,数据收集和分析中的灵活性,甚至可以让不可能存在的效应显得可能且显著。

What is worse, Andrew Gelman made clear that a researcher need not actively p-hack their data to reach erroneous conclusions. It turns out such biases in data analyses might not be conscious, that researchers might not even be aware of how their data-contingent decisions are warping the conclusions they reach. This is flat-out scary: Even honest researchers with the highest of integrity might be reaching erroneous conclusions at an alarming rate.

更糟的是,研究者无需主动挖掘数据就能得到错误的结论,这点被Andrew Gelman解释得很清楚。事实是,研究者在数据分析中的偏见可能不是有意识的,他们甚至没有意识到自己依据数据做出的决定正在歪曲他们最终得到的结论。这可怕至极:即使最诚实,最正直的研究者也有可能以高得吓人的几率得出错误的结论。

Third is the problem of publication bias. As a field, we tend only to publish significant results. This could be because as authors we choose to focus on these; or, more likely, because reviewers, editors, and journals force us to focus on these and to ignore nulls.

第三是发表偏见的问题。作为一个学科,我们往往只发表具有统计显著性的结果。这可能是由于作为作者我们选择把注意力放在这些结果上;或者,更可能的是,因为审稿人、编辑和期刊迫使我们把注意力放在这些结果上,而忽略那些零结果。

This creates the infamous file drawer that altogether warps the research landscape. Because it is unclear how large the file drawer is for any research literature, it is hard to determine how large or small any effect is, if it exists at all.

这就造成了臭名昭著的“文件抽屉”问题(即发表偏见问题),彻底扭曲了整个研究图景。由于我们无从知道任何一个研究领域的“文件抽屉”有多大,也就很难确定任何一个效应究竟有多大,甚至很难确定它是否真的存在。

I think these three ideas (that data flexibility can lead to a raft of false positives, that this process might occur without researchers themselves being aware, and the unknown size of the file drawer) explain why so many of our cherished results can’t replicate. These three ideas suggest we might have been fooling ourselves into thinking we were chasing things that are real and robust, when we were pursuing neither.

我认为以上三点——数据的灵活性可能导致大量错误结论,且这一过程可能在研究人员不经意间发生,以及“文件抽屉”尺寸大小的不明——很好地解释了为什么众多我们所珍视的研究成果无法被重复。这三点表明我们可能一直以来自欺欺人以为自己在探求真实且坚实的结果,而事实上我们所追求的既不真实也不坚实。

As someone who has been doing research for nearly twenty years, I now can’t help but wonder if the topics I chose to study are in fact real and robust. Have I been chasing puffs of smoke for all these years?

作为一个做了近20年研究的人,我忍不住怀疑过往研究的课题是否有确凿的依据立论。这些年来我致力探求的是否只是海市蜃楼?

I have spent nearly a decade working on the concept of ego depletion, including work that is critical of the model used to explain the phenomenon. I have been rewarded for this work, and I am convinced that the main reason I get any invitations to speak at colloquia and brown-bags these days is because of this work.

我花了将近十年的时间研究“自我耗尽”这一概念,其中也包括对解释该现象的模型提出批评的工作。我因这项研究而获益,而且我确信,如今我之所以还能受邀在学术报告会和午餐研讨会上发言,主要就是因为这项研究。

The problem is that ego depletion might not even be a thing. By now, many people are aware that a massive replication attempt of the basic ego depletion effect involving over 2,000 participants found nothing, nada, zip. Only three of the 24 participating labs found a significant effect, but even then, one of these found a significant result in the wrong direction!

问题在于,“自我耗尽”这个概念可能根本就不存在。时至今日,许多人都知道一项由两千余人参加的试图重复“自我耗尽”效应的大规模研究最终什么都没发现,一片空白。二十四个参与研究的实验室中只有三个发现显著的效应,但即使这样,其中一个发现的显著效应竟然是反向的!

There is a lot more to this registered replication than the main headline, and there is still so much evidence indicating fatigue is a real phenomenon. I promise to get to these thoughts in a later post, once the paper is finally published. But for now, we are left with a sobering question: If a large-sample pre-registered study found absolutely nothing, how has the ego depletion effect been replicated and extended hundreds and hundreds of times? More sobering still: What other phenomena, which we now consider obviously real and true, will be revealed to be just as fragile?

这项注册重复研究的内容远不止那个醒目的主要结论,而且仍有大量证据表明“疲劳”是真实存在的现象。我保证等论文最终发表之后,会在以后的博文中谈谈这些想法。但眼下,我们面对的是一个令人警醒的问题:如果一项大样本的预注册研究什么都没发现,那么“自我耗尽”效应又是如何被重复和拓展了成百上千次的呢?更令人警醒的是:还有哪些我们如今认为显然真实无疑的现象,会被揭示出同样脆弱?

As I said, I’m in a dark place. I feel like the ground is moving from underneath me and I no longer know what is real and what is not.

如我所说,我身处黑暗之地。我感觉似乎脚下的土地都在移动,而我已经辨不清真实和虚假了。

I edited an entire book on stereotype threat, I have signed my name to an amicus brief to the Supreme Court of the United States citing stereotype threat, yet now I am not as certain as I once was about the robustness of the effect. I feel like a traitor for having just written that; like, I’ve disrespected my parents, a no-no according to Commandment number 5.

我主编过一整本关于“刻板印象威胁”的书,我还在一份提交给美国最高法院、援引了“刻板印象威胁”的法庭之友意见书上签了名,但如今我对该效应的稳健程度却不如过去那样确信。写下这些文字,让我觉得自己像个叛徒;这感觉如同我对父母大不敬,触犯了十诫第五条。

But, a meta-analysis published just last year suggests that stereotype threat, at least for some populations and under some conditions, might not be so robust after all. P-curving some of the original papers is also not comforting.

但是,就在去年发表的一项元分析(对以往研究结果进行系统的定量分析)表明,“刻板印象威胁”至少对某些人群、在某些条件下,可能并没有那么稳健。对一些原始论文做p曲线(p-curve)分析,结果同样让人难以安心。

Now, stereotype threat is a politically charged topic and there is a lot of evidence supporting it. That said, I think a lot more painstaking work needs to be done on basic replications, and until then, I would be lying if I said that doubts have not crept in. Rumor has it that an RRR (Registered Replication Report) of stereotype threat is in the works.

要说明的是,“刻板印象威胁”是一个带有浓厚政治色彩的话题,也有大量证据支持它。话虽如此,我认为在基础的重复性研究上还有很多艰苦细致的工作要做;在那之前,我若说自己没有产生过怀疑,那就是在撒谎。有传言称,一项针对“刻板印象威胁”的注册重复研究报告(RRR,Registered Replication Report)正在筹备之中。

To be fair, this is not social psychology’s problem alone. Many other allied areas in psychology might be similarly fraught and I look forward to these other areas scrutinizing their own work—areas like developmental, clinical, industrial/organizational, consumer behavior, organizational behavior, and so on, need an RPP project or Many Labs of their own. Other areas of science face similar problems too.

公正地说,这不只是社会心理学的问题。心理学中许多相近的领域可能同样问题重重,我期待这些领域也来仔细审视自己的研究:发展心理学、临床心理学、工业与组织心理学、消费者行为、组织行为等等领域,都需要有自己的“可重复性项目”【译注:RPP,Reproducibility Project: Psychology,是一个对已发表的心理学研究进行大规模重复验证的项目】或者“多实验室”项目【译注:Many Labs Project,是一个旨在对心理科学多种效应进行可重复性验证的项目】。其他科学领域同样面临类似问题。

During my dark moments, I feel like social psychology needs a redo, a fresh start. Where to begin, though? What am I mostly certain about and where can my skepticism end? I feel like there are legitimate things we have learned, but how do we separate wheat from chaff? Do we need to go back and meticulously replicate everything in the past? Or do we use those bias tests Joe Hilgard is so sick and tired of to point us in the right direction? What should I stop teaching to my undergraduates? I don’t have answers to any of these questions.

在我消沉的这段时间,我觉着社会心理学需要推倒重建,从头来过。那么,从哪儿开始?对于哪些事我能确信不疑?在哪里我能平息我的疑惑?我认为我们学到了一些合理的东西,但如何区分成果和糟粕呢?我们是否需要回去并且一丝不苟地重复过去所有的事情呢?或者我们是否该使用Joe Hilgard厌恶至极的偏见测试来指明方向?哪些东西是我不该教授给本科生的?对所有这些问题我都没有答案。

This blogpost is not going to end on a sunny note. Our problems are real and they run deep. Okay, I do have some hope: I legitimately think our problems are solvable. I think the calls for more statistical power, greater transparency surrounding null results, and more confirmatory studies can save us. What is not helping is the lack of acknowledgement about the severity of our problems. What is not helping is a reluctance to dig into our past and ask what needs revisiting.

本篇博文不会以阳光的调子收尾。我们的问题是真切的,而且根深蒂固。好吧,我确实还抱有一点希望:我真心认为我们的问题是可以解决的。我认为,呼吁提高统计检验力、对零结果更加透明、开展更多验证性研究,这些可以拯救我们。而无济于事的是:不肯承认问题的严重性,不愿深挖我们的过去、追问哪些东西需要重新审视。

Time is nigh to reckon with our past. Our future just might depend on it.

是时候和我们的过去做个了结了。我们的未来或许就取决于此。

········

*In case you haven’t heard, Alison started a wonderful Facebook discussion group that I have the privilege of co-moderating. If you’re tired of bickering and incivility, but still want a place to discuss ideas, PsychMAP just might be for you.

*如果你还没听说过,Alison开了一个非常不错的脸书讨论组,我也有幸在其中参与共同主持。如果你厌倦了争吵和恶语相向,但仍想找个地方讨论想法,PsychMAP可能恰好就适合你。

翻译:龟海海(@龟海海)
校对:混乱阈值(@混乱阈值)
编辑:辉格@whigzhou
