(Yicai Global) July 7 -- A claim by Harvard University researchers that the coronavirus was present in the Chinese city of Wuhan before last winter has sparked controversy in academic circles, with Chinese scholars penning a paper that questions the methodology used and conclusions presented. Two academic platforms have declined to publish this rebuttal.
The Harvard team’s work, published last month, claimed that an analysis of satellite images of hospital parking lots in Wuhan and of internet searches for the keyword ‘diarrhea’ suggested the contagion was present by the fall of last year in Wuhan, the city where the outbreak is thought to have started. The study was widely reported in global media.
A recent 15-page response co-signed by other scholars from Harvard University, the University of Göttingen, China’s Zhejiang University and Huazhong University of Science and Technology flagged several problems and even misconduct in the original Harvard manuscript, including inappropriate and insufficient data, misuse and misinterpretation of statistical methods, and the “cherry-picking” of internet search terms. This response has been proofread by nine scholars from reputable universities around the world, according to the paper.
Yet this weekend, academic platforms arXiv and medRxiv refused to publish the paper co-signed by the scholars at the five academic institutions.
“Our moderators have determined that your work would benefit from additional review and revision that is outside of the services we provide,” arXiv told them in a document seen by Yicai Global. MedRxiv refused to publish the paper, saying it is commentary.
The decision has attracted the attention of well-known investigative science journalist Leonid Schneider and professional academic fraud fighter Elisabeth Bik. Schneider questioned medRxiv's decision on his Twitter account, saying the platform has previously allowed scholars to re-analyze earlier studies. After the authors followed up, medRxiv reinstated their submission for consideration.
The authors of the Harvard-Göttingen-Zhejiang paper included a mathematician, a statistician and three medical scientists. They described the COVID-19 pandemic as not just a public health crisis, but also one of scientific publishing.
They are calling on all research repositories, including institutional ones such as Harvard’s DASH, to implement basic quality control.
Statistical Issues, Cherry-Picking
The co-signed paper is available only on the authors’ own platforms. In it, they questioned the samples used in the Harvard study, which counted vehicles in the parking lots of six hospitals. These included Hubei Women and Children's Hospital, a medical institution that specializes in gynecology, obstetrics, and pediatrics and has no pneumology department for adults.
The Harvard-Göttingen-Zhejiang paper also argued that the vehicle count data were insufficient and unevenly distributed over time. In the 29 months from January 2018 to May 2020, the Harvard study collected just 140 data points on vehicle counts, meaning that on average each hospital had fewer than one data point per month. Moreover, more than 30 of those data points were collected in the last two months, outside the period of interest, so the data are even sparser in the remaining months.
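The sparsity argument can be checked with simple arithmetic. The sketch below uses only the figures quoted above; the variable and function names are illustrative, and because the paper says "more than 30" data points fell in the last two months, the second figure is an upper bound rather than an exact value.

```python
# Figures as reported in the article.
TOTAL_POINTS = 140   # vehicle-count data points in the Harvard study
MONTHS = 29          # January 2018 through May 2020, inclusive
HOSPITALS = 6
LATE_POINTS = 30     # "more than 30", so treated here as a lower bound
LATE_MONTHS = 2      # the final two months, outside the period of interest


def points_per_hospital_month(points, months, hospitals):
    """Average number of data points per hospital per month."""
    return points / (months * hospitals)


# Average over the whole 29-month window.
overall = points_per_hospital_month(TOTAL_POINTS, MONTHS, HOSPITALS)

# Average over the first 27 months, after setting aside the late data points.
# Since at least 30 points are excluded, this is an upper bound on the true value.
earlier = points_per_hospital_month(
    TOTAL_POINTS - LATE_POINTS, MONTHS - LATE_MONTHS, HOSPITALS
)

print(f"Whole period: {overall:.2f} points per hospital per month")
print(f"First {MONTHS - LATE_MONTHS} months: at most {earlier:.2f} points per hospital per month")
```

Both averages come out below one data point per hospital per month, and the figure for the earlier months is the lower of the two.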
The Harvard study used the so-called LOESS method to fit a smooth curve to the scattered vehicle count data and observed an increase in the resulting curve last August. However, the research team seems to have deliberately tuned the method's smoothing parameters to bring the curve into line with its claims.
“If we change the span parameter, the smoothing curve becomes very different. With alpha=30%, we observe two new peaks, one at the end of 2018, the other in the middle of 2019. They make the peak at the end of 2019 much less significant,” the Harvard-Göttingen-Zhejiang scholars wrote in their paper.
"We think that the authors of the Harvard study are obliged to justify their choice of a 40% span over other values, if other than for their convenience," one of the authors, Ma Zhenjun, an experienced statistical researcher, told Yicai Global.
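The sensitivity to the span parameter that the scholars describe can be illustrated with a minimal sketch. It is not the Harvard team's code and does not use the study's data: the series is synthetic, and the smoother is one common LOESS variant (local linear fits with tricube weights), which is an assumption because the study's settings other than the span are not given here. The helper names `loess`, `local_maxima` and `total_variation` are hypothetical.

```python
import math


def tricube(u):
    """Tricube kernel: weight falls from 1 at u = 0 to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess(xs, ys, span):
    """Smooth ys over xs with local linear fits; span is the fraction of points per fit."""
    n = len(xs)
    k = min(n, max(3, math.ceil(span * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = max(sorted(abs(x - x0) for x in xs)[k - 1], 1e-12)
        ws = [tricube((x - x0) / h) for x in xs]
        sw = sum(ws)
        swx = sum(w * x for w, x in zip(ws, xs))
        swy = sum(w * y for w, y in zip(ws, ys))
        swxx = sum(w * x * x for w, x in zip(ws, xs))
        swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Too few distinct weighted points for a line: use the weighted mean.
            fitted.append(swy / sw)
        else:
            # Weighted least-squares line, evaluated at x0.
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted


def local_maxima(curve):
    """Indices of interior points that are higher than both neighbours."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i - 1] < curve[i] > curve[i + 1]]


def total_variation(curve):
    """Sum of absolute point-to-point changes, a rough measure of wiggliness."""
    return sum(abs(b - a) for a, b in zip(curve, curve[1:]))


# Synthetic series for illustration only; NOT the vehicle counts from the study.
xs = list(range(60))
ys = [100 + 15 * math.sin(x / 3) + 5 * math.cos(1.7 * x) for x in xs]

# The two span values mentioned in the article: 30% and 40%.
for span in (0.3, 0.4):
    curve = loess(xs, ys, span)
    print(f"span={span:.0%}: {len(local_maxima(curve))} peaks, "
          f"total variation={total_variation(curve):.1f}")
```

A smaller span fits each point from fewer neighbours, so the curve follows local bumps more closely, while a larger span flattens them. Which peaks appear in the smoothed curve therefore depends on a parameter the analyst chooses, which is why the scholars ask for the choice to be justified.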
The Harvard-Göttingen-Zhejiang paper also noted that the Harvard study claims to have observed an increase in searches using the keyword ‘diarrhea,’ but the Harvard-Göttingen-Zhejiang scholars failed to reproduce that observation using the corresponding Chinese words. In response to their inquiry, the Harvard authors revealed the real keyword they used. It translates as ‘symptom of diarrhea.’
The choice of keyword is strange, as ‘diarrhea’ is not among the common symptoms of COVID-19, unlike ‘shortness of breath’ or ‘difficulty breathing.’ Even the increase in the search term ‘symptom of diarrhea’ is not Wuhan-specific, but was observed across China, the paper added.
“We think it is clear that the Harvard team was ‘cherry-picking’ the keyword, highlighting the only search term that supports their conclusion. This constitutes serious misconduct,” Ma said.