Counting and Controlling the Coronavirus

By Leksa Lee, NYU Shanghai

On the morning of Thursday, February 13, the Chinese government raised the official number of confirmed coronavirus cases by a third in one day, to 60,000 (one week later, that number is now more than 74,000). Officially, the move reclassified thousands of “suspected” cases as “confirmed” using updated diagnosis criteria. But for weeks already there had been reports (in Chinese) that some healthcare workers had wanted to mark these cases as “confirmed” but were prevented from doing so by national and municipal reporting standards. The spike seemed to verify suspicions that China had been manipulating the official data to prevent panic and underplay the severity of the virus. Yet the drip-drip of information emerging on the healthcare crisis in Wuhan, the center of the outbreak, suggests that there is more going on in China’s coronavirus data than just information control.

I work at NYU Shanghai and was in the United States when the coronavirus finally began to be widely reported in mid-January. As the virus spread rapidly and our classes were pushed back two weeks, then temporarily moved online, there was a flurry of digital chatter among my colleagues and in the wider China-based international research community. How quickly were infections really spreading? When would the outbreak reach its peak? Did the official numbers of the infected mean anything, or was the truth being suppressed by a nervous government? These are much the same questions millions of Chinese people have been asking.

My colleagues in data science took the numbers seriously, analyzing them for a sign of an inflection point in the steadily growing number of new infections, and projecting date ranges for the peak. I was wary of the models they were running, as it seemed clear the underlying data was wrong. People who likely had the coronavirus were being turned away from overcrowded hospitals and were thus not included in the statistics. When smart people run complicated mathematical models on bad numbers, they add layers of legitimacy to data that is questionable from the start.

Then February 13 came, and the numbers some observers had been crunching with dutiful brilliance were blown apart. The website I’d been refreshing five times a day for weeks to track official infection rates had to redesign all its graphs to accommodate the new reality, one newly discovered to have been real all along.

One researcher in my network put it bluntly: “This is a fascinating example of the politics of Chinese statistics.”

Yet I do not see the sudden spike in cases as evidence only of clear-cut government manipulation of data. Read alongside media reports from Wuhan, the official data says a great deal about how medical infrastructure—or a lack of it—comes to function as a portal for the creation and visibility of information.

The Chinese government is of course not alone in manipulating data, but there is a grim history of data manipulation, and even falsification, in China that resonates in many people’s memories. The most devastating example was the over-reporting of food production during the Great Leap Forward campaign in the late 1950s, which led to over-requisitioning from the countryside for the cities and, along with a slew of natural disasters, resulted in mass famine. A more recent example is the GDP growth rate the government reports every year, which observers routinely warn is unreliable. One study aiming to reveal China’s real growth analyzes satellite images of lights burning at night: less light, less growth. China’s official data has always been framed as something to see beyond and through.

In Wuhan, since January, constantly changing official criteria for diagnosis had taken away doctors’ ability to report any cases as “confirmed” except those that tested positive. For weeks, they were only allowed to test patients who were admitted to the hospital (in Chinese). Test kits themselves were also in short supply. Thus, on a given day, the number of confirmed cases added to the data could be no greater than the number of hospital beds and test kits available, and the number of beds was a fraction of the number of sick people who wanted them. No bed: no test: no data point.

There are stories of families taking to ambulances or carts, or often simply walking from hospital to hospital, searching for one with space for new patients. Hospital beds and test kits became narrow portals for diagnosis, and few patients could push through to become visible as sufferers and data points.

An analysis of underreporting as political information control seems correct, but limited. It is difficult to imagine that anyone in the Chinese government desired a lack of healthcare resources. There was a clear need for more beds, more test kits; the government built two new hospitals in two weeks. But linking diagnosis to resources that were in short supply kept the official number of infections artificially low, making the crisis appear to be under better control than it was.

It is telling that, according to the Wall Street Journal, the change in diagnosis criteria that led to the spike “was motivated in part by a recent increase in hospital capacity,” particularly once the two new speed-built hospitals were completed.

In Wuhan, hospital beds and test kits emerge as critical objects. For patients and families, they are the only ticket to treatment. For those who manage official information in China, they represent a devastating lack of resources, but also a bottleneck for information control that effectively allows the crisis to be visible only within predetermined bounds.