“Decoding Publication Records”: A Digital Humanities Approach to Taiwanese Literary History

This is Number 6 in the “JAS Author Interviews” series at #AsiaNow.

Táňa Dluhošová is a research fellow at the Academy of Sciences of the Czech Republic and author of “Decoding Publication Records: Ruptures and Continuities in 1940s Taiwanese Literary History,” which appears in the May 2020 issue of the Journal of Asian Studies. Here, Dluhošová discusses with Bert Scruggs (University of California, Irvine) the methodologies she drew on to carry out her analysis of publications and literary figures in 1940s Taiwan.

Bert Scruggs: In a nutshell, your essay forced me to reconsider conventional narratives of twentieth-century Taiwanese literary history. Could you more abstractly explain from which disciplinary nodes your research grows, and which edges, or connections, amongst those nodes you plan to explore in the near future?

Táňa Dluhošová: The way I look at literature is strongly influenced by the Czech structuralist school. Inspiration from such a text-centered theory might at first blush appear surprising, because what I originally attempted in my PhD thesis and later carried out a little more systematically in the JAS article was what Franco Moretti calls “distant reading,” an approach seemingly detached from the specifics of the literary text itself. Felix Vodička, a Czech structuralist, proposed a gradual, hierarchical study of literary writings. Starting from the most basic linguistic level, the syllable, he suggested proceeding to words, individual verses, entire poems, series of poems, the collected works of individual authors, groups of authors, and finally entire periods in literary history. I then became attracted by the idea of studying the larger of Vodička’s units and eventually ended up doing distant rather than close reading.

Sociology of literature, as outlined by Pierre Bourdieu in his theory of the literary field, offered me a fresh approach to literary production and a way to understand it from a contemporaneous perspective, not from a point of view that would be only important for us in the 21st century. So, I attempted to find out how literature was produced in Taiwan, and I set out to reconstruct the literary field in as much detail as possible by unearthing contemporaneous norms of literary production. I started by reading periodicals from cover to cover and realized that some authors appear in groups in several periodicals. First, I noted down appearances of this phenomenon with pen and paper, but as the number of periodicals I read grew bigger, I decided to record my findings in an Excel table. After that, the path of data analysis was set out for me, and I looked around for suitable tools to analyze the dataset I had collected. After trying out several possibilities, I realized that Social Network Analysis with its particular means of visualization can be applied. Social networks usually deal with direct (e.g., correspondence, family or friendship networks) or indirect (e.g., shared memberships in clubs or associations) human interactions. Frequencies and distributions of publications in journals could be treated as indirect relationships in my analysis.
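The idea of treating shared publication venues as indirect relationships can be sketched in code. What follows is a minimal illustration only, not the procedure used in the article: the records, author names, and journal labels are hypothetical placeholders, and the helper name `co_publication_edges` is an assumption made for this example. It links two authors whenever they appear in the same periodical and weights the link by the number of periodicals they share.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (author, periodical) records, standing in for rows of a spreadsheet.
records = [
    ("Author A", "Journal X"),
    ("Author B", "Journal X"),
    ("Author A", "Journal Y"),
    ("Author B", "Journal Y"),
    ("Author C", "Journal Y"),
    ("Author A", "Journal X"),  # repeated appearance in the same periodical
]


def co_publication_edges(rows):
    """Return {(author1, author2): number of periodicals both appear in}."""
    authors_by_journal = defaultdict(set)
    for author, journal in rows:
        authors_by_journal[journal].add(author)
    weights = defaultdict(int)
    for authors in authors_by_journal.values():
        # Every pair of authors sharing this periodical gains one unit of weight.
        for pair in combinations(sorted(authors), 2):
            weights[pair] += 1
    return dict(weights)


edges = co_publication_edges(records)
for (a, b), w in sorted(edges.items()):
    print(a, "-", b, "shared periodicals:", w)
```

Counting shared periodicals rather than individual articles is one design choice among several; weighting by publication frequency would be an equally plausible proxy.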

My ultimate goal was to find out who the most important figures in the field were, and what made them important. But one cannot apply the method naively. I have to emphasize that I did not measure something that was simply out there before I started my analysis. In other words, I treated the distributions and frequencies I had recorded as proxies for abstract concepts like “popularity” or “symbolic capital.” Other scholars may find other literary features, quantitative or qualitative, more suitable for reconstructing the literary field.

Relationships between authors, publishers, patrons of literature, the state, and other actors have, of course, been discussed both with regard to other literary traditions as well as, to some extent, in the context of Taiwanese literature. What I think is novel about my approach is that one can actually see some parts or aspects of the literary field—a set of relationships among social actors—when some of these relationships are singled out and tracked numerically. Quantitative analysis helped me to reduce the literary field as an abstract totality to a specific set of relationships. On the basis of a concrete dataset, I could then analyze patterns among these relationships, assuming that, taken together, my proxy values added up to a larger representation of the field that reflected the characteristics I was interested in. In a way, this is a reductionist approach, and one has to be cautious not to jump to far-reaching conclusions on the basis of data analysis alone. After all, the analysis is limited by the method of analysis as well as the scope of the data.

Furthermore, the results of quantitative methods, which can be subsumed under the umbrella term of Digital Humanities, have to be contextualized. There is software available which makes it fairly easy to create interesting images displaying various relationships. If one does so without concrete research questions, however, and without an adequate historical understanding of the period under consideration, the resulting graphs will just be visually appealing but ultimately meaningless pictures with a colorful, cloud-like appearance somewhat resembling jellyfish, and equally difficult to pin down interpretively. It was during some detours I took in my research that I came to truly appreciate the importance of historical context. For example, only because I studied censorship in early post-war Taiwan based on archival material was I able to understand why there was a considerable increase of literary supplements in late 1947 and in 1948. From correspondence between the Provincial Committee of Propaganda and the Ministry of Interior it emerged that newspaper supplements were not obliged to register separately. They could be freely created within an existing newspaper, and so they enjoyed a certain degree of independence.

In the near future I would like to use a historical approach based on archival material to study the roles of certain writers and editors who, according to my findings, remained prominent in multiple historical periods. I would like to find out what helped these figures gain prominence, and what their contribution was in each period.

Bringing history and literature into a conversation can be very fruitful. Literary historians might find it beneficial, as well as genuinely exciting, to spend a little more time in historical archives to understand certain trends on the literary scene. Conversely, literary critics can help historians to get a fuller grasp of certain social phenomena. For example, Taiwanese historians working on the Taiwan Biographical Database under the direction of Professor Chang Su-bing (National Taiwan Normal University, NTNU) recently realized that poetic societies were important venues for many members of political and business elites, a phenomenon literary historians have been well aware of for a while. I believe that if historians cooperated with literary scholars a bit more, they would realize more clearly that membership in poetic associations may not just generate economic, political, or social gains for their members. Not surprisingly, perhaps, such memberships could also express stylistic and broader literary preferences, predilections which are not usually accorded great relevance in historical studies. I tried to emphasize this aspect for the case of early post-war literary journals and supplements in the JAS article.

Historical context is crucial, but of course the findings of quantitative analyses need to be situated within the discourse and problems of literary studies as well. I believe that we can detect the prevalent literary styles and genres of a journal if it reflects on some level the preferences of its editors. Keeping Vodička’s progression from smaller to increasingly larger literary units in mind, we may attempt the same for clusters of journals. If we do this more extensively and systematically, we can approach something like a comprehensive description of styles of literary production in a given period, as far as that is possible. But how can one determine the important writers, journals, and publishers as well as the prevalent styles of writing? I believe that publication data (e.g., frequency of publication, distributions, publishers’ data) and close observation of different actors participating in the production process (like officials involved in censorship, for instance) can give us hints as to where to look. I am planning to use a “big data” approach, expand my scope of research, and describe long-term structural changes underlying the history of post-war Taiwan literature in an English-language monograph.

In addition to clustering actors in the literary field according to their economic, social, and literary preferences, there are some other aspects that require attention, e.g., shared world views. Politics and ideology are never entirely distinct from literary concerns in the period I am working on. Together with a corpus linguist, Dr. Alvin C. H. Chen (NTNU), I developed a new method to explore this phenomenon (our work is currently under review). We constructed a corpus of early postwar articles of a cultural orientation (1 million characters). Based on word frequencies and collocations of distinctive keywords we identified three main networks of ideologically loaded vocabulary in this corpus. We traced these semantic networks back to journals and authors which we treated as different networks. This allowed us to pinpoint core elements and follow their relationships to other words, journals, or individuals. Because this analysis pointed us toward groups of authors who supported certain world views, we applied positional analysis as used in the study of political elites to identify different types of organizations with which the authors were involved. In this way we arrived at partial sociological characterizations of these groups and were thus able to put their ideological standpoints into a broader perspective. In this respect, our research combines elements of corpus linguistics, intellectual history, literary studies, and sociology. Unfortunately, this breadth can also make it difficult to find a place to publish the work.
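To give a rough sense of what counting collocations of a keyword involves, here is a minimal sketch. It is not the method developed with Dr. Chen, which is not described in detail here; it is a plain window-based co-occurrence count. The tokenized articles, the function name `collocates`, and the window size are all hypothetical, and real Chinese-language texts would first need word segmentation.

```python
from collections import Counter

# Hypothetical pre-tokenized articles; real texts would need word segmentation first.
articles = [
    ["taiwan", "literature", "must", "reflect", "taiwan", "society"],
    ["new", "literature", "of", "taiwan", "and", "its", "readers"],
]


def collocates(docs, keyword, window=2):
    """Count tokens occurring within `window` positions of each keyword hit."""
    counts = Counter()
    for tokens in docs:
        for i, token in enumerate(tokens):
            if token != keyword:
                continue
            start = max(0, i - window)
            # Tokens to the left and right of the hit, excluding the hit itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


taiwan_collocates = collocates(articles, "taiwan")
print(taiwan_collocates.most_common(3))
```

Raw counts like these favor frequent function words, so a study of distinctive vocabulary would normally add an association measure and a comparison corpus; both are left out of this sketch.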

As suggested by our approach to the study of semantic networks, I see authors, editors, and journalists as elite members. Elite studies straddle history and sociology, a fortunate circumstance which has given me new methods and research questions to work with. To pursue these questions, I am building a database of Taiwanese elites, the Taiwan Biographical Ontology (TBIO). It currently includes 28,406 individuals but keeps growing. It will hopefully help me to understand how cultural elites related to other social groups. For example, in my recent study (under review) of old and new Taiwanese elite families from the Japanese and early postwar period (altogether about 4,065 individuals), I found out that one of the characteristics of the older families, whose history can be traced back to the Qing dynasty, was their close interaction with the cultural field through such activities as participation in poetic societies, their literary patronage, and the sponsoring of periodicals. This finding then prompted me to think about how these families, through their cultural activities, combined various types of prestige, “portfolios of prestige,” as I call them. These allowed individuals and social groups to occupy important, at times dominant positions in certain fields during different time periods. I am planning to expand this study, which is concerned with the Japanese and early post-war periods, into more recent times, hoping that this will allow me to observe the interaction of economic and political elites with the literary field.

There is a larger motivation behind this reorientation, one which goes beyond an interest in the relationship between literature and forms of influence, or power. The study of elites tackles such issues as social mobility, inequality, the structure and legitimacy of the economic order, political power structures, and multiple forms of social domination––all of these hotly contested issues in contemporary discussions among historians, sociologists, and political scientists. Recently, the historian Walter Scheidel (2017) and the economist Thomas Piketty (2014) both forcefully noted that the accumulation of wealth at the top of society is a source of persistent and potentially accelerating material inequality––a pervasive trend across most known settled societies that, according to Scheidel, is only reversed by cataclysmic events. Gregory Clark’s 2014 study of long-term social mobility across centuries and different regions concludes, in a similar vein, that social status is inherited “as strongly as any biological trait” and that even the arrival of free public education, the reduction of nepotism through the strengthening of institutions, and the invention of private companies did not, in fact, significantly increase social mobility.

The manuscript under submission that I mentioned, “Portfolios of Prestige,” traces the history of Taiwanese elites across the political and societal rupture of 1945. By analyzing portfolios of prestige which encompass political, economic, and cultural capital, the project addresses the question of what kinds of capital, in Bourdieu’s sense, allowed the old and the new Taiwanese elites to obtain leading social positions, and how the old elites maintained their dominance before and after the war. In this respect, my study of elites resembles the problem of contested ruptures and continuities within the literary field as discussed in the JAS article, but the scope has now grown to include broader questions about Taiwanese society as a whole, questions which, as the insights and hypotheses of Scheidel and others suggest, are also of great concern to societies elsewhere, in the face of the economic fallout from the Covid-19 pandemic perhaps even more so than before.

Scruggs: Could you explain more concretely how anonymous authors create noise, and the possibility or impossibility of forming hypotheses on censorship by visualizing anonymous authors?

Dluhošová: Anonymous entries in my case usually represented editorials, entitled, for example, “After Editing” (編後). It is likely that they were written by the editors-in-chief or editors. But there were two reasons why I did not include these anonymous authors. First, I was reluctant to add information which I was not sure of. Sometimes multiple editors were active at a given moment, as in the case of Zhengjing bao, and I could not attribute each entry properly. When one decides to use quantitative methods, one has to ensure that one follows clear rules as to what data is included and what is left out, as decisions based on guesswork can undermine one’s argument. Second, every node (author) has to have a unique name. I couldn’t use one node called “anonymous” because all articles without author attribution from all other periodicals would be linked to that single node. That would result in an unwanted distortion of the network, suggesting, quite wrongly, that the “anonymous” super node was an important author contributing to all journals, and overshadowing all other authors. The alternative—assigning to each anonymous entry a unique ID—would create a host of one-off contributions by single authors, a kind of contribution not considered anyway.
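The distortion caused by a single “anonymous” node can be made concrete with a small sketch. The records below are hypothetical, as is the helper name `journals_per_author`; the example only assumes that an author's number of distinct periodicals is used as a rough indicator of prominence. It compares merging all unattributed entries into one node with leaving them out.

```python
from collections import defaultdict

# Hypothetical records; None marks an entry without author attribution.
records = [
    ("Author A", "Journal X"),
    (None, "Journal X"),
    ("Author B", "Journal Y"),
    (None, "Journal Y"),
    ("Author A", "Journal Z"),
    (None, "Journal Z"),
]


def journals_per_author(rows, anonymous_label=None):
    """Count distinct periodicals per author node.

    If anonymous_label is given, unattributed entries are merged into one
    node with that name; otherwise they are left out.
    """
    journals = defaultdict(set)
    for author, journal in rows:
        if author is None:
            if anonymous_label is None:
                continue
            author = anonymous_label
        journals[author].add(journal)
    return {name: len(js) for name, js in journals.items()}


merged = journals_per_author(records, anonymous_label="anonymous")
excluded = journals_per_author(records)
print(merged)
print(excluded)
```

In the merged version the artificial node is connected to every periodical and outranks all real authors, which is the unwanted effect described above.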

Pseudonyms represent a very peculiar literary problem. We all know that authors were using different pen names to conceal their gender, to avoid harassment or criticism, or to get more texts published in the same journal (we should not forget that authors were paid for their contributions). We have learned about pen name practices in modern Chinese literatures from, for example, studies about Lu Xun’s pseudonyms. In the early post-war Taiwanese context authors also used pen names to avoid KMT censorship. Authors could, however, use a pseudonym to assume a new persona as well. The question is whether, in such cases, we should treat these distinct personas as one author or two. Either possible choice has its pros and cons. For one thing, the author made a deliberate decision not to publish under his/her real name, and we can treat his/her distinct personas as separate cases embodying, for example, diversity of opinions or style. I applied this reasoning with Dr. Alvin Chen in our co-authored article about ideology in the early post-war period (under review).

But identification of pen names is an important sub-discipline in literary history, especially if we want to understand the complexities of an author under scrutiny. If there is a large enough corpus of attributed texts, one can even use corpus linguistic/DH methods to authenticate unattributed texts or parts of texts with stylometry. There are, of course, well-known examples of such research as applied to works by, or suspected to be by, or attributed to Shakespeare or Dickens (e.g. Tomoji Tabata 2019 for Dickens).

As for my study of the Taiwanese literary scene, I identified authors behind pen names whenever possible. I checked biographical dictionaries, online resources, and catalogues for all authors who published in four or more journals (64 names), and did the same for authors with the highest publication frequencies in respected periodicals. While I was quite successful at identifying authors who published anonymously in the supplement Qiao and the journal Taiwan wenhua, I failed in many cases for the supplement Haifeng. That is because this period is very sparsely covered in the history of Taiwanese literature, and both the early post-war period and the early 1950s witnessed the influx of a huge wave of new authors whose identity we simply do not know. Qiao and Taiwan wenhua were platforms where many later acclaimed authors published, so there is more literature about them.

I tried to pair the pen names and authors because the editors most probably knew the identity of the authors. There is a notification in one of the editorials of Zhengjing bao saying that authors should include their actual names and addresses in their submission because of the payment, and that the editorial office would not consider anonymous contributions. A very similar note appeared in the supplement Taiwan funü. So if the editors knew the authors’ identity, this knowledge could have influenced their editorial decisions as to whether to accept or decline a submission.

There are also reasons why I did not exclude unidentified pseudonyms from the dataset. First, even if we do not know an author’s true identity, we can map the position of the pseudonymous name in the network creating the pattern I was interested in. With every newly identified pen name the pattern will change slightly, but I believe that the general structure would largely remain stable. Secondly, the SNA analysis (less the visualization as published in the JAS article) can indicate which “name” was active in more journals or in one particular journal (a proxy for symbolic capital) and is hence a potentially interesting object of further analysis. We can zoom in on these authors, analyze their work in comparison to more famous authors, and thus better understand the range of literary styles in a given period. So in this way, inclusion of pen names can help to find new perspectives for our research.
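The two indicators mentioned here, activity across many journals and high frequency in one particular journal, can be illustrated as follows. This is a sketch under stated assumptions rather than the analysis behind the article: the contributions are hypothetical, `spread_and_frequency` is an invented helper name, and the threshold of three periodicals is arbitrary. Pen names are kept as nodes exactly as printed.

```python
from collections import Counter, defaultdict

# Hypothetical contributions: (name as printed, periodical). Pen names are kept as-is.
contributions = [
    ("Pen Name P", "Journal X"),
    ("Pen Name P", "Journal Y"),
    ("Pen Name P", "Journal Z"),
    ("Author A", "Journal X"),
    ("Author A", "Journal X"),
    ("Author A", "Journal X"),
    ("Author B", "Journal Y"),
]


def spread_and_frequency(rows):
    """Return (distinct periodicals per name, highest count in any one periodical per name)."""
    venues = defaultdict(set)
    per_venue = Counter()
    for name, journal in rows:
        venues[name].add(journal)
        per_venue[(name, journal)] += 1
    spread = {name: len(js) for name, js in venues.items()}
    peak = defaultdict(int)
    for (name, _journal), n in per_venue.items():
        peak[name] = max(peak[name], n)
    return spread, dict(peak)


spread, peak = spread_and_frequency(contributions)
wide = [name for name, n in spread.items() if n >= 3]  # threshold is an arbitrary choice
print("active in 3+ periodicals:", wide)
print("peak frequency in one periodical:", peak)
```

A name that stands out on either measure can be flagged for closer reading even if the person behind it remains unidentified.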

I can’t think of any way in which the presence of pen names could help to form a hypothesis on censorship in the dataset I used for this analysis. One could perhaps assume that a high number of pseudonyms points to stricter censorship. But historical archives can, in my view, give us a better understanding of the contemporaneous situation, and of why and how published texts were censored.

There was, however, one interesting observation. I noticed a surge of supplements with cultural contents after the 2.28 Incident—a period usually seen as a time of persecution of Taiwanese elites and a period of silence in the public sphere. I always found this inconsistency intriguing, but I could not explain it from the data itself. Only the study of historical materials, i.e. archival documents from the Committee of Propaganda and the Provincial Bulletin, where all newly published government decrees were promulgated, provided me with an understanding of the publishing process and the relationships between publishers and authorities engaged in censorship. Only then did I understand that editors had found a way around the Publication Law and the obligation to register the journal which would have made it subject to official control.

Scruggs: You noted that words and collocates tokenized and extracted from a number of journals by yourself and Alvin C. H. Chen revealed groups of authors which supported one or another ideology. Since national and local ideologies were in tremendous flux among mid-century Taiwanese literary coteries, could you share with us some of your findings? And could you specifically relate them to your essay in JAS? Where are the notable edges and nodes among culturally local, Taiwanese writers and culturally national concerns of the Republic of China or simply China writ large? And how do these findings resonate with the conventional historical narrative? Do they, like those in your JAS essay, force a summary reconsideration of the era?

Dluhošová: Both articles adapt quantitative methods to the study of the post-war Taiwanese cultural and intellectual scene. They use different types of data, though. Whilst the JAS article works with information about where and how often authors published, the article about semantic networks analyzes the vocabulary of 1,168 culturally oriented articles from the same set of journals as the JAS article. In a way, these two articles are therefore complementary.

The article about semantic networks links together authors who shared similar, often ideologically loaded, vocabulary. We identified three clusters which can be linked to contemporaneous ideological standpoints: (a) China-oriented Nationalist official discourse discussing the ROC’s international relations, alternative constitutional orders, and issues directly concerned with how to maintain patriotism, nationalism, and the Three People’s Principles both in China and in Taiwan; (b) local official discourse propagating the new Sinicization policies in Taiwan; and (c) oppositional literary discussions on Taiwanese subjectivity.

The first two are usually viewed as a single, official, KMT-backed ideology, but our analysis shows that they can be treated independently as they were introduced by different sets of authors with different agendas. In the study of post-war Taiwan we usually pay a lot of attention to the local Sinicization policies (i.e., local official discourse), but we often forget the bigger ideological framework encompassing the status and governance of the ROC as a whole. All too often, we neglect that bigger framework in our search for Taiwanese specifics. But this framework existed and had a vast impact on the lives of all inhabitants of Taiwan. This analysis thus reminded us of the close links between Taiwanese- and ROC-centered discourses, which we otherwise tend to separate.

We also investigated the association between official key terms and periodicals. The networks of periodicals we constructed represent one of the clusters which also emerged from the research I did for the JAS article. Through their authors, these journals had close connections to the field of power (various governmental agencies) and academia, as is shown in the JAS article and in the semantic network article, in an analysis of social backgrounds of the authors grouped in each cluster.

The third semantic network, whose focal points were the keywords “Taiwan” and “literature,” can be associated with the well-known debate about the character of Taiwanese literature (1948–49). Our article examines to what extent these three clusters were interlinked. We found that the cluster of literary debates has a closer relationship with the above-mentioned clusters of official ideological standpoints, but a rather loose relationship with other clusters, which were dedicated to discussions about concrete literary pieces or which were themselves pieces of literary writing. This observation points to the political character of the discussion which laid the foundation for the definition of Taiwanese literature in the 1980s. Texts in this semantic cluster mainly come from more autonomous periodicals.

The first two official clusters are dominated by Mainlanders—no surprises there. The third cluster, characterized by literary discussions, connects both Mainlanders and Taiwanese because they were engaged in a dialogue. This appears like a positive sign suggesting a fairly autonomous character of the field. Interestingly, the majority of the Taiwanese writers are among those whom I identified in the JAS article as authors who bridged the 1945 divide. The analysis of the semantic networks and the networks of authors derived from them, therefore, confirms the unique status of these Taiwanese and the actual weight of their symbolic capital in the early post-war period.