Research

Published

Gechun Lin and Christopher Lucas (2023). “An Introduction to Neural Networks for the Social Sciences.” Oxford Handbook of Methodological Pluralism in Political Science. Oxford University Press [publisher site]

Gechun Lin. “Using Cross-Encoders to Measure the Similarity of Short Texts in Political Science.” Accepted pending replication at the American Journal of Political Science. [draft] [appendix]

Abstract In many settings, scholars wish to estimate the similarity of political texts. However, the most commonly used methods in political science struggle to identify when two texts convey the same meaning because they rely too heavily on identifying words that appear in both documents. This limitation is especially salient when the underlying documents are short, an increasingly prevalent form of textual data in modern political research. Building on recent advances in computer science, I introduce cross-encoders to political science as a way to obtain precise estimates of semantic similarity in short texts. Scholars can either use off-the-shelf versions or build a customized model. I illustrate this approach in three examples applied to social messages generated in a telephone game, news headlines about US Supreme Court decisions, and Facebook posts from members of Congress. I show that cross-encoders, which utilize pair-level embeddings, offer superior performance across tasks relative to word-based and sentence-level embedding approaches.


Dahjin Kim, Gechun Lin, and Keith Schnakenberg. “Informative Campaigns, Overpromising, and Policy Bargaining.” Accepted at the Journal of Theoretical Politics. [draft]

Abstract What is the relationship between policy positions taken in campaigns and those proposed in bargaining when the final policy outcome depends on other political actors? Why do candidates sometimes advocate policies in their campaigns that are unlikely or impossible to pass given the preferences of other actors in the government? We analyze a model in which candidates make non-binding policy platform announcements and then bargain with a veto player over the final policy if they take office. In the model, a candidate has private information that is related to the policy preferences of a key citizen group and engages in bargaining with a veto player who is responsive to this information. When the citizen group sometimes interprets campaign promises naively, elections are more likely to allow information revelation. Furthermore, in this case, politicians overpromise: the politician’s platform is outside of the range of feasible bargaining outcomes.


Dahjin Kim, Gechun Lin, and William G. Nomikos. “American Twitter users have ideological differences of opinion about the War in Ukraine.” Accepted at Humanities and Social Sciences Communications. [draft] [appendix]

Abstract Though ideological differences have long been a ubiquitous feature of American politics, the rise of online news and social media has exacerbated divisions between groups. While existing research has documented how political preferences manifest online, relatively few studies have considered whether ideological divisions extend to discussions of foreign policy. We examine this question by analyzing nearly 2 million tweets about the war in Ukraine posted by Americans during the opening stages of the Russian invasion. We first categorize each tweet according to the user's ideological leanings, estimated from the network of political accounts they follow. Then, we apply a natural language processing model specifically designed for short texts to classify the tweets into clusters that we hand code into substantive topics. We find that the topic distributions of conservative, moderate, and liberal users are substantively and statistically different. We further find that conservatives are more likely to spread some form of misinformation and that liberals are more likely to express support for Ukraine. Our paper concludes with a discussion of the implications of our findings for the conduct of U.S. foreign policy.


Working Papers

Gechun Lin and Benjamin Noble. “In Control but Incoherent: Institutional Power and Message Discipline in Congress.” [draft]

Abstract Parties build electorally beneficial brands through legislative cohesion. But getting to "yes" often produces divisive intra-party debates, as illustrated by the cliches "Democrats in Disarray" and "Republican Civil War." We argue that institutional power undermines message discipline, the ability of co-partisans to say the same things about the same issues. Congressional majorities may agree on broad goals, but individuals must compromise and promote specific legislation. Presidents place issues on the agenda that co-partisans must defend. Disempowered parties, by contrast, can simply oppose. To test our theory, we develop a novel, text-based measure of message discipline in House and Senate floor speeches (1973-2016) using topic models and contextual embeddings. We find that non-presidential minorities exercise stronger message discipline than presidential majorities, and on-message lawmakers are less effective legislators. However, the House majority's procedural power offsets these disadvantages. Our results deepen our understanding of congressional message politics, with implications for perceived polarization and thermostatic backlash.


Gechun Lin. “Using Generative AI to Extract Emphasis Frames.” Draft available upon request. [poster]

Abstract Framing analysis is at the core of studies in political communication. The current literature relies mainly on topic modeling and dictionary approaches to identify frames from texts. However, users cannot control the topics that unsupervised models such as LDA and STM produce; moreover, the resulting groups of keywords lack the semantic context needed to explore how things are framed. Dictionary approaches also have limitations: existing dictionaries miss novel frames in evolving discourse, and creating new dictionaries is resource-intensive. Instead, I propose a new method that follows three steps (quote, summarize, and name) to extract frames using generative AI. I apply this method to re-examine the framing of smoking ban policy at the issue-definition stage during its diffusion across 49 US states. Compared to traditional topic models, the new method produces more semantically distinctive and coherent high-level frame features for identifying different frames. Empirically, this results in the discovery of more nuanced frames and their coexisting patterns.


Work in Progress

Gechun Lin. “Media Framing of SCOTUS Decisions: Do Split Decisions Facilitate Politicized Coverage?”