Research

Published

Gechun Lin and Christopher Lucas. 2023. “An Introduction to Neural Networks for the Social Sciences.” Oxford Handbook of Methodological Pluralism in Political Science. Oxford University Press. [publisher site]

Gechun Lin. 2025. “Using Cross-Encoders to Measure the Similarity of Short Texts in Political Science.” American Journal of Political Science. [publisher site] [draft] [appendix]

Abstract

In many settings, scholars wish to estimate the similarity of political texts. However, the most commonly used methods in political science struggle to identify when two texts convey the same meaning because they rely too heavily on identifying words that appear in both documents. This limitation is especially salient when the underlying documents are short, an increasingly prevalent form of textual data in modern political research. Building on recent advances in computer science, I introduce cross-encoders to political science for precise estimates of semantic similarity in short texts. Scholars can either use off-the-shelf versions or build a customized model. I illustrate this approach with three examples: social messages generated in a telephone game, news headlines about US Supreme Court decisions, and Facebook posts from members of Congress. I show that cross-encoders, which utilize pair-level embeddings, offer superior performance across tasks relative to word-based and sentence-level embedding approaches.

Dahjin Kim, Gechun Lin, and Keith Schnakenberg. 2024. “Informative Campaigns, Overpromising, and Policy Bargaining.” Journal of Theoretical Politics. [publisher site] [draft]

Abstract

What is the relationship between policy positions taken in campaigns and those proposed in bargaining when the final policy outcome depends on other political actors? Why do candidates sometimes advocate policies in their campaigns that are unlikely or impossible to pass given the preferences of other actors in the government? We analyze a model in which candidates make non-binding policy platform announcements and then bargain with a veto player over the final policy if they take office. In the model, a candidate has private information that is related to the policy preferences of a key citizen group and engages in bargaining with a veto player who is responsive to this information. When the citizen group sometimes interprets campaign promises naively, elections are more likely to allow information revelation. Furthermore, in this case, politicians overpromise: the politician’s platform is outside of the range of feasible bargaining outcomes.

Dahjin Kim, Gechun Lin, and William G. Nomikos. 2025. “American Twitter users have ideological differences of opinion about the War in Ukraine.” Humanities and Social Sciences Communications. [publisher site] [draft] [appendix]

Abstract

Though ideological differences have long been a ubiquitous feature of American politics, the rise of online news and social media has exacerbated divisions between groups. While existing research has documented how political preferences manifest online, relatively few studies have considered whether ideological divisions extend to discussions of foreign policy. We examine this question by analyzing nearly 2 million tweets about the war in Ukraine posted by Americans during the opening stages of the Russian invasion. We first categorize each tweet according to the user's ideological leanings estimated by the network of political accounts they follow. Then, we apply a natural language processing model specifically designed for short texts to classify the tweets into clusters that we hand code into substantive topics. We find that the topic distributions of conservative, moderate, and liberal users are substantively and statistically different. We further find that conservatives are more likely to spread some form of misinformation and that liberals are more likely to express support for Ukraine. Our paper concludes with a discussion of the implications of our findings for the conduct of U.S. foreign policy.

Working Papers

Gechun Lin and Benjamin Noble. “In Control but Incoherent: Institutional Power and Message Discipline in Congress.” [draft]

Abstract

Parties build electorally beneficial brands by staying “on message.” But when can congressional parties exercise message discipline, who contributes, and how do constituents respond? We build on theories of congressional party discipline to develop a set of competing hypotheses: that institutional power could help or hinder messaging, that Republicans are more effective, and that marginal members contribute less. We test these expectations in House and Senate floor speeches (1973–2016) with a novel, direct, and systematic measure of message discipline using topic models and contextual embeddings. We show that, generally, institutional power weakens message discipline. However, our results are nuanced: House Republicans leverage procedural power to offset this disadvantage, and marginal members are less likely to be on message. Finally, we provide behavioral evidence that message discipline shapes constituency approval of representatives. Our results contribute to the literature on message politics and have implications for legislator orientation and thermostatic backlash.

Gechun Lin. “Using Generative AI to Extract Emphasis Frames.” Draft available upon request. [poster]

Abstract

Framing analysis is at the core of studies in political communication. The current literature relies mainly on topic modeling and dictionary approaches to identify frames from texts. However, users cannot control the topics that unsupervised models such as LDA and STM produce; moreover, the resulting groups of keywords lack the semantic context needed to explore how things are framed. Dictionary approaches also have limitations: existing dictionaries miss novel frames in evolving discourse, and creating new dictionaries is resource-intensive. Instead, I propose a new method that follows three steps (quote, summarize, and name) to extract frames using generative AI. I apply this method to re-examine the framing of smoking ban policy at the issue-definition stage during its diffusion across 49 US states. Compared to traditional topic models, the new method produces more semantically distinctive and coherent high-level frame features for identifying different frames. Empirically, this results in the discovery of more nuanced frames and their coexisting patterns.

Work in Progress

Gechun Lin. “Media Framing of SCOTUS Decisions: Do Split Decisions Facilitate Politicized Coverage?”