Computational Social Science Institute

University of Massachusetts Amherst

Lecture series generously sponsored by Yahoo!

Videos of some CSSI seminars are available here

Spring 2015

Nicholas Reich

Assistant Professor of Biostatistics
Estimating Population Susceptibility in Dynamic Models of Infectious Disease
Friday, April 17, 2015 • 12:00pm-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Epidemics of communicable diseases place a huge burden on public health infrastructures across the world. Advanced warnings of increases in disease incidence can, in many cases, help public health authorities allocate resources more effectively and mitigate the impact of epidemics. A central challenge in modeling infectious diseases is accurately estimating the (largely unobservable) population of individuals who are susceptible to infection. In this talk, I will review methods used to infer the susceptible fraction from aggregate time-series data and incorporate these estimates into models of disease transmission. Additionally, I will propose a new approach to accounting for the susceptible population over time based on observed case data. This method provides a simple way to include complex dynamics in otherwise standard statistical time-series models. Using over four decades of surveillance data on dengue fever infections from the Ministry of Public Health in Thailand, I will illustrate the ability of these methods to draw inference about mechanistic disease transmission models and predict the future spread of disease.

Bio:  Dr. Nicholas Reich is an Assistant Professor of Biostatistics at UMass-Amherst. His research focuses on developing statistical models for analyzing infectious disease time-series data. His recent research, in collaboration with the Ministry of Public Health in Thailand, has yielded important insights into the complex global dynamics of dengue fever and was featured in the New York Times.

Thursday, April 9, 2015 • 9:00 a.m.-4:30 p.m.
Life Sciences Laboratories, 6th Floor, University of Massachusetts Amherst

This event will bring together leaders in academia, industry, and government. Space is limited and registration is required. For additional information or to discuss any special needs, please contact the Office of External Relations and University Events at 413.577.1101 or

For additional details and to register please visit

Jesse Raffa

Research Associate, University of Washington
Hidden State Inference and Prediction under a Mixed Effects Hidden Markov Model for Multivariate Longitudinal Data
Wednesday, April 8, 2015 • 10:00 a.m.
Arnold House, Room 113

Abstract:  Mixed effects hidden Markov models (MHMMs) have been applied to both univariate and multivariate longitudinal data, with particular applicability in studies of addiction. In such studies, the longitudinal data are collected to capture some feature of the underlying disease process (usually abstinence from substance use), which is often not directly measurable. MHMMs are able to model the heterogeneity in the longitudinal trajectories which may be due to dynamic changes in the underlying disease states via the hidden Markov process, and also any between-subject differences through random effects. We apply this approach to bivariate longitudinal data generated from a smoking cessation clinical trial, where smoking status was monitored longitudinally during the study through patient self-report and physiological monitoring (carbon monoxide). The focus of this talk will be on inference drawn from the hidden states, with particular attention paid to their possible interpretation, when compared to the study’s outcomes, and differences between the results of univariate and multivariate models. A carefully motivated simulation study shows that, even when the observed data is contaminated by an unmodeled state-specific process, our multivariate approach performs better in predicting the underlying true hidden states when compared to the individual univariate models. These results suggest that using a multivariate response MHMM may have particular relevance in studies of addiction where observed data may be contaminated.

Friday, March 27, 2015
Computer Science graduate student recruitment event
Babur De los Santos

Assistant Professor of Business Economics and Public Policy, Kelley School of Business, Indiana University
E-Book Pricing and Vertical Restraints
Monday, March 9, 2015 • 10:00-11:15 a.m.
Stockbridge Hall Room 303

Abstract:  This paper empirically analyzes how the use of vertical price restraints has impacted retail prices in the market for e-books. In 2010 five of the six largest publishers simultaneously adopted the agency model of book sales, allowing them to directly set retail prices. This led the Department of Justice to file suit against the publishers in 2012, the settlement of which prevents the publishers from interfering with retailers’ ability to set e-book prices. Using a unique dataset of daily e-book prices for a large sample of books across major online retailers, we exploit cross-publisher variation in the timing of the return to the wholesale model to estimate its effect on retail prices. We find that e-book prices for titles that were previously sold using the agency model decreased by 18 percent at Amazon and 8 percent at Barnes & Noble. Our results are robust to different specifications, placebo tests, and synthetic control groups. Our findings illustrate a case where upstream firms prefer to set higher retail prices than retailers and help to clarify conflicting theoretical predictions on agency versus wholesale models.

Maryann Feldman

Seminar cancelled due to snow preventing travel. To be rescheduled.

Heninger Distinguished Professor in the Department of Public Policy at the University of North Carolina
Intellectual Merits and Broader Impacts in Social, Behavioral, and Economic Sciences
Friday, March 6, 2015 • 12:00 p.m., Lunch will be provided
Massachusetts Room at Mullins Center

Abstract:  In an event co-sponsored with the Institute for Social Science Research, Department of Landscape Architecture and Regional Planning, and Office of Research Development, Maryann Feldman, NSF Program Officer for the Science of Science and Innovation Policy (SciSIP) program, will present information about NSF funding opportunities in the Social, Behavioral, and Economic (SBE) sciences.

Bio:  Maryann Feldman is the Heninger Distinguished Professor in the Department of Public Policy at the University of North Carolina and winner of the 2013 Global Award for Entrepreneurship Research, presented by the Swedish Entrepreneurship Forum and the Research Institute of Industrial Economics. Feldman's research and teaching interests focus on the areas of innovation, the commercialization of academic research and the factors that promote technological change and economic growth.

Friday, February 20, 2015
Friday, February 13, 2015 • 1:00 p.m.-3:00 p.m.
Hadley Room, Campus Center 10th Floor

Description:   Research Computing encompasses many of the most important skills, tools and resources drawn upon in contemporary social science. From PhD students just beginning their training to senior faculty looking to expand upon the toolkit used in their established research agendas, understanding the myriad research computing resources available to social scientists on campus can be critical to successful scholarship. In this event, organized by the Institute for Social Science Research and co-sponsored with the Office of Research Development and Computational Social Science Institute, representatives from organizations on campus will present the many resources and opportunities available to graduate students and faculty through their respective organizations. The event will include a 30-minute lunch, then a brief presentation from each of four panelists, followed by a 30-40 minute Q&A. The objective is to provide social scientists with a comprehensive overview of research computing resources and opportunities available on campus. Panelists represent the Institute for Social Science Research (ISSR), the Office of Information Technology (OIT), and the Massachusetts Green High Performance Computing Cluster (MGHPCC). Resources and opportunities discussed will include software and methods instruction and assistance (ISSR); hardware, data storage and management (OIT); and high performance computing resources (MGHPCC).

CSSI talks hosted by the Department of Resource Economics:

Anett Erdmann
The Role of Captive Consumers in Retailers' Location Choice
Monday, February 9, 2015 • 10:30 a.m.–11:45 a.m. • Stockbridge Hall Room 303
Marc Remer
The Determinants and Consequences of Search Cost Heterogeneity: Evidence from Local Gasoline Markets
Friday, February 6, 2015 • 10:30 a.m.–11:45 a.m. • Stockbridge Hall Room 303
Stefan Weiergräber
Network Effects and Switching Costs in the US Wireless Industry
Wednesday, February 4, 2015 • 10:30 a.m.–11:45 a.m. • Stockbridge Hall Room 303
Colin Hottman
Retail Markups, Misallocation, and Store Variety in the US
Thursday, January 29, 2015 • 10:00 a.m.–11:15 a.m. • Stockbridge Hall Room 303
Forrest Spence
Consumer Experience and the Value of Search in the Online Textbook Market
Monday, January 25, 2015 • 9:45 a.m.–11:00 a.m. • Stockbridge Hall Room 303
Davide Proserpio

Ph.D. candidate, Department of Computer Science, Boston University
Online reputation management: Estimating the impact of management responses on hotel reviews
January 30, 2015 • 12:00pm-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Failure to meet a consumer’s expectations can result in a negative review, which can have a lasting, damaging impact on a firm’s reputation, and its ability to attract customers. To mitigate the reputational harm of negative reviews many firms have adopted the strategy of responding to them. How effective is this reputation management strategy? We empirically answer this question by exploiting a difference in managerial practice across two hotel review platforms, TripAdvisor and Expedia: while hotels regularly respond to their TripAdvisor reviews, they almost never do so on Expedia. We exploit this distinction to identify the causal impact of management responses on reputation using difference-in-differences. We find that responding hotels see an average increase of 0.12 stars in the ratings they receive after they start responding. Moreover, we show that this increase is not due to hotel quality investments. Instead, we find that the increase is consistent with a shift in reviewer selection: consumers with a poor experience become less likely to leave a review when hotels begin responding. Our findings suggest that while management responses are an effective way to manage reputation, they can also obscure the measurement of a firm’s true quality by causing unfavorable reviews to be underreported.
Joint work with Giorgos Zervas (Boston University School of Management)

Bio:  Davide Proserpio is a fourth-year Ph.D. candidate in the Department of Computer Science at Boston University, where he is advised by Professors Sharon Goldberg and John Byers and frequently collaborates with Professor Giorgos Zervas. His current research involves leveraging concepts from computer science, statistics and economics to study complex social systems. Davide received his bachelor's degree in telecommunication engineering from Politecnico di Milano (Milan, Italy) and his master's degree in engineering from Carlos III University (Madrid, Spain).

Fall 2014

Alex Hanna

Ph.D. Candidate, Department of Sociology, University of Wisconsin
Developing a System for the Automated Coding of Protest Event Data
Friday, December 5, 2014 • 12:00pm-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Scholars and policy makers recognize the need for better and timelier data about contentious collective action, both the peaceful protests that are understood as part of democracy and the violent events that are threats to it. News media provide the only consistent source of information available outside government intelligence agencies and are thus the focus of all scholarly efforts to improve collective action data. Human coding of news sources is time-consuming and thus can never be timely and is necessarily limited to a small number of sources, a small time interval, or a limited set of protest "issues" as captured by particular keywords. There have been a number of attempts to address this need through machine coding of electronic versions of news media, but approaches so far remain less than optimal. The goal of this paper is to outline the steps needed to build, test and validate an open-source system for coding protest events from any electronically available news source using advances from natural language processing and machine learning. Such a system should have the effect of increasing the speed and reducing the labor costs associated with identifying and coding collective actions in news sources, thus increasing the timeliness of protest data and reducing biases due to excessive reliance on too few news sources. The system will also be open, available for replication, and extendable by future social movement researchers, and social and computational scientists.

Bio:  Alex Hanna is a PhD candidate in sociology at the University of Wisconsin-Madison. Substantively, they are interested in social movements, media, and the Middle East. Methodologically, they are interested in computational social science, textual analysis, and social network analysis. Alex's work has appeared in both social and computational science venues, including Mobilization, the ANNALS of the American Academy of Political and Social Science, and ICWSM. They also co-founded and contribute regularly to the computational social science blog Bad Hessian, where they write about Python, R, and Twitter.

Mark Pachucki

Affiliated Faculty, Division of General Pediatrics, Massachusetts General Hospital
Instructor of Medicine and Pediatrics, Harvard Medical School
Physical activity and social influence in early adolescent networks: How measurement matters
Tuesday, December 2, 2014 • 11:30AM-1:00PM
Machmer W32

Abstract:  Physical activity (PA) is a modifiable health behavior that has been associated with cardiometabolic disorder, cancer, and distressed mental health. Among older youth it has been consistently shown that friends tend to be similar in their PA levels. However, research examining why this is the case has revealed inconsistencies in theorized mechanisms responsible for this similarity. Youth tend to form friendships based upon existing PA behaviors, but one’s peer group can also influence changes in a youth’s own PA. Given this context, surprisingly little research has explored how social relationships may shape PA during early adolescence considered as a discrete stage of the life course. Relative to childhood or late adolescence, early adolescence has unique properties in terms of social, psychological, biological, and neurological development that interact to shape decisions around health behaviors. Moreover, the majority of PA network research relies on PA and social relationships data obtained from self-report. In this paper, we compare early adolescent peer effects on accelerometer-measured PA using both cognitive and high-quality behavioral affiliation measures. Data were obtained by recruiting an unusually complete cohort of 6th-grade students and observing their behaviors at multiple points during a four-month period. Findings suggest that replacing network self-report with behavioral measures of social interaction yields a great deal more precision with lower respondent burden, but also that each type of network measurement has unique insights to offer.

Bio:  Mark C. Pachucki investigates how interpersonal relationships can shape health behaviors and outcomes across the life course. He is jointly appointed as Senior Scientist at the Mongan Institute for Health Policy at Massachusetts General Hospital and Affiliated Faculty with the Massachusetts General Hospital for Children Division of General Pediatrics, with academic appointments as Instructor in Medicine and Pediatrics at Harvard Medical School. His training is in sociology and social determinants of health, and current projects focus on relationship formation and dissolution and how peer and family relationships - specifically, between children and their peers, between parents and children, and between spouses - can influence the adoption of risky health behaviors. Pachucki has recently published his research in Social Science & Medicine, Annual Review of Sociology, Poetics, Sociologie et sociétés, the International Journal of Obesity, American Journal of Preventive Medicine, and American Journal of Public Health.

Justin H. Gross

Assistant Professor of Political Science, UNC-Chapel Hill
Building Idea-oriented Measures of Ideology in Text
Tuesday, November 25, 2014 • 10:00AM-11:30AM
Campus Center Room 805-809

Abstract:  In political communication, philosophy, and psychology, as well as in everyday life, the word "ideology" carries various different meanings. When working with text written by political leaders, members of opinion media, and others writing explicitly about their views and beliefs, we have a unique opportunity to study ideology directly as publicly articulated political philosophy (however crudely expressed). Using a corpus of over two hundred popular books by contemporary American ideologues, my coauthors and I develop two approaches to measurement that are oriented toward detecting differences in recognizable viewpoints. One is a two-stage approach that first identifies a set of terms that are useful in distinguishing among recognized ideological classes, then represents new documents via a hidden Markov model in order to estimate how often a speaker sounds like someone in each prototypical class. In the second approach, we begin by carefully defining key concepts and then use the corpus to identify words, phrases, and Boolean rules that are sensitive markers of the concepts, allowing us to measure the relative attention writers devote to abstract ideas on which ideologies are built, what we refer to as their "ideational agendas." I discuss and illustrate each method and identify challenges to effective implementation.

Bio:  Justin H. Gross is Assistant Professor of Political Science at UNC-Chapel Hill. His research interests include problems of measurement in networks, text, and survey data. Substantively, he is interested in political communication, particularly the competitive framing of policy issues by opinion media. He is currently working with computer scientists and social scientists on the development of semi-supervised techniques of frame detection for the study of large databases of news stories.

Jesse Rhodes

Associate Professor, Department of Political Science
The Politics of Class-Based Appeals in American Presidential Campaigns
Friday, October 24, 2014 • 12:00pm-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  In politics, discussion of class is inexorably linked to matters relating to the distribution and redistribution of wealth. Consequently, class appeals are a powerful - but politically perilous - form of campaign rhetoric. Because class terms are heavily freighted with meaning and most Americans identify with one economic class or another, explicit appeals to class identities may serve to mobilize targeted groups in elections. However, because class groups that are mis-targeted may punish the speaker, candidates must take care in crafting class appeals.
Drawing on a new dataset of every explicit class reference by Democratic and Republican presidential candidates between 1952 and 2012, I use a variety of quantitative text analytic methods to examine the volume and topical content of candidates' class appeals, and account for variation in content between candidates and over time. The study sheds new light on the factors affecting presidential candidates' decisions to attempt to mobilize voters on the basis of class.

Bio:  Jesse Rhodes is associate professor in the Department of Political Science at the University of Massachusetts, Amherst. He maintains research and teaching interests in the areas of the American presidency, party politics, social policy, and American political development. His work draws on a wide range of quantitative and qualitative methods, including large-scale text collection and analysis, survey research, interviewing, archival research, and historical analysis.

Krista Gile

Assistant Professor, Department of Mathematics and Statistics
Inference and Diagnostics for Respondent-Driven Sampling Data
Friday, October 17, 2014 • 12:00pm-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Respondent-Driven Sampling is a type of link-tracing network sampling used to study hard-to-reach populations. Beginning with a convenience sample, each person sampled is given 2-3 uniquely identified coupons to distribute to other members of the target population, making them eligible for enrolment in the study. This is effective at collecting large diverse samples from many populations.
Unfortunately, sampling is affected by many features of the network and sampling process. In this talk, we present advances in sample diagnostics for these features, as well as advances in inference adjusting for such features.

Bio:  Krista J. Gile is Assistant Professor of Statistics at UMass Amherst. Her research focuses on developing statistical methodology for social and behavioral science research, particularly related to making inference from partially-observed social network structures. Most of her current work is focused on understanding the strengths and limitations of data sampled with link-tracing designs such as snowball sampling, contact tracing, and respondent-driven sampling.

David Jensen

School of Computer Science, University of Massachusetts
Using Graphical Models to Reason About Quasi-Experimental Designs
Friday, October 10, 2014 • 12:00PM-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Effective methods for inferring causal dependence from observational data have been developed within both computer science and quantitative social science. Methods in computer science have focused on the correspondence between causal graphical models and observed patterns of statistical association. Methods in social science have focused on templates for causal inference often called quasi-experimental designs, including designs that use instrumental variables, propensity score matching, regression discontinuity, and interrupted time-series. In this talk, I will describe many of the known experimental and quasi-experimental designs in the language of directed graphical models, and I will show how the graphical model framework allows effective reasoning about threats to validity in these designs. Finally, I will present two novel designs that have resulted from our recent work on causal inference in relational data.

Bio:  David Jensen is Associate Professor of Computer Science and Director of the Knowledge Discovery Laboratory at the University of Massachusetts Amherst. He received his doctorate from Washington University in St. Louis in 1992. From 1991 to 1995, he served as an analyst with the Office of Technology Assessment, an agency of the United States Congress. His research focuses on machine learning and causal inference in relational data sets, with applications to social network analysis, computational social science, fraud detection, and management of large technical systems. He has served on the Executive Committee of the ACM Special Interest Group on Knowledge Discovery and Data Mining and on the program committees of the International Conference on Machine Learning, the International Conference on Knowledge Discovery and Data Mining, and the Uncertainty in AI Conference. He was a member of the 2006-2007 Defense Science Study Group, and served for six years on DARPA's Information Science and Technology (ISAT) Group. He is the incoming Associate Director of the UMass Computational Social Science Institute. He won the 2011 Outstanding Teaching Award from the UMass College of Natural Science.

Rodrigo Zamith

Doctoral candidate, School of Journalism and Mass Communication, University of Minnesota
What Computational Social Science Means for Traditional Modes of Media Analysis
Friday, October 3, 2014 • 12:00PM-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  The abundance of digitized data has become a defining feature of modern life, and particularly of modern communication as it is expressed through digital, social, and mobile platforms. For communication and media research, in particular, the possibilities of understanding communicative practices, social behavior, and the diffusion of information are great. However, this abundance also brings with it a number of challenges for media scholars as they struggle to deal with ever-larger volumes of data and seek out computational solutions. In this talk, I focus on the following question: What does this computational turn mean for traditional modes of content analysis? Specifically, I consider the traditional (manual) approach of conducting a content analysis–a primary method in the study of media messages–in light of the proliferation of computer-centric approaches, assess what is gained and lost in turning to predominantly computational solutions, and discuss an alternative approach that aims to effectively combine traditional and computational modes to facilitate more expansive and powerful–yet still reliable and meaningful–analyses of media content.

Bio:  Rodrigo Zamith is a doctoral candidate in the School of Journalism and Mass Communication at the University of Minnesota. His research focuses on the reconfiguration of journalism and the development of digital research methods. His work has been published in the Journal of Computer-Mediated Communication, the Journal of Broadcasting and Electronic Media, and Digital Journalism.

Peter Dodds

Professor, University of Vermont
Measuring Happiness, Health, and Social Stories, the Big Data Way
Friday, September 19, 2014 • 12:00PM-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  In this talk, I will report on a wide array of findings obtained through our real-time, remote-sensing, non-invasive, text-based "hedonometer," an instrument for measuring positivity in written expression, soon to be housed online. I'll show how we have improved our methods to allow us to explore collective, dynamical patterns of happiness found in massive text corpora including the global social network Twitter, song lyrics, blogs, political speeches, and news sources. From the viewpoint of Twitter, I will report on global levels of temporal, spatial, demographic, and social variations in happiness and information levels, as well as evidence of emotional synchrony and contagion. I will also discuss how natural language appears to contain a striking frequency-independent positive bias, how this phenomenon plays a key role in our instrument's performance, and its connections with collective cooperation and evolution.

Bio:  Peter Sheridan Dodds is a Professor at the University of Vermont (UVM) working on system-level problems in many fields, ranging from sociology to physics. He is Director of UVM's Complex Systems Center, co-Director of UVM's Computational Story Lab, and a visiting faculty fellow at the Vermont Advanced Computing Core. He maintains general research and teaching interests in complex systems and networks with a current focus on sociotechnical and psychological phenomena including collective emotional states, contagion, and stories. His methods encompass large-scale sociotechnical experiments, large-scale data collection and analysis, and the formulation, analysis, and simulation of theoretical models. Dodds's training is in theoretical physics, mathematics, and electrical engineering with formal postdoctoral and research experience in the social sciences. Dodds is currently funded by an NSF CAREER grant awarded by the Social and Economic Sciences Directorate.

Faculty Convocation
Friday, September 12, 2014 • 11:00AM
Bowker Auditorium, Stockbridge Hall

During the ceremony, Provost Katherine Newman will present the keynote address and eight nationally acclaimed faculty members will be presented with the Award for Outstanding Accomplishments in Research and Creative Activity.

Michael Ash, Economics; Center for Public Policy and Administration
W. Bruce Croft, Computer Science
David R. Evans, Educational Policy, Research, and Administration
Lyn Frazier, Linguistics
Panayotis Kevrekidis, Mathematics and Statistics
Barbara Krauthamer, History
Young Min Moon, Art, Architecture and Art History
Shelly Peyton, Chemical Engineering

For more information, contact or phone 413-577-1101.

Spring 2014

What Can Twitter Tell Us About the "Real World"?
Monday, February 24, 2014 • 12:00PM-2:00PM • Lunch provided
Computer Science Building, Room 150/151

Abstract:   Due to Twitter's global popularity and the relative ease with which large amounts of tweets can be collected and analyzed, more and more researchers turn to Twitter as a data source for studies in Computational Social Science. But at the same time it is obvious that Twitter users are not representative of the overall population. So the question arises what Twitter can really tell us about the "Real World" beyond teens' obsession with Justin Bieber. In this talk, I will give an overview of some past and present research done at the Qatar Computing Research Institute (QCRI) which tries to find links between the online world and the offline world.

The first line of work looks at political tension in Egypt. Is it possible to quantify tension in a polarized society and maybe even predict outbreaks of violence? Based on our methodology we find evidence that monitoring the extreme poles can give indications about periods of violence.

Migration is one of the major driving forces behind demographic changes around the world. In this second line of work we turn to online data and digital methods to see if we can quantify certain aspects of migration for a large number of countries and faster than typical reporting latencies of often more than a year.

A popular saying is that you are what you eat. We study if you also tweet what you eat and if it is possible to study food consumption using Twitter. Here, we are particularly interested in questions related to obesity and if there are "network effects", but also in questions related to demographic influences such as income.

Bio:  Ingmar Weber is a senior scientist in the Social Computing Group at the Qatar Computing Research Institute (QCRI). He enjoys interdisciplinary research that uses "big data" and computer science methods to address research questions coming from other domains. His work focuses on how user-generated online data can be used to answer questions about society at large and the offline world in general. During his academic career he has gradually moved further South with stops at 52.2°N (Cambridge University), 49.2°N (Max-Planck Institute for Computer Science), 46.5°N (EPFL), 41.4°N (Yahoo! Research Barcelona) and 25.3°N (QCRI). Ingmar is co-organizer of the "Politics, Elections and Data" (PLEAD) workshop at CIKM 2012 and 2013, contributor to a WSDM 2013 tutorial on "Data-driven Political Science", co-editor of a Social Science Computing Review special issue on "Quantifying Politics Using Online Data", co-organizer of a CIKM 2013 tutorial on "Twitter and the Real World" and PC Co-Chair of SocInfo 2014. He has published more than 60 peer-reviewed articles and his research has been featured on Financial Times, New Scientist, Foreign Policy, Al Jazeera and other media. He loves chocolate, enjoys participating in the occasional ultra-marathon/triathlon and tweets at @ingmarweber.

The Origins of Common Sense: Modeling Human Intelligence with Probabilistic Programs and Program Induction
Friday, April 18, 2014 • 12:00PM-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Our work seeks to understand the roots of human common-sense thought by looking at the core cognitive capacities and learning mechanisms of young children and infants. We build computational models of these capacities with the twin goals of explaining human thought in more principled, rigorous "reverse engineering" terms, and engineering more human-like AI and machine learning systems. This talk will focus on two ways in which the intelligence of very young children goes beyond existing machine systems: (1) Scene understanding, where we can detect not only objects and their locations, but what is happening, what will happen next, who is doing what to whom and why, in terms of our intuitive theories of physics (forces, masses) and psychology (beliefs, desires, ...); (2) Learning concepts from examples, where just a single example is often sufficient to grasp a new concept and generalize in richer ways than machine learning systems can typically do even with hundreds or thousands of examples. I will show how we are beginning to capture these reasoning and learning abilities in computational terms using techniques based on probabilistic programs and program induction, embedded in a broadly Bayesian framework for inference under uncertainty.

Bio:  Joshua B. Tenenbaum received his Ph.D. in 1999 from MIT in the Department of Brain and Cognitive Sciences, where he is currently Professor of Computational Cognitive Science as well as a principal investigator in the Computer Science and Artificial Intelligence Laboratory (CSAIL). He studies learning, reasoning and perception in humans and machines, with the twin goals of understanding human intelligence in computational terms and bringing computers closer to human capacities. He and his collaborators have pioneered accounts of human cognition based on sophisticated probabilistic models, and have also developed several novel machine learning algorithms inspired by human learning. His papers have received awards at numerous conferences, including the IEEE Computer Vision and Pattern Recognition (CVPR) conference, Neural Information Processing Systems (NIPS), the Annual Meeting of the Cognitive Science Society, Uncertainty in AI (UAI), the International Joint Conference on Artificial Intelligence (IJCAI), and the International Conference on Development and Learning (ICDL). He is the recipient of early career awards from the Society for Mathematical Psychology, the Society of Experimental Psychologists, and the American Psychological Association, along with the Troland Research Award from the National Academy of Sciences. He is a fellow of the Society of Experimental Psychologists and the Cognitive Science Society.

Understanding Online Video Users: A Key to the Future of the Internet
Friday, April 25, 2014 • 12:00PM-2:00PM
Computer Science Building, Room 150/151
Lunch will be provided, beginning at 12:00
Talk begins at 12:30

Abstract:  Online video is the killer application of the Internet. It is predicted that more than 85% of the consumer traffic on the Internet will be video-related by 2016. But can online videos ever be fully monetized? The future economic viability of online videos rests squarely on our ability to understand how viewers interact with video content. For instance: If a video fails to start up quickly, would the viewer abandon it? If a video freezes in the middle, would the viewer watch fewer minutes of it? Where should video ads be inserted to ensure that they are watched to conclusion? Are ads in movies more likely to be watched than ads in short news clips? In this talk, we outline scientific answers to these and other such questions. The largest study of its kind, our work analyzes the video viewing habits of over 65 million unique users who in aggregate watched almost 367 million videos. To go beyond correlation and to establish causality, we develop a novel technique based on quasi-experimental designs (QEDs). While QEDs are well known in the medical and social sciences, our work represents their first use in network performance research and is of independent interest.

Bio:  Prof. Ramesh K. Sitaraman is currently in the School of Computer Science at the University of Massachusetts at Amherst. He is best known for his role in pioneering the first large content delivery networks (CDNs) that currently deliver a significant fraction of the world’s web content, streaming videos, and online applications. As a principal architect, he helped create Akamai's distributed network and is an Akamai Fellow. His research focuses on all aspects of Internet-scale distributed networks, including algorithms, architectures, performance, energy efficiency, user behavior, and economics. He received a B. Tech. in electrical engineering from the Indian Institute of Technology, Madras, and a Ph.D. in computer science from Princeton University.

Fall 2013

The Numbers Race: Academic Excellence and Bibliometric Tools.
Friday, October 25, 2013 • 12:00PM–2:00PM • Lunch provided
Computer Science Building, Room 150/151

Abstract:  In contrast with the image of bibliometrics as a unified and coherent whole, Didier Torny identifies three major components of information infrastructures that support the evaluation of scientific research: algorithms, datasets, and bibliometric tools. Tracing the genesis and success of some of these components from 1960 to the present, Torny regards several configurations that differentially articulate the position of dataset producers, the target of algorithms, and the form of bibliometric tools. This history shows repeated encounters between bibliometrics and webometrics and highlights the fact that any assessment is necessarily based on a definition of relevant peers.

Bio:  Didier Torny heads the RITME research unit of INRA, France’s National Institute of Agronomic Research. As a sociologist of scientific knowledge, Torny’s newest work examines research evaluation, studying the making, use, and critique of evaluation tools and norms. He has published extensively on private and public normative action relating to health risks, such as norms that attempt to impose human or animal health as a legitimate objective for business regulation. His research is based on a comparative approach of cases from the arenas of nutrition, food safety, animal health, and safety of health products as a series of pertinent examples by which one comes to grips with the mechanisms used in collective risk management.

Spring 2013

Inequality and Representation in the U.S. Congress: Insights from a New Population-Level Data Resource [video]
Friday, February 1, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917

Abstract:  With economic inequality at its highest level since the Great Depression, it is critical to understand if our elected officials remain responsive to citizens across the spectrum of wealth, or if they cater primarily to wealthy constituents. While several recent studies have addressed this important subject, existing research suffers from three critical methodological limitations that restrict its ability to draw inferences about the relationship between inequality and representation: (1) relatively coarse measures of income and wealth, (2) small numbers of respondents in each constituency, and (3) modest insight on the mechanisms linking income and wealth with influence. Given these drawbacks, scholars are sharply divided on whether or not rising economic inequality has resulted in more unequal democracy. This study draws on population-level data from a relatively new data source to advance our understanding of the relationship between economic inequality and political representation. Specifically, we use data from Catalist, the pre-eminent political data vendor in the United States today. Catalist maintains an up-to-date file that includes individual-level political, commercial, and demographic data for virtually every American adult. Catalist recently began offering academic access to their database, and this database has already been used by scholars in a variety of applications. In this talk, I will describe how we are using the Catalist database to address the "small N" problems that have plagued the research on inequality and representation and I will present preliminary findings from this research. Thus, the talk will provide both an introduction to a new resource for "big data social science" and a substantive examination of whether elected officials in the U.S. are differentially responsive to the wealthiest individuals. (Based on research being conducted with Jesse Rhodes and Ray La Raja)

Bio:  Brian Schaffner is Associate Professor and Chair of the Department of Political Science and Director of the UMass Poll. He is also a Faculty Associate at the Institute for Quantitative Social Science at Harvard University, an Honorary Instructor at the University of Essex in the United Kingdom, and was formerly Program Director for the Political Science program at the National Science Foundation. Schaffner's research focuses on public opinion, campaigns and elections, and survey research. He is author of the textbook Politics, Parties, and Elections in America and his research has appeared in over 20 articles in the top journals in the discipline.

Grand Challenges and Opportunities in Supply Chain Networks: From Analysis to Design [video]
Friday, February 15, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917

Abstract:  Supply chain networks provide the backbones for our economies since they involve the production, storage, and distribution of products as varied as vaccines and medicines, food, high tech products, automobiles, clothing, and even energy. Many of the supply chains today are global in nature and time-sensitive and present challenging aspects for modeling, analysis, and computations. In this talk, I will discuss different perspectives for supply chain network analytics based on centralized vs. decentralized decision-making behavior, and will highlight paradoxes, along with suitable methodological frameworks. I will also describe applications of our research to empirical electric power supply chains, to mergers and acquisitions, to supply chains in nature, and even to humanitarian logistics and health care applications from blood supply chains to medical nuclear ones. Such timely issues as risk management, demand uncertainty, outsourcing, and disruption management in the context of our recent research on supply chain network design and redesign will also be discussed. Suggestions for new directions and opportunities will conclude this talk.

Bio:  Anna Nagurney is the John F. Smith Memorial Professor in the Department of Finance and Operations Management in the Isenberg School of Management at the University of Massachusetts Amherst. She is also the Founding Director of the Virtual Center for Supernetworks and the Supernetworks Laboratory for Computation and Visualization at UMass Amherst. She is an Affiliated Faculty Member of the Department of Civil and Environmental Engineering and the Department of Mechanical and Industrial Engineering at UMass Amherst. She received her AB, ScB, ScM, and PhD degrees from Brown University in Providence, Rhode Island. She devotes her career to education and research that combines operations research / management science, engineering, and economics. Her focus is the applied and theoretical aspects of network systems, particularly in the areas of transportation and logistics, critical infrastructure, and in economics and finance.

Fear and Loathing on the Social Campaign Trail [video]
Friday, February 22, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917

Abstract:  What were voters afraid of on the eve of the 2012 election? Fear is one of the most freely expressed forms of sentiment in social media. This "Voice of the Voter" presentation looks at social data collected in the final week of October 2012 and speaks to the nature and salience of fear among the electorate. Bridging history, political science, and computational science, Dr. Shulman will present a frightening array of scenarios predicted in the social media updates as the final phase of the campaign transpired.

Bio:  Dr. Stuart Shulman is a political science professor, software inventor, entrepreneur, and garlic growing enthusiast who coaches U11 boys club soccer for FC Massachusetts with a national D-license. He is the VP for Text Analytics at Vision Critical, founder & CEO of Texifter, LLC, Director of QDAP-UMass, and Editor Emeritus of the Journal of Information Technology & Politics. Stu is the proud owner of a Bernese/Shepherd named "Colbert" who goes by "Bert." You can follow his exploits @stuartwshulman or @DiscoverText. Dr. Shulman is the sole inventor of the Coding Analysis Toolkit (CAT), a free, open source, Web-based text analysis software project, as well as the Public Comment Analysis toolkit (PCAT), and a new analytic network known as DiscoverText, which was recently acquired by Vision Critical. The QDAP labs at UMass and the University of Pittsburgh are fee-for-service coding labs that work on projects funded by the National Science Foundation (NSF), the National Institutes of Health (NIH) and Mental Health (NIMH), the Smithsonian, and other U.S. funding agencies. Dr. Shulman has been the Principal Investigator and Project Director on numerous National Science Foundation-funded research projects focusing on electronic rulemaking, human language technologies, manual annotation, digital citizenship, and service-learning efforts in the United States.

From the Right-to-Know to the Right to Clean Air and Water (With much data management in between) [video]
Friday, March 1, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917

Abstract:  The US depends heavily on right-to-know legislation, rather than direct regulation, to protect citizens against industrial toxic pollution. The right-to-know approach means that corporations are under mandate to publicly report their pollution, but after the reports are filed and published, citizens, employees, consumers, shareholders and managers are left to respond as they see fit. For the right-to-know approach to improving corporate environmental performance to have any chance of success, stakeholders must have access to the information, the ability to interpret the information, and the capacity and incentive to respond to the information. The Corporate Toxics Information Project (CTIP) adds value to data collected and processed by the EPA and presents the data in new forms useful to multiple constituencies whose actions affect public exposure to industrial toxic pollution. The Toxic 100 Air Polluters is an example of this data presentation.

Bio:  Michael Ash is Professor of Economics and Public Policy and Chair of the Economics Department. With Professor James Boyce, he co-directs the Corporate Toxics Information Project (CTIP) at the Political Economy Research Institute.

Identifying human inductive biases [video]
Friday, April 5, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917

Abstract:  People are remarkably good at acquiring complex knowledge from limited data, as is required in learning causal relationships, categories, or aspects of language. Successfully solving inductive problems of this kind requires having good "inductive biases" - constraints that guide inductive inference. Viewed abstractly, understanding human learning requires identifying these inductive biases and exploring their origins. I will argue that probabilistic models of cognition provide a framework that can facilitate this project, giving a transparent characterization of the inductive biases of ideal learners. I will outline how probabilistic models are traditionally used to solve this problem, and then present a new approach that uses Markov chain Monte Carlo algorithms as the basis for an experimental method that magnifies the effects of inductive biases. This approach provides some surprising insights into how information changes through cultural transmission (relevant to understanding processes like language evolution) and shows how ideas from computer science and statistics can lead to new empirical paradigms for cognitive science research.

Bio:  Tom Griffiths is an Associate Professor of Psychology and Cognitive Science at the University of California, Berkeley. His research explores mathematical models of higher level cognition, with the goal of understanding the formal principles that underlie our ability to solve the computational problems we face in everyday life. His current focus is on inductive problems, such as probabilistic reasoning, learning causal relationships, acquiring and using language, and inferring the structure of categories. He tries to analyze these aspects of human cognition by comparing human behavior to optimal or "rational" solutions to the underlying computational problems. For inductive problems, this usually means exploring how ideas from artificial intelligence, machine learning, and statistics (particularly Bayesian statistics) connect to human cognition.

Using matched employee-employer data to measure labor mobility and knowledge flows in supply-chain and labor-based industry clusters
Friday, April 19, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 904-08

Abstract:  The “Industry Cluster” framework has been the dominant paradigm guiding state economic and workforce development policy decisions over the past two decades. An industry cluster is a geographic concentration of interconnected businesses and associated institutions. It is widely believed that clustering results in higher productivity and faster rates of innovation, because co-located businesses are able to access deeper pools of skilled workers and/or intermediate goods and service providers. They may also learn from the successes and failures of neighboring businesses. Thus policies designed to strengthen businesses in one industry may have beneficial ‘spillover’ effects on other industries. Underlying the industry cluster approach are methodological concerns over how to identify potentially symbiotic industries. Access to relevant data on potential inter-industry synergies is a major limiting factor. This study uses a rare and confidential database of matched employee-employer records from the State of Maine to explore how workers transfer knowledge and skills across employers and industries. Because workers are a primary vehicle of knowledge exchange, the analysis of labor mobility can help identify businesses with common skill requirements, production methods, or other technological foundations. Beyond the substantive aspect of the research, this presentation will also discuss the use of large and confidential government databases for academic research.

Bio:  Dr. Renski is an Assistant Professor in the Department of Landscape Architecture and Regional Planning at the University of Massachusetts Amherst, the Director of the UMASS Center for Economic Development, and the Resident Methodologist of the UMASS Institute for Social Science Research. His research interests include regional influences on entrepreneurship, changing knowledge and skill requirements in the labor force, industrial cluster analysis, applied analytical methods, and state and local economic development policy. Prior to joining UMASS, Dr. Renski served as a Research Economist with the Maine State Planning Office and as the Deputy Program Manager of Maine’s North Star Alliance initiative.

Computational Social Science Poster Session
Friday, April 26, 2013 • 12:30PM–2PM • Lunch provided
LGRT Room 1634

Abstract:  This seminar will consist of a student poster session, with posters presenting class projects from POLISCI 791P (Political Network Analysis, taught by Bruce Desmarais) and STAT 697NS (Network Statistics, taught by Krista Gile).

Past CSSI Seminars:

Fall 2012

Spring 2012

Fall 2011

Spring 2011