Lecture series generously sponsored by Yahoo!
What Can Twitter Tell Us About the "Real World"?
Monday, February 24, 2014 • 12:00PM–2:00PM • Lunch provided
Computer Science Building, Room 150/151
Abstract: Due to Twitter's global popularity and the relative ease with which large amounts of tweets can be collected and analyzed, more and more researchers turn to Twitter as a data source for studies in Computational Social Science. But at the same time it is obvious that Twitter users are not representative of the overall population. So the question arises what Twitter can really tell us about the "Real World" beyond teens' obsession with Justin Bieber. In this talk, I will give an overview of some past and present research done at the Qatar Computing Research Institute (QCRI) which tries to find links between the online world and the offline world.
The first line of work looks at political tension in Egypt. Is it possible to quantify tension in a polarized society and maybe even predict outbreaks of violence? Based on our methodology we find evidence that monitoring the extreme poles can give indications about periods of violence.
Migration is one of the major driving forces behind demographic changes around the world. In this second line of work we turn to online data and digital methods to see if we can quantify certain aspects of migration for a large number of countries and faster than typical reporting latencies of often more than a year.
A popular saying is that you are what you eat. We study if you also tweet what you eat and if it is possible to study food consumption using Twitter. Here, we are particularly interested in questions related to obesity and if there are "network effects", but also in questions related to demographic influences such as income.
Bio: Ingmar Weber is a senior scientist in the Social Computing Group at the Qatar Computing Research Institute (QCRI). He enjoys interdisciplinary research that uses "big data" and computer science methods to address research questions coming from other domains. His work focuses on how user-generated online data can be used to answer questions about society at large and the offline world in general. During his academic career he has gradually moved further south with stops at 52.2°N (Cambridge University), 49.2°N (Max-Planck Institute for Computer Science), 46.5°N (EPFL), 41.4°N (Yahoo! Research Barcelona) and 25.3°N (QCRI). Ingmar is co-organizer of the "Politics, Elections and Data" (PLEAD) workshop at CIKM 2012 and 2013, contributor to a WSDM 2013 tutorial on "Data-driven Political Science", co-editor of a Social Science Computing Review special issue on "Quantifying Politics Using Online Data", co-organizer of a CIKM 2013 tutorial on "Twitter and the Real World" and PC Co-Chair of SocInfo 2014. He has published more than 60 peer-reviewed articles and his research has been featured in the Financial Times, New Scientist, Foreign Policy, Al Jazeera and other media. He loves chocolate, enjoys participating in the occasional ultra-marathon/triathlon and tweets at @ingmarweber.
Didier Torny, Research Director for Risk, Work, Markets, and the State at the French National Research Institute (INRA)
The Numbers Race: Academic Excellence and Bibliometric Tools
Friday, October 25, 2013 • 12:00PM–2:00PM • Lunch provided
Computer Science Building, Room 150/151
Abstract: In contrast with the image of bibliometrics as a unified and coherent whole, Didier Torny identifies three major components of information infrastructures that support the evaluation of scientific research: algorithms, datasets, and bibliometric tools. Tracing the genesis and success of some of these components from 1960 to the present, Torny examines several configurations that differentially articulate the position of dataset producers, the target of algorithms, and the form of bibliometric tools. This history shows repeated encounters between bibliometrics and webometrics and highlights the fact that any assessment is necessarily based on a definition of relevant peers.
Bio: Didier Torny heads the RITME research unit of INRA, France’s National Institute of Agronomic Research. As a sociologist of scientific knowledge, Torny’s newest work examines research evaluation, studying the making, use, and critique of evaluation tools and norms. He has published extensively on private and public normative action relating to health risks, such as norms that attempt to impose human or animal health as a legitimate objective for business regulation. His research is based on a comparative approach of cases from the arenas of nutrition, food safety, animal health, and safety of health products as a series of pertinent examples by which one comes to grips with the mechanisms used in collective risk management.
Inequality and Representation in the U.S. Congress: Insights from a New Population-Level Data Resource [video]
Friday, February 1, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917
Abstract: With economic inequality at its highest level since the Great Depression, it is critical to understand if our elected officials remain responsive to citizens across the spectrum of wealth, or if they cater primarily to wealthy constituents. While several recent studies have addressed this important subject, existing research suffers from three critical methodological limitations that restrict its ability to draw inferences about the relationship between inequality and representation: (1) relatively coarse measures of income and wealth, (2) small numbers of respondents in each constituency, and (3) modest insight on the mechanisms linking income and wealth with influence. Given these drawbacks, scholars are sharply divided on whether or not rising economic inequality has resulted in more unequal democracy. This study draws on population-level data from a relatively new data source to advance our understanding of the relationship between economic inequality and political representation. Specifically, we use data from Catalist, the pre-eminent political data vendor in the United States today. Catalist maintains an up-to-date file that includes individual-level political, commercial, and demographic data for virtually every American adult. Catalist recently began offering academic access to their database, and this database has already been used by scholars in a variety of applications. In this talk, I will describe how we are using the Catalist database to address the "small N" problems that have plagued the research on inequality and representation and I will present preliminary findings from this research. Thus, the talk will provide both an introduction to a new resource for "big data social science" and a substantive examination of whether elected officials in the U.S. are differentially responsive to the wealthiest individuals. (Based on research being conducted with Jesse Rhodes and Ray La Raja.)
Bio: Brian Schaffner is Associate Professor and Chair of the Department of Political Science and Director of the UMass Poll. He is also a Faculty Associate at the Institute for Quantitative Social Science at Harvard University, an Honorary Instructor at the University of Essex in the United Kingdom, and was formerly Program Director for the Political Science program at the National Science Foundation. Schaffner's research focuses on public opinion, campaigns and elections, and survey research. He is author of the textbook Politics, Parties, and Elections in America and his research has appeared in over 20 articles in the top journals in the discipline.
Grand Challenges and Opportunities in Supply Chain Networks: From Analysis to Design [video]
Friday, February 15, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917
Abstract: Supply chain networks provide the backbones for our economies since they involve the production, storage, and distribution of products as varied as vaccines and medicines, food, high tech products, automobiles, clothing, and even energy. Many of the supply chains today are global in nature and time-sensitive and present challenging aspects for modeling, analysis, and computations. In this talk, I will discuss different perspectives for supply chain network analytics based on centralized vs. decentralized decision-making behavior, and will highlight paradoxes, along with suitable methodological frameworks. I will also describe applications of our research to empirical electric power supply chains, to mergers and acquisitions, to supply chains in nature, and even to humanitarian logistics and health care applications from blood supply chains to medical nuclear ones. Such timely issues as risk management, demand uncertainty, outsourcing, and disruption management in the context of our recent research on supply chain network design and redesign will also be discussed. Suggestions for new directions and opportunities will conclude this talk.
Bio: Anna Nagurney is the John F. Smith Memorial Professor in the Department of Finance and Operations Management in the Isenberg School of Management at the University of Massachusetts Amherst. She is also the Founding Director of the Virtual Center for Supernetworks and the Supernetworks Laboratory for Computation and Visualization at UMass Amherst. She is an Affiliated Faculty Member of the Department of Civil and Environmental Engineering and the Department of Mechanical and Industrial Engineering at UMass Amherst. She received her AB, ScB, ScM, and PhD degrees from Brown University in Providence, Rhode Island. She devotes her career to education and research that combines operations research / management science, engineering, and economics. Her focus is the applied and theoretical aspects of network systems, particularly in the areas of transportation and logistics, critical infrastructure, and in economics and finance.
Fear and Loathing on the Social Campaign Trail [video]
Friday, February 22, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917
Abstract: What were voters afraid of on the eve of the 2012 election? Fear is one of the most freely expressed forms of sentiment in social media. This "Voice of the Voter" presentation looks at social data collected in the final week of October 2012 and speaks to the nature and salience of fear among the electorate. Bridging history, political science, and computational science, Dr. Shulman will present a frightening array of scenarios predicted in the social media updates as the final phase of the campaign transpired.
Bio: Dr. Stuart Shulman is a political science professor, software inventor, entrepreneur, and garlic-growing enthusiast who coaches U11 boys club soccer for FC Massachusetts with a national D-license. He is the VP for Text Analytics at Vision Critical, founder & CEO of Texifter, LLC, Director of QDAP-UMass, and Editor Emeritus of the Journal of Information Technology & Politics. Stu is the proud owner of a Bernese/Shepherd named "Colbert" who goes by "Bert." You can follow his exploits @stuartwshulman or @DiscoverText. Dr. Shulman is the sole inventor of the Coding Analysis Toolkit (CAT), a free, open source, Web-based text analysis software project, as well as the Public Comment Analysis Toolkit (PCAT), and a new analytic network known as DiscoverText, which was recently acquired by Vision Critical. The QDAP labs at UMass and the University of Pittsburgh are fee-for-service coding labs that work on projects funded by the National Science Foundation (NSF), the National Institutes of Health (NIH) and Mental Health (NIMH), the Smithsonian, and other U.S. funding agencies. Dr. Shulman has been the Principal Investigator and Project Director on numerous National Science Foundation-funded research projects focusing on electronic rulemaking, human language technologies, manual annotation, digital citizenship, and service-learning efforts in the United States.
From the Right-to-Know to the Right to Clean Air and Water (With much data management in between) [video]
Friday, March 1, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917
Abstract: The US depends heavily on right-to-know legislation, rather than direct regulation, to protect citizens against industrial toxic pollution. The right-to-know approach means that corporations are under mandate to publicly report their pollution, but after the reports are filed and published, citizens, employees, consumers, shareholders and managers are left to respond as they see fit. For the right-to-know approach to improving corporate environmental performance to have any chance of success, stakeholders must have access to the information, the ability to interpret the information, and the capacity and incentive to respond to the information. The Corporate Toxics Information Project (CTIP) adds value to data collected and processed by the EPA and presents the data in new forms useful to multiple constituencies whose actions affect public exposure to industrial toxic pollution. The Toxic 100 Air Polluters (http://toxic100.org) is an example of this data presentation.
Bio: Michael Ash is Professor of Economics and Public Policy and Chair of the Economics Department. With Professor James Boyce, he co-directs the Corporate Toxics Information Project (CTIP; http://www.peri.umass.edu/ctip_research/) at the Political Economy Research Institute.
CSSI Mini-Conference Mixer Session
Friday, March 8, 2013 • 12:30PM–3PM • Lunch provided
Campus Center -- Invite/RSVP-only Please
Identifying human inductive biases [video]
Friday, April 5, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 917
Abstract: People are remarkably good at acquiring complex knowledge from limited data, as is required in learning causal relationships, categories, or aspects of language. Successfully solving inductive problems of this kind requires having good "inductive biases" - constraints that guide inductive inference. Viewed abstractly, understanding human learning requires identifying these inductive biases and exploring their origins. I will argue that probabilistic models of cognition provide a framework that can facilitate this project, giving a transparent characterization of the inductive biases of ideal learners. I will outline how probabilistic models are traditionally used to solve this problem, and then present a new approach that uses Markov chain Monte Carlo algorithms as the basis for an experimental method that magnifies the effects of inductive biases. This approach provides some surprising insights into how information changes through cultural transmission (relevant to understanding processes like language evolution) and shows how ideas from computer science and statistics can lead to new empirical paradigms for cognitive science research.
Bio: Tom Griffiths is an Associate Professor of Psychology and Cognitive Science at the University of California, Berkeley. His research explores mathematical models of higher level cognition, with the goal of understanding the formal principles that underlie our ability to solve the computational problems we face in everyday life. His current focus is on inductive problems, such as probabilistic reasoning, learning causal relationships, acquiring and using language, and inferring the structure of categories. He tries to analyze these aspects of human cognition by comparing human behavior to optimal or "rational" solutions to the underlying computational problems. For inductive problems, this usually means exploring how ideas from artificial intelligence, machine learning, and statistics (particularly Bayesian statistics) connect to human cognition.
Using matched employee-employer data to measure labor mobility and knowledge flows in supply-chain and labor-based industry clusters
Friday, April 19, 2013 • 12:30PM–2PM • Lunch provided
Campus Center, Room 904-08
Abstract: The “Industry Cluster” framework has been the dominant paradigm guiding state economic and workforce development policy decisions over the past two decades. An industry cluster is a geographic concentration of interconnected businesses and associated institutions. It is widely believed that clustering results in higher productivity and faster rates of innovation, because co-located businesses are able to access deeper pools of skilled workers and/or intermediate goods and service providers. They may also learn from the successes and failures of neighboring businesses. Thus policies designed to strengthen businesses in one industry may have beneficial ‘spillover’ effects on other industries. Underlying the industry cluster approach are methodological concerns over how to identify potentially symbiotic industries. Access to relevant data on potential inter-industry synergies is a major limiting factor. This study uses a rare and confidential database of matched employee-employer records from the State of Maine to explore how workers transfer knowledge and skills across employers and industries. Because workers are a primary vehicle of knowledge exchange, the analysis of labor mobility can help identify businesses with common skill requirements, production methods, or other technological foundations. Beyond the substantive aspect of the research, this presentation will also discuss the use of large and confidential government databases for academic research.
Bio: Dr. Renski is an Assistant Professor in the Department of Landscape Architecture and Regional Planning at the University of Massachusetts Amherst, the Director of the UMASS Center for Economic Development, and the Resident Methodologist of the UMASS Institute for Social Science Research. His research interests include regional influences on entrepreneurship, changing knowledge and skill requirements in the labor force, industrial cluster analysis, applied analytical methods, and state and local economic development policy. Prior to joining UMASS, Dr. Renski served as a Research Economist with the Maine State Planning Office and as the Deputy Program Manager of Maine’s North Star Alliance initiative.
Computational Social Science Poster Session
Friday, April 26, 2013 • 12:30PM–2PM • Lunch provided
LGRT Room 1634
Measuring Happiness, Health, and Social Stories, the Big Data Way
TBD
Abstract: In this talk, I will report on a wide array of findings obtained through our real-time, remote-sensing, non-invasive, text-based "hedonometer", an instrument for measuring positivity in written expression, soon to be housed online at hedonometer.org. I'll show how we have improved our methods to allow us to explore collective, dynamical patterns of happiness found in massive text corpora including the global social network Twitter, song lyrics, blogs, political speeches, and news sources. From the viewpoint of Twitter, I will report on global levels of temporal, spatial, demographic, and social variations in happiness and information levels, as well as evidence of emotional synchrony and contagion. I will also discuss how natural language appears to contain a striking frequency-independent positive bias, how this phenomenon plays a key role in our instrument's performance, and its connections with collective cooperation and evolution.
Bio: Peter Sheridan Dodds is an Associate Professor at the University of Vermont (UVM) working on system-level problems in many fields, ranging from sociology to physics. He is Director of UVM's Complex Systems Center, co-Director of UVM's Computational Story Lab, and a visiting faculty fellow at the Vermont Advanced Computing Core. He maintains general research and teaching interests in complex systems and networks with a current focus on sociotechnical and psychological phenomena including collective emotional states, contagion, and stories. His methods encompass large-scale sociotechnical experiments, large-scale data collection and analysis, and the formulation, analysis, and simulation of theoretical models. Dodds's training is in theoretical physics, mathematics, and electrical engineering with formal postdoctoral and research experience in the social sciences. Dodds is currently funded by an NSF CAREER grant awarded by the Social and Economic Sciences Directorate.