News

This project—the first in a suite of tools for consumers of news on social media—will build an open database of popular news sources on Facebook, illustrating their reach across platforms and surfacing data about their owners, advertising networks, authors, and affiliations. It will take the form of a user-friendly public website, as well as an API that lets other developers build tools that use this database to illuminate the murky world of partisan news on social media. The project aims to empower the public to share news with their networks more responsibly, and to increase media literacy around online news sources.
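
Since the database is intended to be consumed programmatically, here is a minimal sketch of what a client call to such an API might look like in Python. The endpoint, parameters, and response fields are assumptions for illustration, not a published interface.

```python
# Hypothetical client for the news-source database API described above.
# The base URL, route, and response schema are invented for illustration.
import requests

BASE_URL = "https://example.org/api/v1"  # placeholder host

def lookup_source(domain: str) -> dict:
    """Fetch ownership and reach data for a news domain (assumed schema)."""
    resp = requests.get(f"{BASE_URL}/sources", params={"domain": domain}, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    record = lookup_source("example-news-site.com")
    # Assumed fields: owner, advertising networks, per-platform reach counts.
    print(record.get("owner"), record.get("ad_networks"), record.get("reach"))
```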

Project lead: John Keegan

Launched in 2018, the Digital Forensics Initiative is part of the Tow Center for Digital Journalism at Columbia University. Led by Dr. Jonathan Albright, the initiative will explore the complex interplay among news, data, and their impacts through applied forensics, data reporting, network analysis, and observational and situational study. The program is intended to serve as a resource for scholars, practitioners, and policymakers by providing timely scholarly analysis, evidence-based insights, data resources, and critical commentary through applied media analytics.

The goal of this initiative is to build the understanding needed for improved reporting, research, and policy decisions. This involves in-depth investigation of the mechanisms affecting the reception, retention, and impact of information and topics in the news through a multidisciplinary, mixed-methods, and computational social science approach. While the initiative's immediate focus is the collection and analysis of data related to political messaging and digital campaigning, the broader themes and questions we seek to study include the following (a brief network-analysis sketch follows the list):

- What are the pathways that tend to increase and/or decrease the salience and resultant impact (i.e., effects on behaviors and attitudes) of information online?

- What are the relative shapes of the networked information spheres and relationship structures of actors involved in politics, healthcare, science, business, entertainment, and popular culture?

- What is the relationship between bias and impact in the dissemination of information between different sources and platforms? Where do comparative differences exist, and do they matter?

- Where does the least reliable (and most biased) information tend to circulate the most? What are the key mechanisms and distribution vectors through which this process occurs?

- What are the themes in the amplification tactics involving information related to issue controversies, political debates, social activism, and other topics of high salience?

- What roles do platforms and legacy media play in the spread of unreliable information? What are the roles of professional journalists, institutional media, and independent media?

- Where (and how) do automation and inauthentic participation affect the pathways through which people are exposed to and/or are more likely to encounter certain types of information?

- What roles do content personalization, ideological preferences, geographic- and demographic-based targeting, and preferred delivery channel(s) play in the types of information people encounter?
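
Several of the questions above concern network structure: who amplifies whom, and which accounts bridge otherwise separate clusters. The sketch below illustrates that kind of analysis on an invented share graph; real analyses would ingest platform data at far larger scale.

```python
# Toy network analysis: an edge A -> B means account A amplified
# (shared/retweeted) content from account B. All data are invented.
import networkx as nx

shares = [
    ("user1", "outletA"), ("user2", "outletA"), ("user3", "outletB"),
    ("user1", "outletB"), ("bot1", "outletC"), ("bot2", "outletC"),
]
G = nx.DiGraph(shares)

# In-degree centrality surfaces the most-amplified sources; betweenness
# highlights accounts that bridge otherwise separate parts of the graph.
print(nx.in_degree_centrality(G))
print(nx.betweenness_centrality(G))
```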

Project lead: Jonathan Albright

This project uses a novel sensor technology to tell the story of several weeks in New York City from the perspective of rats.

Project leads: Jason Fields, Marguerite Holloway, Brian House

A partnership between faculty and students in the Departments of History, Statistics, and Computer Science at Columbia University, this project examined official secrecy by applying natural language processing software to archives of declassified documents, asking whether it is possible to predict the contents of redacted text, attribute authorship to anonymous documents, and model the geographic and temporal patterns of diplomatic communications.
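
As an illustration of the authorship-attribution component, here is a minimal sketch using character n-gram features, a common stylometric baseline; the documents and author labels are invented, and the project's actual methods may have differed.

```python
# Stylometric authorship attribution sketch: character n-grams tend to capture
# writing style better than word counts on short documents. Data are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["cable text one ...", "cable text two ...", "cable text three ..."]
authors = ["officer_a", "officer_b", "officer_a"]  # hypothetical labels

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(docs, authors)
print(model.predict(["an unattributed cable ..."]))  # best-guess author
```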

Project leads: Nicholas Diakopoulos, Alexander Howard, Jonathan Stray

A new web dashboard that allows journalists to analyze, visualize, and interact with government contracting data.

Project lead: Alexandre Goncalves

Advances in artificial intelligence (AI) are influencing both the news industry and individual news consumption behaviors. The ability to convert structured data into a captivating story, indistinguishable from human-authored content, has large implications for the genesis and dynamics of audience segmentation. This project argues that audience fragmentation—accelerated by artificial intelligence—will be qualitatively different from that driven by either the multiplication of channels or on-demand personalized news consumption.

The purpose of the project is threefold: (1) segment today’s news audiences based on their current awareness of, understanding of, and attitudes toward the revolutionary changes that artificial intelligence is driving in news production and distribution; (2) examine audiences’ engagement with news and content powered by AI and automated journalism, in light of their current uses and gratifications; and (3) identify the potential and limits of AI-powered news and content in order to provide recommendations for ethically and efficiently incorporating AI technologies and their requirements into the news ecosystem in a manner that best serves journalism and its audiences.

With these goals in mind, the current research examines how news audiences are segmented based on the beliefs they hold, the behaviors they enact, and the constraints they face concerning the changes that artificial intelligence and automated journalism are bringing to news production and distribution.

This project will conduct two rounds of an online survey with adults in the United States. To segment news audiences, the survey data will be analyzed using latent class analysis (LCA), a statistical method for identifying unobserved subgroups within populations based on observed indicators. Unlike typical audience segmentation, which tends to be ad hoc and crude, this social scientific approach seeks to identify predictable groups of respondents who appear similar across a number of variables, and then to develop an understanding of the underlying structure of each group's characteristics.
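
As an illustrative stand-in for LCA, which properly models categorical survey indicators, the sketch below fits finite mixture models with scikit-learn and selects the number of latent segments by BIC; the indicators and data are hypothetical.

```python
# Mixture-model stand-in for latent class analysis: fit models with varying
# numbers of latent segments and keep the one with the lowest BIC.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Invented indicators per respondent: awareness of AI in news (1-5 scale),
# attitude toward automated journalism (1-5), weekly news use (hours).
X = rng.normal(loc=[3.0, 3.0, 5.0], scale=1.0, size=(500, 3))

best = min(
    (GaussianMixture(n_components=k, random_state=0).fit(X) for k in range(2, 7)),
    key=lambda m: m.bic(X),
)
segments = best.predict(X)  # latent segment assignment per respondent
print(best.n_components, np.bincount(segments))
```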

Project lead: Joon Soo Lim

Americans’ trust and confidence in the mass media is at an all-time low. Compounding this decline is the rise of “fake news.” It is becoming increasingly difficult for media consumers to distinguish between truthful and fictional news stories; readers of fake news reports often believe them, while readers of accurate reports question and mistrust them. A free press is a critical component of democracy, yet it seems to be in danger in the current age of media mistrust. To combat this trend, there have been recent efforts in the Natural Language Processing (NLP) community to use machine learning to automatically distinguish between “real” and “fake” news. This work is important and will, ideally, equip media consumers with the tools they need to navigate the murky world of truth in media.

This project aims to study a complementary problem to fake news detection: trusted news detection. Instead of focusing on determining what is true or fictional, this study aims to discover the characteristics of trusted or believed text, regardless of the veracity of the text in question. Trust in media has previously been studied qualitatively; this research is, to our knowledge, the first effort to quantitatively study trust in media on a large scale, using automated crowdsourcing, machine learning, and natural language processing methods. Further, the project proposes to analyze group-specific indicators of trust, to discover whether perceptions of trustworthiness vary across different categories of media consumers.
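
A minimal sketch of that quantitative approach, assuming crowdsourced trust ratings as labels; the texts, labels, and model here are placeholders, not the project's actual pipeline.

```python
# Train a classifier on perceived-trust labels (did raters believe the text?),
# independent of veracity, then inspect which features predict trust.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["report text ...", "another report ...", "a third report ..."]
trusted = [1, 0, 1]  # 1 = raters believed the text, regardless of its truth

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, trusted)

# Features with the largest positive weights most predict perceived trust.
vec = clf.named_steps["tfidfvectorizer"]
lr = clf.named_steps["logisticregression"]
weights = sorted(zip(lr.coef_[0], vec.get_feature_names_out()), reverse=True)
print(weights[:5])
```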

Project leads: Julia Hirschberg, Sarah Ita Levitan

Over the past decade, data has had an immense impact on the world. It has overhauled the technology industry, driven major shifts in the business and medical sectors, and even altered the way government functions. The media industry is no exception. This paper explores how the explosive growth of data has affected reporters’ legal risks and legal rights. More specifically, it examines how data has changed the legal risks of reporters’ newsgathering; in the past decade, for example, journalists have faced an increased risk of prosecution under the Computer Fraud and Abuse Act. The paper will also investigate how the data society has affected leaks and leak reporting. It will examine how the data boom has swelled the Freedom of Information Act (FOIA) process and spurred the government to monetize its own data, making journalists pay for information that should be free under the federal statute. Lastly, it will delve into the entangled relationship between robotics, AI, data, and the newsroom. In essence, this project hopes to understand how journalism is responding to the growing changes in the information society.

Project lead: Victoria Baranetsky

Algorithms are playing a growing role in determining which news stories reach audiences as they increasingly assist or replace humans in the distribution and curation of news. This development raises the question: What kind of information landscapes are algorithmic platforms creating for individuals and communities as they direct millions of people to news via recommendation engines and search interfaces? Indeed, such engines and interfaces are currently among the main sources of traffic to news sites, making them crucial objects for study.

This study will compare thousands of real-world news searches conducted by a large and diverse set of participants across different digital services (e.g., Google, YouTube, and Facebook) in order to gain insight into patterns of news distribution on the most popular algorithmically driven gatekeeping platforms. It will examine whether news search algorithms promote filter bubbles and fragmented audiences, as some scholars and popular media fear—or whether, alternatively, they construct relatively homogeneous and uniform news landscapes online.
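
One simple way to quantify homogeneity versus fragmentation in such data is set overlap: how many of the same sources two participants were shown. A minimal sketch with invented result lists:

```python
# Jaccard overlap of the source domains two participants saw in their results.
def jaccard(a: set, b: set) -> float:
    """Share of sources common to both result sets (0 = disjoint, 1 = identical)."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

participant_1 = {"nytimes.com", "cnn.com", "foxnews.com", "localpaper.com"}
participant_2 = {"nytimes.com", "cnn.com", "washingtonpost.com"}

# Values near 1 suggest a homogeneous news landscape; near 0, fragmentation.
print(round(jaccard(participant_1, participant_2), 2))  # 0.4
```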

The authors' previous study with the Tow Center indicated that, on Google News, people of different political leanings and backgrounds were recommended highly similar news diets regarding the 2016 U.S. presidential election, sourced primarily from a small number of mainstream national outlets. This challenged the assumption that algorithms invariably encourage echo chambers while disrupting power structures within media industries. This follow-up project expands these research questions into a broader set of platforms and topics on the news, paying special attention to the role that local news sources are assigned in these environments.

Project leads: Seth C. Lewis, Efrat Nechushtai, Rodrigo Zamith

In 2016, the Illuminating project was supported by a Tow Fellowship and built an interactive website, targeted to journalists, academics, and the public, that provided real-time analysis of the U.S. presidential candidates’ Twitter and Facebook accounts. The Illuminating website (http://illuminating.ischool.syr.edu) generated news coverage and spawned academic conference presentations and journal articles. The project now aims to expand its work to the 2018 midterm elections. Midterms pose great challenges for newsrooms: cuts to political reporting staff (especially state and local reporters), a complex multi-campaign context, and an ever-expanding communication environment all strain newsrooms’ ability to provide comprehensive coverage of political campaigns and to detect shifting public opinion.

The Illuminating project’s goal is to expand its website and analysis to the gubernatorial, House, and Senate races, to add topic analysis to its existing categories, and to further improve its analysis of the public’s discussion around the campaigns. Continuing its effort to support journalists, Illuminating aims to automate article briefs based on real-time analysis of social media messages, giving journalists quick access to key insights they can report on. The project is also working on a partnership with the Associated Press to observe newsroom practices and share leads for potential stories.
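
As a toy illustration of the automated-brief idea, the sketch below aggregates already-categorized messages into a one-line summary per candidate; the categories, counts, and template are invented.

```python
# Turn a stream of (candidate, message category) labels into short text briefs.
from collections import Counter

labeled = [
    ("Candidate A", "attack"), ("Candidate A", "policy"),
    ("Candidate A", "attack"), ("Candidate B", "call_to_action"),
    ("Candidate B", "policy"), ("Candidate B", "policy"),
]

def brief(candidate: str) -> str:
    counts = Counter(cat for who, cat in labeled if who == candidate)
    top, n = counts.most_common(1)[0]
    return f"{candidate} posted {sum(counts.values())} messages; most were '{top}' ({n})."

for who in ("Candidate A", "Candidate B"):
    print(brief(who))
```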

Project lead: Jenny Stromer-Galley

A systematic analysis of the challenges facing managers of modern news organizations in order to provide publishers and managing editors with specific recommendations regarding recruiting strategies, target skills, and educational backgrounds that will complement existing newsroom workforces.

Project leads: Allie Kosterich, Matthew Weber

This report lays out a framework for thinking about reader revenue strategy in nonprofit news organizations.

Project lead: Elizabeth Hansen

Team member: Emily Goligoski

The last 12 to 18 months have seen an explosion of new businesses built on blockchain technology—the technology underlying Bitcoin, which is rapidly expanding to support new businesses with a wide range of functions in a wide variety of markets. Funded through “ICOs” (Initial Coin Offerings), these businesses form by creating new cryptocurrencies and selling them to a large and diverse group of buyers, who then become stakeholders in the resulting company. The currency is then used to power the activity of the company, increasing its value and distributing the rewards of that increase to the users, who are in effect the shareholders.
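
As a toy numeric illustration of these mechanics (all figures invented): buyers fund the company by purchasing tokens, and any later appreciation accrues to them pro rata.

```python
# Invented ICO arithmetic: tokens sold at launch fund operations, and holders
# capture appreciation in proportion to their stake.
total_supply = 10_000_000           # tokens created at launch
sale_price = 0.50                   # USD per token in the ICO
raised = total_supply * sale_price  # $5,000,000 to fund operations

my_tokens = 20_000                  # one buyer's stake
later_price = 2.00                  # hypothetical market price after growth

stake_share = my_tokens / total_supply         # 0.2% of the network
gain = my_tokens * (later_price - sale_price)  # $30,000 unrealized gain
print(f"Raised ${raised:,.0f}; stake {stake_share:.2%}; gain ${gain:,.0f}")
```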

The new, “decentralized” web has attracted increasing attention as ICOs have pushed into the nine-digit stratosphere in initial raises, creating companies with substantial spending power right out of the gate.

At least three such companies are presently operational or in formation with the idea of developing a media economy on the blockchain that will support journalism with an entirely new, decentralized business model. One of the three—Civil—is a project the author is working on directly; two others—Steemit and DNN (Decentralized News Network)—are pursuing similar goals by different means.

“Journalism on the Blockchain” will explore the possibilities for creating new economic models for journalism on the blockchain. In the process, it will document the challenges and successes of the people trying to execute on this idea right now, and assess where the space is heading and what its major challenges and proven successes have been.

Project lead: Bernat Ivancsics

Over the past few years, a growing number of journalism stakeholders and researchers have argued that newsrooms should make “audience engagement” one of their chief pursuits. The term is increasingly portrayed as a cure-all for the news industry’s ills: audience engagement will increase audience loyalty, build audience trust, and make journalists’ work more relevant. Those who hope to make audience engagement both normative and measurable face enormous barriers to success. They need to convince news industry stakeholders, each with their own interests and opinions, to rally around a novel interpretation of journalistic practice. They also need to settle an internal debate over how audience engagement itself should be defined and evaluated. Because the term currently lacks an agreed-upon meaning—let alone an agreed-upon metric—it has become an object of contestation. The efforts to make audience engagement central to news production therefore present an opportunity to learn how journalism is changing, and who within the field has the power to change it.

This project will investigate these efforts by drawing on interviews with and observations of news publishers, foundations, and audience analytics firms that are currently playing key roles in the ongoing conversation surrounding audience engagement. The pursuit of a shared definition of audience engagement is a question of agency: how much power do its advocates have to change what news production looks like? And how powerful are the structures obstructing their efforts? This project will attempt to answer these questions. In doing so, it will reveal how and why news industry stakeholders are attempting to change what form journalism takes, how it’s produced, and where the audience fits into that process.

Project lead: Jacob L. Nelson
