In a digital age where traditional journalism is shrinking, journalists are spread thin. Local news organizations, lacking the readership and revenue to stay afloat, are closing; national publications are cutting jobs. Journalists' workloads keep growing to absorb these changes, but there's only so much one human can do.
In comes artificial intelligence. Alexander Spangher, a graduate student at the University of Southern California’s Information Sciences Institute, set out to reduce this manual burden in a four-pronged project that automates many of the more tedious aspects of journalistic work.
A former data scientist at The New York Times, Spangher saw an opportunity to use his computer science skills to aid reporters drowning in the influx of information that comes at them, all while under pressure to make the right call. In short: what is newsworthy, and can AI weigh in?
“What we’re trying to do is take the really boring, mundane parts of the job and make them easier,” Spangher explained, “so then journalists can chase more stories, find more sources, use more sophisticated resources, promote their work on more platforms, and cover more relevant topics overall.”
Spangher was inspired by the vital role journalism plays in supporting democracy, and by the need to preserve it.
“If it weren’t for journalists telling us about our world, we wouldn’t understand our communities or know how to get involved. We wouldn’t be able to make informed votes,” Spangher stated. “There are so many crucial, important pieces of information that journalists uncover for us. I am continually really struck by the role of these journalists in liberal democracy. For instance, one study found that every $1 spent by a news organization benefited society $1,000. Another found local journalism to be the single biggest factor promoting good government.”
The first aspect of the solution he conceived involves training a model, on newspaper archives, to determine the newsworthiness of a piece of information. The model tries to predict whether a piece of text will appear on the front page or end up buried in the middle or back of the paper.
“This ‘front page vs. not’ model covers such a small slice of the newsworthiness spectrum — there are so many other layers, like the stories that don’t get covered at all. So, we weren’t sure if it would do anything useful. But, we applied it to material that journalists use all the time in their reporting — like the minutes of a Los Angeles City Council meeting — and found that we were surfacing really interesting leads! Like, one about LA City Council enforcing pay equity between women’s and men’s soccer teams, which our algorithm said should have been at the top of the front-page, but hadn’t yet been covered,” Spangher explained. “There are all these important stories that journalists just didn’t have the time to find and cover.”
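The article describes the "front page vs. not" model only at a high level. As a rough illustration of the idea, the hypothetical sketch below (not Spangher's actual code, data, or model) trains a tiny bag-of-words naive Bayes scorer on toy labeled headlines and scores new text by how front-page-like it looks:

```python
from collections import Counter
import math

def tokenize(text):
    return [w.strip(".,;!?").lower() for w in text.split()]

def train(examples):
    # examples: list of (text, label) pairs, label in {"front", "other"}
    counts = {"front": Counter(), "other": Counter()}
    totals = {"front": 0, "other": 0}
    for text, label in examples:
        for tok in tokenize(text):
            counts[label][tok] += 1
            totals[label] += 1
    vocab = set(counts["front"]) | set(counts["other"])
    return counts, totals, vocab

def front_page_score(model, text):
    # Naive Bayes log-odds that the text looks like front-page
    # material, with add-one smoothing. Positive = more newsworthy.
    counts, totals, vocab = model
    v = len(vocab)
    score = 0.0
    for tok in tokenize(text):
        p_front = (counts["front"][tok] + 1) / (totals["front"] + v)
        p_other = (counts["other"][tok] + 1) / (totals["other"] + v)
        score += math.log(p_front / p_other)
    return score

# Toy training data standing in for labeled newspaper archives.
examples = [
    ("City council votes to enforce pay equity", "front"),
    ("Mayor announces emergency housing plan", "front"),
    ("Committee approves routine sidewalk repairs", "other"),
    ("Minutes of the parks subcommittee meeting", "other"),
]
model = train(examples)
lead = front_page_score(model, "Council enforces pay equity for teams")
filler = front_page_score(model, "Routine sidewalk repairs approved")
```

Applied to unlabeled material like council-meeting minutes, a scorer of this shape would rank each agenda item, surfacing the highest-scoring ones as potential leads.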
Spangher’s advisor, USC’s Research Associate Professor of Computer Science Jonathan May, says local news is being hit the hardest by the technological revolution, even though there is a substantial need for it right now.
“A big issue with local journalism is that it’s hard to keep it going even in the best of times,” May noted. Some areas are “news deserts” and just aren’t getting the coverage they used to, but with tools like these, “the limited bandwidth of a local beat reporter can be magnified.”
In regard to the L.A. City Council meeting notes, for example, the goal is "trying to take these public records and have the aspects of the meeting that look like they're worth following up on rise to the top," May added.
The second part of the project focuses on whom to interview once a story idea is in place, by equipping a model to "help journalists find sources that they wouldn't have considered, maybe more diverse ones who can still meet their informational needs," Spangher highlighted.
One of the most time-consuming tasks for a journalist is posting and promoting their story on social media. The goal of the third installment of Spangher's work is to develop a system that can adapt a journalist's article for promotion on platforms such as Facebook, LinkedIn, or Twitter. The AI helper could formulate a different version of the article for each platform, tuned to perform well and reach the broadest possible audience.
“The question at the end is what is going to work on the different platforms based on the platform’s dynamics, how it runs, and what people pay attention to. All three are largely algorithmic,” Spangher explained. “So, it makes sense that an algorithm should be able to meet the requirements of another algorithm.”
The final piece of Spangher's work looks at how stories grow, or don't, after they are published. He collected a large dataset of articles and their successive versions with the intention of "tracing the arc of a story over time." Pinpointing trends and tracking "which stories are going to continue to develop, and which ones will stop developing are important questions," he emphasized. He received an Outstanding Paper award at NAACL 2022 for this work.
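One simple way to make "tracing the arc of a story" concrete, purely as a hypothetical sketch and not Spangher's actual method or dataset, is to compare successive versions of an article with Python's standard `difflib`. A similarity ratio near 1.0 between consecutive versions suggests the story has stopped developing; lower ratios suggest it is still changing:

```python
import difflib

def update_ratios(versions):
    # Similarity between each consecutive pair of versions
    # (1.0 means the two versions are identical).
    return [
        difflib.SequenceMatcher(None, a, b).ratio()
        for a, b in zip(versions, versions[1:])
    ]

# Invented example versions of one article, for illustration only.
versions = [
    "Council meets to discuss pay equity for city teams.",
    "Council meets to discuss pay equity for city teams. "
    "A vote is expected Friday.",
    "Council votes to enforce pay equity for city teams; "
    "advocates celebrate.",
]
ratios = update_ratios(versions)
```

Plotted over time across many articles, ratios like these would show which stories keep accumulating substantial revisions and which settle quickly.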
Despite the remarkable implications of these new tools, Spangher is adamant that AI will never replace the journalist. “The core of journalism is a fundamentally human process,” he said.
However, one of the most challenging aspects of this work, Spangher noted, was figuring out “how to think about tools that we think are going to be helpful without scaring journalists or making them feel like machine learning is taking their jobs.”
The research is still in the early stages, but Spangher’s ultimate goal is for these tools to be available in some form to news organizations and reporters everywhere.
“In maybe five or ten years, I hope to explore the possibility of creating a startup similar to a services platform that really brings together all of these tools, and helps journalists along the whole pipeline of story production,” said the Ph.D. student.
Published on November 28th, 2022
Last updated on November 28th, 2022