Homeopapes: journalism by machine

Here’s an interesting piece at Wired UK that picks up the “OMG journalism is dying” ball and runs with it in the direction of automated machine-to-machine and machine-to-person news aggregation:

NewsScope is a machine-readable news service designed for financial institutions that make their money from automated, event-driven, trading. Triggered by signals detected by algorithms within vast mountains of real-time data, trading of this kind now accounts for a significant proportion of turnover in the world’s financial centres.

Reuters’ algorithms parse news stories. Then they assign “sentiment scores” to words and phrases. The company argues that its systems are able to do this “faster and more consistently than human operators”.

Millisecond by millisecond, the aim is to calculate “prevailing sentiment” surrounding specific companies, sectors, indices and markets. Untouched by human hand, these measurements of sentiment feed into the pools of raw data that trigger trading strategies.
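
To make the idea of “sentiment scores” a little more concrete, here is a minimal sketch of phrase-based scoring. It is emphatically not how NewsScope works (the article gives no implementation details); the lexicon, its weights, and the function names `score_story` and `prevailing_sentiment` are all invented for illustration, and a real system would presumably be far subtler than counting substrings.

```python
# Hypothetical lexicon: these phrases and weights are made up for illustration
# and are not taken from Reuters or any real sentiment product.
LEXICON = {
    "record profit": 2.0,
    "beats forecasts": 1.5,
    "profit warning": -2.0,
    "lawsuit": -1.0,
}


def score_story(text):
    """Sum the weight of every lexicon phrase that occurs in the story text."""
    lowered = text.lower()
    return sum(weight * lowered.count(phrase) for phrase, weight in LEXICON.items())


def prevailing_sentiment(stories):
    """Average the story scores per company.

    `stories` is a list of (company, text) pairs.
    """
    scores_by_company = {}
    for company, text in stories:
        scores_by_company.setdefault(company, []).append(score_story(text))
    return {
        company: sum(scores) / len(scores)
        for company, scores in scores_by_company.items()
    }


stories = [
    ("Acme", "Acme posts record profit and beats forecasts"),
    ("Acme", "Acme faces lawsuit"),
    ("Globex", "Globex issues profit warning"),
]
print(prevailing_sentiment(stories))
```

Even a toy like this shows where the trust question bites: whoever chooses the phrases and weights is effectively choosing the spin.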


Here and there, interesting possibilities are emerging. Earlier this year, at Northwestern University in the US, a group of computer science and journalism students rigged up a programme called Stats Monkey that uses statistical data to generate news reports on baseball matches.

Stats Monkey relies upon two key metrics: Game Score (which allows a computer to figure out which team members are influencing the action most significantly) and Win Probability (which analyses the state of a game at any particular moment, and calculates which side is likely to win).

Combining the two, Stats Monkey identifies the players who change the course of games, alongside specific turning points in the action. The rest of the process involves on-the-fly assembly of templated “narrative arcs” to describe the action in a format recognisable as a news story.

The resulting news stories read surprisingly well. If we assume that the underlying data is accurate, there’s little to prevent newspapers from using similar techniques to report a wide range of sporting events.
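
For the curious, the template-filling part of that description can be sketched in a few lines. This is a rough guess at the general shape, not Stats Monkey's actual code: the `Play` record, the two templates, the 0.25 “comeback” threshold and the team names are all my own assumptions, and it uses only win probability (the Game Score half is left out entirely).

```python
from dataclasses import dataclass


@dataclass
class Play:
    inning: int
    player: str
    description: str
    home_win_prob: float  # home team's win probability after this play


# Hypothetical "narrative arc" templates, invented for this sketch.
TEMPLATES = {
    "comeback": (
        "{winner} rallied past {loser} {w}-{l}. "
        "The game turned in inning {inning}, when {player} {description}."
    ),
    "in_control": (
        "{winner} beat {loser} {w}-{l}. "
        "The biggest moment came in inning {inning}, when {player} {description}."
    ),
}


def turning_point(plays, start_prob=0.5):
    """Return the play that caused the largest swing in win probability."""
    best, best_swing, previous = None, -1.0, start_prob
    for play in plays:
        swing = abs(play.home_win_prob - previous)
        if swing > best_swing:
            best, best_swing = play, swing
        previous = play.home_win_prob
    return best


def write_story(home, away, home_runs, away_runs, plays):
    """Pick a template from the shape of the game and fill in the details."""
    home_won = home_runs > away_runs
    winner, loser = (home, away) if home_won else (away, home)
    # Win probability from the eventual winner's point of view.
    winner_probs = [
        p.home_win_prob if home_won else 1 - p.home_win_prob for p in plays
    ]
    # Assumed rule: call it a comeback if the winner ever dipped below 25%.
    arc = "comeback" if min(winner_probs) < 0.25 else "in_control"
    key = turning_point(plays)
    return TEMPLATES[arc].format(
        winner=winner,
        loser=loser,
        w=max(home_runs, away_runs),
        l=min(home_runs, away_runs),
        inning=key.inning,
        player=key.player,
        description=key.description,
    )


plays = [
    Play(2, "Lee", "hit a two-run double", 0.20),
    Play(7, "Ortiz", "hit a three-run homer", 0.85),
    Play(9, "Kim", "struck out the side", 0.99),
]
print(write_story("Wildcats", "Spartans", 3, 2, plays))
```

The story is only ever as good as the numbers fed in, which is exactly the “if we assume that the underlying data is accurate” caveat in the quote above.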

The first knee-jerk question here is: “Can (or should) we trust those algorithms to remain uncorrupted? How easy would it be for such a system to create news that wasn’t true, or that spun the truth in a particular direction?”

The instant counterargument would be to ask how much more prone to corruption and error an automated system would be than the existing human-based systems… all trust needs to be earned, after all, and (speaking for myself) I’ve little trust in the worldview of any media outlet when viewed in isolation. I already aggregate my incoming news through a bunch of semi-manual processes and routines; would something that removes the drudgery of that be inherently bad, or does the risk lie in our laziness and subconscious gravitation toward echo chambers of our own ideas? Is there any such thing as objective news (at least about anything that really matters, a category which I feel sports doesn’t really occupy)?

All this talk of truth, trust and objective realities puts me in mind of Philip K Dick – more specifically “If There Were No Benny Cemoli”, with its homeopapes churning out news of a planetary adversary who may or may not actually exist. Can anyone recommend more stories that deal with similar themes?