Why Journalists Need to Link

Jonathan Stray has a great essay up at Nieman Lab entitled “Why link out? Four journalistic purposes of the noble hyperlink”. I basically agree with all of it; links are wonderful things, and the more of them that we see in news stories — especially if they’re external rather than internal links — the better.

It’s very easy to agree that if a story refers to some other story or document, and if that other story or document is online, then it should be hyperlinked. But Stray goes further than that:

In theory, every statement in news writing needs to be attributed. “According to documents” or “as reported by” may have been as far as print could go, but that’s not good enough when the sources are online.
I can’t see any reason why readers shouldn’t demand, and journalists shouldn’t supply, links to all online resources used in writing a story.

Tellingly, Stray provides no hyperlinks at all for his assertion that “every statement in news writing needs to be attributed”. Is this really true? It certainly isn’t in the UK, where I come from. What’s more, even before the WSJ got taken over by foreign marauders like Rupert Murdoch and Robert Thomson, it followed this rule mostly just by inserting the stock phrase “according to people familiar with the situation” into any story. That phrase, of course, tells the reader exactly nothing.

In recent days, a debate has emerged online on what I consider to be two very different subjects, which are getting unhelpfully elided. The first question, raised by MG Siegler, is whether outlets like the WSJ have an obligation to say who first broke a piece of news, when they report that news. The second question, which is often mistaken for the first, is whether outlets like the WSJ should link to outside sources of information.

To the second question, my answer is simple: yes. But look at the story by Jessica Vascellaro about Apple acquiring Chomp. There’s only one part of that story which obviously needs a hyperlink, if such a thing were available, and that’s in the first sentence, where we’re told that Apple said it has acquired Chomp. If there’s some kind of public press release from Apple saying such a thing, then the WSJ should link to it. But there isn’t, so the lack of any link there is forgivable.

What Siegler wants is for extra text to be added in to Vascellaro’s story, saying that he first broke the news. And I’m pretty sure that Stray would want the same thing — after all, Vascellaro’s own tweet does imply that she first got wind of the story online, before confirming it with Apple. If it was Siegler’s article which caused Vascellaro to call Apple, then Siegler certainly counts as an online resource used in writing the WSJ story, and should therefore, by Stray’s formulation, be fully linked and credited.

On the other hand, if Stray agrees with Siegler, that doesn’t mean that Siegler agrees with Stray. Siegler cited no source at all, named or anonymous, for his scoop that Apple had bought Chomp: he simply asserted the fact. “Apple has bought the app search and discovery platform Chomp, we’ve learned.” If every statement in news writing needs to be attributed, then Siegler just failed that test.

But I don’t think every statement does need attribution. If you attribute a statement like that to “sources familiar with the situation”, or something along those lines, then the attribution looks a lot like a CYA move. Consider the difference between (a) “Apple has bought Chomp”, and (b) “Apple has bought Chomp, say sources familiar with the situation”. Technically speaking, if the sale falls through, then (a) is false, while (b) was actually true. In that sense, failing to provide attribution is a way of sticking your neck out and asserting news to be a fact. Here’s Siegler:

I reported the Apple acquisition of Chomp as a fact for good reason — It. Was. A. Fact. If I had reason to believe it may not be a done deal or not 100% certain, I would have said that. I did not because I didn’t need to.

Not too long ago, I had a conversation with a journalist who was adamantly sticking up for her story in the face of criticism. The story included a statement of the form “X, says Y”, where Y was an anonymous source. Various other people were saying that X was not, in fact, true. But the journalist was standing firm. I then asked her whether she was standing firm on the statement “X, says Y”, which she reported — or whether she was standing firm on the statement that X. And here’s the thing that struck me: it took her a long time to even understand the distinction. A lot of American journalists stick the sourcing in there because they have to — but they very much consider themselves to be reporting news, and if X turned out not to be true, they would never consider their story to be correct, even if it were true that Y had indeed said that X.

Elsewhere, however, those conventions don’t hold. In a lot of political reporting, you have one person saying “X”, and another person saying “not-X”, and it’s left to the reader to decide whether one or the other or neither is telling the truth. And even facts can end up being attributed to people, which is even more confusing. Consider this, for instance, from a recent NYT article by Motoko Rich:

The home ownership rate has been falling from its peak of 69.4 percent in 2004, according to census data. By the fourth quarter of 2011, it was down to 66 percent. That means about two million more households are renting, said Kenneth Rosen, an economist and professor of real estate at the Haas School of Business at the University of California, Berkeley.

This is Rosen’s only appearance in the article, and he’s not being used to give an opinion, or an expert analysis: he’s being used to count rental households. And, at least on the face of things, he’s not particularly good at that. According to the 2010 census summary, there are 116,716,292 occupied housing units in America. So a basic back-of-the-envelope calculation would say that if the owner-occupied share of those units fell from 69.4% to 66%, a drop of 3.4 percentage points, then the increase in rental households would be 3.4% of 116,716,292, which comes to almost exactly 4 million. That’s double Rosen’s number.

Or, we can get more accurate, and go back to the 2005 American Community Survey, which showed 36,771,635 renter-occupied housing units in total. Contrast that with 2010, where there were 40,730,218 renter-occupied housing units. The difference, again, is almost exactly 4 million.

Most accurately of all, you can look directly at the Census Bureau’s quarterly estimates of the US housing inventory. According to that series, the number of renter-occupied housing units in the US was 32,913,000 in the second quarter of 2004; it’s now 38,771,000. The difference there is not 2 million or 4 million but rather 5.9 million. (Over the same period, the number of owner-occupied households has increased by 1.2 million.)
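For anyone who wants to check the arithmetic, here is a minimal sketch in Python of the three estimates above. It uses only the figures quoted in this post; the variable and function names are mine and purely illustrative. The first estimate also assumes, as the back-of-the-envelope version does, that the 2010 count of occupied units can stand in for the whole period.

```python
# Figures as quoted in the post; all names here are illustrative.
OCCUPIED_UNITS_2010 = 116_716_292  # occupied housing units, 2010 census summary
RATE_2004_PEAK = 69.4              # homeownership rate at its 2004 peak, percent
RATE_2011_Q4 = 66.0                # homeownership rate in Q4 2011, percent

RENTERS_ACS_2005 = 36_771_635      # renter-occupied units, 2005 ACS
RENTERS_2010 = 40_730_218          # renter-occupied units, 2010 figure cited above

RENTERS_2004_Q2 = 32_913_000       # quarterly housing inventory series, Q2 2004
RENTERS_LATEST = 38_771_000        # same series, latest figure cited above


def estimate_from_rate_drop(units, rate_before, rate_after):
    """Apply the percentage-point drop in homeownership to a fixed unit count.

    Assumption: the number of occupied units is held constant, which is
    exactly the simplification the back-of-the-envelope estimate makes.
    """
    return units * (rate_before - rate_after) / 100


rate_estimate = estimate_from_rate_drop(
    OCCUPIED_UNITS_2010, RATE_2004_PEAK, RATE_2011_Q4
)
acs_estimate = RENTERS_2010 - RENTERS_ACS_2005
quarterly_estimate = RENTERS_LATEST - RENTERS_2004_Q2

for label, value in [
    ("Rate drop applied to 2010 units", rate_estimate),
    ("2005 ACS vs 2010 count", acs_estimate),
    ("Quarterly inventory series", quarterly_estimate),
]:
    print(f"{label}: {value / 1e6:.2f} million")
```

The three methods give roughly 3.97 million, 3.96 million, and 5.86 million. None of them lands anywhere near 2 million.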

Now Rosen may or may not have good reason to believe that in fact the real increase in renting households is only 2 million rather than 4 million or 6 million. But if he does, that reason is not the drop in the homeownership rate from 69.4% to 66%. Not given the number of households in this country. (The homeownership data is here, by the way; it’s worth noting that Rich didn’t link to it.)

All of which housing wonkery is to say that even basic facts like the increase in US rental households can be non-trivial to pin down, and that both Rich and her readers would probably have been better off if she hadn’t bothered phoning Rosen at all, and had just got her numbers for the increase in rental households directly from the people measuring such things. Citing sources doesn’t help the reader at all, here: if Rich had been forced to assert the increase in rental households, rather than simply attributing the number to Rosen, then she would probably have got something closer to the truth.

The difference between linking and citing is the difference between showing and telling. I’m not a big fan of citing, mainly because it gets in the way: we might learn a lot about where the Haas School of Business is, but at the same time we’ll learn nothing useful about the increase in the number of rental households. On the other hand, if Rich had simply said that “about six million more households are renting”, complete with hyperlink, that would have been shorter, more useful, and more accurate, even if there were no explicit citation.

Similarly, there’s a case to be made that Vascellaro could and should simply have put out a one-line story under the exact same headline (“Apple Acquires App-Search Engine Chomp”), saying “I’ve talked to Apple and they confirm this story is true.” Vascellaro had exactly one new piece of information: Apple’s confirmation of the news. In a world where TechCrunch is only a click away, why write out a lazy rehash of what Siegler had already written, rather than just linking to his story and moving on to breaking and writing something more interesting?

One reason is that the WSJ still has a hugely successful print product, and that therefore WSJ journalists’ pieces need to work in print as well as online. What’s more, as people increasingly read WSJ.com stories offline, on things like the WSJ iPad app, the need for those stories to be reasonably comprehensive remains. Even in the age of the hyperlink. Here’s Stray:

Rewriting is required for print, where copyright prevents direct use of someone else’s words. Online, no such waste is necessary: A link is a magnificently efficient way for a journalist to pass a good story to the audience.

The problem is that a journalist never really knows whether their work is going to be read online or offline, even if they’re writing solely for the web. The story might get downloaded into an RSS reader, to be consumed offline. It might be emailed to someone with a Blackberry who can’t possibly be expected to open a hyperlink in a web browser. It might even get printed out and read that way.

Besides, the simple fact is that even if people can follow links, most of the time they don’t. An art of writing online is to link to everything, but to still make your piece self-contained enough that it makes sense even if your reader clicks on no links at all. Cryptic sentences which make no sense until you click on them are arch and annoying.

What’s more, as Stray says, “online writing needs to be shorter, sharper, and snappier than print”; his link will take you to Michael Kinsley, moaning about how “newspaper stories are written to accommodate readers who have just emerged from a coma or a coal mine”. In that context, does it really behoove reporters to build a long list of sources into all of their stories? Does every news story need to link to the organization which first broke the news? Does every journalist need to hat-tip the friend of theirs who retweeted the nugget which ultimately resulted in their story?

My feeling is that commodity news is a commodity: facts are in the public domain, and don’t belong to anybody. If you’re mentioning a fact which you sourced in a certain place, then it’s a great idea to link to that place. And if you’re matching a story which some other news organization got first, it’s friendly and polite to mention that fact in your piece, while linking to their story. But it’s always your reader who should be top of mind — and the fact is that readers almost never care who got the scoop.

There’s one big exception to that rule, however. Often, a reporter spends a long time getting a big and important scoop, which comes in the form of a long and deeply-reported story. When other news organizations cover that news, they really do have to link to the original story — the place which did it best. Otherwise, they shortchange their readers. A prime example came last August, with Matt Taibbi’s 5,000-word exposé of the SEC’s document-shredding. Anybody covering that story without linking to Taibbi was doing their readers a disservice.

As a result, as with most things online, it’s very dangerous to try to come up with hard-and-fast rules about linking. In general, it’s good to link to as many different people and sources as possible, because the more links you have, the richer your story is. On the other hand, the journalistic web is full of garbage hyperlinks: automated links to irrelevant topic pages, for instance, or links to an organization’s home page when that organization is first mentioned.

As for crediting the news organization which broke some piece of news, that’s more of a journalistic convention than a necessary service to readers. It’s important enough within the journalism world, at least in the US, that it’s probably a good idea to do it when you can. But most of the time it’s pretty inside-baseball stuff. And in the pantheon of journalistic sins, failing to do it is not a particularly big deal. What’s much more important is that your reader get as much information as possible, as efficiently as possible. Which means that if you’re writing about a document or report, you link to that document or report. Failure to do that is a much greater sin than failure to link to some other journalist.

So while sometimes the failure to link is unavoidable, I look forward to a time when journalists face much more criticism for not linking to primary documents than they do for not linking to some other news organization which got the news first.

Felix Salmon is an Audit contributor. He's also the finance blogger for Reuters; this post can also be found at Reuters.com.
