“The public doesn’t understand,” she told me, speaking about so-called metadata. “It’s much more intrusive than content.” She explained that the government can learn immense amounts of proprietary information by studying “who you call, and who they call. If you can track that, you know exactly what is happening—you don’t need the content.”
For example, she said, in the world of business, a pattern of phone calls from key executives can reveal impending corporate takeovers. Personal phone calls can also reveal sensitive medical information: “You can see a call to a gynecologist, and then a call to an oncologist, and then a call to close family members.” And information from cell-phone towers can reveal the caller’s location. Metadata, she pointed out, can be so revelatory about whom reporters talk to in order to get sensitive stories that it can make more traditional tools in leak investigations, like search warrants and subpoenas, look quaint. “You can see the sources,” she said. When the F.B.I. obtains such records from news agencies, the Attorney General is required to sign off on each invasion of privacy. When the N.S.A. sweeps up millions of records a minute, it’s unclear if any such brakes are applied.
Metadata, Landau noted, can also reveal sensitive political information, showing, for instance, if opposition leaders are meeting, who is involved, where they gather, and for how long. Such data can reveal, too, who is romantically involved with whom, by tracking the locations of cell phones at night.
For the law-enforcement community, particularly the parts focussed on locating terrorists, metadata has led to breakthroughs. Khalid Sheikh Mohammed, the master planner of the September 11, 2001, attacks on New York and Washington, “got picked up by his cell phone,” Landau said. Many other criminal suspects have given themselves away through their metadata trails. In fact, Landau told me, metadata and other new surveillance tools have helped cut the average amount of time it takes the U.S. Marshals to capture a fugitive from forty-two days to two.
A humorous illustration of the power metadata is Kieran Healy’s parody of “Finding Paul Revere“.
I have been asked by my superiors to give a brief demonstration of the surprising effectiveness of even the simplest techniques of the new-fangled Social Networke Analysis in the pursuit of those who would seek to undermine the liberty enjoyed by His Majesty’s subjects. This is in connection with the discussion of the role of “metadata” in certain recent events and the assurances of various respectable parties that the government was merely “sifting through this so-called metadata” and that the “information acquired does not include the content of any communications”. I will show how we can use this “metadata” to find key persons involved in terrorist groups operating within the Colonies at the present time. I shall also endeavour to show how these methods work in what might be called a relational manner.
The analysis in this report is based on information gathered by our field agent Mr David Hackett Fischer and published in an Appendix to his lengthy report to the government. As you may will be aware, Mr Fischer is an expert and respected field Agent with a broad and deep knowledge of the colonies. I, on the other hand, have made my way from Ireland with just a little quantitative training—I placed several hundred rungs below the Senior Wrangler during my time at Cambridge—and I am presently employed as a junior analytical scribe at ye olde National Security Administration. Sorry, I mean the Royal Security Administration. And I should emphasize again that I know nothing of current affairs in the colonies. However, our current Eighteenth Century beta of PRISM has been used to collect and analyze information on more than two hundred and sixty persons (of varying degrees of suspicion) belonging variously to seven different organizations in the Boston area.
Rest assured that we only collected metadata on these people, and no actual conversations were recorded or meetings transcribed. All I know is whether someone was a member of an organization or not. Surely this is but a small encroachment on the freedom of the Crown’s subjects. I have been asked, on the basis of this poor information, to present some names for our field agents in the Colonies to work with. It seems an unlikely task.
Notice again, I beg you, what we did there. We did not start with a “social networke” as you might ordinarily think of it, where individuals are connected to other individuals. We started with a list of memberships in various organizations. But now suddenly we do have a social network of individuals, where a tie in the network is defined by co-membership in an organization. This is a powerful trick.
What a nice picture! The analytical engine has arranged everyone neatly, picking out clusters of individuals and also showing both peripheral individuals and—more intriguingly—people who seem to bridge various groups in ways that might perhaps be relevant to national security. Look at that person right in the middle there. Zoom in if you wish. He seems to bridge several groups in an unusual (though perhaps not unique) way. His name is Paul Revere.
Once again, I remind you that I know nothing of Mr Revere, or his conversations, or his habits or beliefs, his writings (if he has any) or his personal life. All I know is this bit of metadata, based on membership in some organizations. And yet my analytical engine, on the basis of absolutely the most elementary of operations in Social Networke Analysis, seems to have picked him out of our 254 names as being of unusual interest. We do not have to stop here, with just a picture. Now that we have used our simple “Person by Event” table to generate a “Person by Person” matrix, we can do things like calculate centrality scores, or figure out whether there are cliques, or investigate other patterns. For example, we could calculate a betweenness centrality measure for everyone in our matrix, which is roughly the number of “shortest paths” between any two people in our network that pass through the person of interest. It is a way of asking “If I have to get from person a to person z, how likely is it that the quickest way is through person x?”
Perhaps I should not say “terrorists” so rashly. But you can see how tempting it is. Anyway, look—there he is again, this Mr Revere! Very interesting. There are fancier ways to measure importance in a network besides this one. There is something called eigenvector centrality, which my friends in Natural Philosophy tell me is a bit of mathematics unlikely ever to have any practical application in the wider world. You can think of it as a measure of centrality weighted by one’s connection to other central people.
I admit that, in addition to the possibilities for finding something interesting, there may also be the prospect of discovering suggestive but ultimately incorrect or misleading patterns. But I feel this problem would surely be greatly ameliorated by more and better metadata. At the present time, alas, the technology required to automatically collect the required information is beyond our capacity. But I say again, if a mere scribe such as I—one who knows nearly nothing—can use the very simplest of these methods to pick the name of a traitor like Paul Revere from those of two hundred and fifty four other men, using nothing but a list of memberships and a portable calculating engine, then just think what weapons we might wield in the defense of liberty one or two centuries from now.
One NSA analyst’s terrorist could be his boss’s freedom fighter, but we are all walking, breathing traitors to those around us who betray obliviously – indeed we would probably hand over any information if we could profit by it – the data mines around us.