The Knight First Amendment Institute at Columbia University sent a letter to Facebook on Monday urging it to change its terms of service (TOS) to allow journalists and researchers to automatically collect data from the platform and create accounts using pseudonyms or for fictional users. The “safe harbor” that the Knight Institute proposes for both techniques — “scraping” data and the use of “test” accounts — includes strong privacy protections for Facebook users while also protecting the use of these tools, which can be especially important for reporters and researchers investigating discriminatory practices among Facebook’s advertisers.
It’s crucial to understand, however, that the problem the Knight Institute’s proposal is trying to tackle extends beyond Facebook: Violating any website’s TOS can break an obscure federal anti-hacking law called the Computer Fraud and Abuse Act (CFAA). Because Facebook’s TOS bars scraping and enforces a strict “real name” policy that prohibits test accounts, the threat of facing a federal civil lawsuit or criminal referral under the CFAA deters journalists and researchers from using these tools on the platform.
Given that Facebook is the world’s largest social network and one of the biggest advertising sellers in history, the Knight Institute’s proposed safe harbor would dramatically improve matters for those pursuing public interest reporting and research on the platform. But, hopefully, it will also spark attention to and spur momentum behind comprehensive CFAA reform.
The history of the CFAA is both a fascinating example of pop culture driving tech policy and a cautionary tale about how well-intentioned mistakes in legislative drafting can have unintended consequences, consequences that now drive the need for steps like the Knight Institute’s proposed safe harbor.
The story of the CFAA begins in Hollywood. In the early 1980s, two screenwriters were working on a script for a techno-thriller where a teenage computer whiz accidentally accesses a Pentagon supercomputer and almost starts a nuclear war.
Worried that the premise was simply too fantastic, that military mainframes would be too secure, they consulted Willis Ware, a pioneer in computer security. He said not to worry. In fact, a technique they wanted to use in the film — something called “war-dialing,” where a computer modem dials numbers randomly until it finds another computer — could absolutely reach a Pentagon computer. Why? Because engineers often worked weekends and wanted to be able to dial in from home.
Reassured, they finished the script, which was made into the 1983 film “WarGames.” The weekend it was released in theaters, President Reagan screened it at Camp David. Taken by the film, he interrupted a meeting with his joint chiefs to ask if the scenario was at all realistic. His advisors looked into it (talking to none other than Willis Ware) and recommended immediate action to secure government computers from hacking. Congress passed the CFAA to do that in 1986 (and the law has been expanded over the years to apply to virtually any computer connected to the internet).
The CFAA’s original sin was the failure to appropriately define what it means to access a computer “without authorization.” Most of us think of hacking as breaking through some technical security measure, either through trickery (like phishing) or by using a malicious computer program. Courts, however, have read “without authorization” much more broadly.
The law ballooned to cover an array of activity beyond what we would consider hacking. Courts have, for instance, found that the law covers password sharing and the nefarious use of a computer that was lawfully accessed. (To be clear, some of this activity — for instance, misusing a government computer to look up information about a romantic interest — could properly violate state or federal law, but the issue is whether it should be a violation of the federal hacking law.)
What’s crucial to understand with respect to the Knight Institute’s proposal is that courts have also found that, under the CFAA, scraping and the use of fake accounts constitute accessing a website’s computers “without authorization” if those actions violate a website’s TOS.
The cases that interpret the CFAA so broadly tend to have troubling facts. In one, a computer security researcher discovered a flaw in AT&T’s registration system for first-generation iPads that made users’ emails publicly accessible. To prove it, the researcher used a computer program to collect thousands of iPad users’ email addresses.
Prosecutors were particularly incensed by the mass collection of email addresses and by the fact that the addresses had been turned over to the media (though the resulting Gawker story only included a few redacted emails of celebrities, and there’s a debate over whether what the researcher did was beyond the pale for the security research community). The researcher was criminally prosecuted and served more than a year in jail before the case was thrown out on appeal.
In another case, Lori Drew, the mother of a teenage girl, created a fake MySpace account that she used to bully one of her daughter’s friends, who then committed suicide. Drew was criminally charged and convicted under the CFAA (although a district judge later overturned her conviction because her behavior, while deplorable, wasn’t criminal hacking).
In both cases, prosecutors were faced with outcry over troubling behavior but didn’t have a law that fit that behavior neatly, so they used the ambiguity in the CFAA’s definition of “without authorization” to shoehorn in a prosecution.
This expansion of the CFAA’s scope can have real, unintended consequences for journalists and researchers, for whom scraping or auditing using test accounts are important tools of the trade.
Take housing discrimination investigations, for instance. In 1988, Atlanta Journal-Constitution journalist Bill Dedman published “The Color of Money,” a landmark data journalism series that analyzed mortgage lending data from the government and financial institutions. Dedman showed that African-Americans in Atlanta were offered and approved for housing loans at vastly lower rates than similarly situated whites (indeed, even than poorer whites).
The series led directly to passage of the 1988 amendments to the Fair Housing Act that gave the Department of Housing and Urban Development the authority to investigate and punish housing discrimination. And how might a government agency or a journalist investigate housing discrimination? By using actors to pose as prospective tenants. In fact, it’s the most effective way to ferret out fair housing violations.
Were Dedman’s series to run today, The Atlanta Journal-Constitution would undoubtedly have thought about using scraping to build its lending datasets. It also likely would have considered using test accounts on Facebook or other websites to see if African-Americans were being advertised credit products on the same terms as whites. Right now, both of these techniques would violate Facebook’s TOS and potentially the CFAA.
The Knight Institute’s proposal is particularly well-timed. As the ongoing controversy over Russian influence operations on social media shows, the internet has the profound ability to amplify human failings. Further, it does so without the accountability that the First Amendment and enforceable transparency laws bring to the government. That gap must be filled by, among others, journalists and internet researchers, and they should be able to use all of the tools in the digital world that would be appropriate in the physical world. The CFAA’s current broad reach is an unnecessary obstacle to that work, and we support the Knight Institute’s initiative to address the problem.