Privacy, Big Data, and the Internet

Setting the stage for this week’s Columbia SIPA conference on Internet Governance and Cybersecurity

Since last fall, I’ve been a (very) part-time senior fellow in Internet governance and cybersecurity at Columbia University’s School of International and Public Affairs. It’s a remarkable place, stacked with first-rate scholars and practitioners, home to energetic and engaged students from across the globe, and an emerging center of gravity for cross-disciplinary work on international technology policy issues. I’ve been able to spend time with faculty and researchers from across Columbia, like Matthew Waxman and Tim Wu at the Law School, Steve Bellovin at the Department of Computer Science, Eli Noam at the Business School and the Institute for Tele-Information, Agnès Callamard of the Project on Global Freedom of Expression and Information, and Emily Bell of the Journalism School and the Tow Center, as well as visiting scholars like Herbert Lin and Martin Varsavsky. I was hugely impressed by the range and depth of tech-relevant work by Columbia faculty and graduate students, most notably at the Data on a Mission Summit organized by the Data Science Institute, at which I delivered a closing keynote on ethical and policy issues triggered by big data, cheap computing power, service abstraction, and global, cross-border connectivity.

Panel and Panelists

On Thursday, I’ll be moderating a session on “Privacy, Big Data, and the Internet.” Joining me for the discussion will be:

  • Balances vs. Trade-offs: Can privacy effectively be protected by a government without stifling innovation or killing the potential of Big Data to power improvements in health care, scientific research, transportation, environmental protection, and so forth?
  • Public vs. Private: Is there a meaningful distinction between government and corporate data sets? As they operate across borders, what are the responsibilities of corporations to protect privacy?
  • Equity vs. Neutrality: Are Big Data and algorithmic decision-making in areas like insurance, banking, hiring, evaluation, admissions, and criminal sentencing producing disparate or adverse impacts on the poor, the marginalized, the disadvantaged?

Backgrounder on Privacy, Big Data, and the Internet

What do we mean by “Big Data”?

Though somewhat fuzzy — much like “cloud computing” — “big data” typically refers to large-scale, connected database infrastructures that are (1) big in terms of the number of data points collected, and/or (2) big in terms of the analytics that can be performed.

The Conventional Frame Around Big Data and Privacy

Looking around the landscape of privacy debates circa 2015, we can discern a conventional — and, I will argue, tired — frame that positions big data privacy policy as a choice between restrictions on the collection of data and restrictions on the use of data.

Specific Concerns

In this conventional framing of current privacy debates, we can divide big data into that which is “born digital,” like email messages, and that which is “born analog,” like the recordings of security and traffic cameras, microphones, automobile GPS devices, and other sensors.

  • Data fusion, meaning the combination of disparate data sources into profiles and tracking records that, thanks to the analytic power of big data, enable the identification of specific people with specific activities over time.

History and The Shift

Of course, the debate over collection vs. use restrictions isn’t new. Dating back to the 1960s, privacy laws (even in the U.S.) attempted to place limits on the collection AND use of data: the Fair Credit Reporting Act of 1970, the Family Educational Rights and Privacy Act of 1974, the HIPAA Privacy Rule, the Federal Trade Commission’s Fair Information Practice Principles.


The debate over big data and privacy in the Internet era implicates a host of related policy issues, the inter-relationships among which I’ll be provoking the panel to address. They include:

  • State and non-state surveillance, including efforts to amend or reform Title III (the Wiretap Act), the Foreign Intelligence Surveillance Act, and the USA-PATRIOT Act,
  • Conflicts among rules and interests across jurisdictions,
  • The relative obligations of governments and corporations in an environment in which virtually all of the communications infrastructure in question is privately owned and operated.

Partner/co-founder: Higher Ground Labs. Venture fellow: betaworks. Board: Access Now, Public Knowledge. Nerd, really.
