
Privacy Limitations Of Interest-based Advertising On The Web: A Post-mortem Empirical Analysis Of Google’s FLoC

February 21, 2022

🔬 Research Summary by Alex Berke & Dan Calacci. Alex is a PhD student at the MIT Media Lab (formerly engineer at Google) whose research includes privacy-preserving ways to leverage big data as a public good and the privacy risks of current data usage models. Dan is a PhD student at MIT studying how data stewardship can enable and influence community governance.

[Original paper by Alex Berke and Dan Calacci]


Overview: FLoC was a new approach intended to keep the current internet ad ecosystem profitable without third-party cookies while protecting user privacy. Researchers quickly raised alarm bells about potential privacy issues, but few of those concerns were addressed or explored by researchers or Google. In this paper, we empirically examine the privacy risks raised about FLoC, finding that FLoC would have allowed individuals to be tracked across the web, contrary to its core aims.


Introduction

In 2021, Google proposed FLoC as a novel approach to keep the internet ad ecosystem profitable without third-party cookies, and ran a real-world trial. FLoC was designed to protect privacy on the web while still letting advertisers deliver personalized ads based on people’s behavior. But researchers raised important questions that Google never fully answered. Did FLoC actually protect user privacy? Or was it a troubled solution to a problem that still needs to be solved? We implemented FLoC and empirically tested the privacy risks raised by researchers using a dataset of browsing histories collected from over 90,000 devices in the US. We found that, contrary to its core aims, FLoC enabled the tracking of individual users across the web, just like the third-party cookies it was meant to replace.

Key Insights

FLoC and Privacy Concerns

The current state of privacy on the web is dismal. Third-party cookies, and the trackers they enable, allow advertisers, ad-tech firms, and other actors to track everyday users’ activities across sites and contexts. Major players in the online ad space have been considering alternatives as browsers phase out third-party cookies completely. One such alternative was FLoC, which stands for “Federated Learning of Cohorts” (despite involving no federated learning), premised on creating cohorts of users with similar behavioral characteristics. This approach was designed to allow targeted advertising while protecting individual user privacy. Did it? We empirically tested the privacy risks of FLoC and found that it could enable tracking users across the web, contrary to its aims.

FLoC’s Approach

FLoC was premised on creating cohorts of users with similar browsing habits. When a user visited a web page, that site could query the user’s “cohort ID”, which is associated with behavioral characteristics common to all users in that cohort. Advertisers could then target ads based on cohort IDs without having to track or identify individual users.
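Under the hood, Google’s trial derived cohorts by applying a SimHash-style locality-sensitive hash to the domains a browser had recently visited, followed by a server-side clustering step. The sketch below illustrates the general SimHash idea only; the hash width, helper names, and the omission of the clustering step are assumptions made for illustration, not FLoC’s exact implementation.

```python
import hashlib

HASH_BITS = 16  # illustrative fingerprint width, not FLoC's actual parameter


def _domain_hash(domain: str) -> int:
    """Stable integer hash of a domain name (illustrative choice of hash)."""
    digest = hashlib.sha256(domain.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash_fingerprint(visited_domains: set[str], bits: int = HASH_BITS) -> int:
    """Map a set of visited domains to a SimHash fingerprint.

    Browsers with similar browsing histories tend to receive similar
    fingerprints, which is what lets similar users land in the same cohort.
    The clustering step that turned fingerprints into cohort IDs in the
    trial is not reproduced here.
    """
    counts = [0] * bits
    for domain in visited_domains:
        h = _domain_hash(domain)
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


# Two browsers with overlapping histories end up with nearby fingerprints.
a = simhash_fingerprint({"news.example", "shop.example", "sports.example"})
b = simhash_fingerprint({"news.example", "shop.example", "travel.example"})
print(bin(a), bin(b))
```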

But researchers quickly raised privacy concerns with this approach. How much information does a cohort ID leak about a user? Could an attacker infer a user’s race, age, or gender from their cohort ID? If my cohort ID changes over time, how likely am I to be identified from the sequence of IDs? Google released a report detailing some answers to these questions based on an early trial, but it was severely limited.

Empirically testing privacy

We sought to answer these questions through an empirical study of FLoC. While browser users were assigned a cohort ID for a particular 7-day period, cohort IDs change over time: your browsing history from last week differs from your browsing history this week. This raises the risk that sequences of cohort IDs for a particular user might be unique. If they were, then a first party, such as a publisher, could track your cohort IDs over time to uniquely identify you. We tested this by computing cohort IDs for over 90,000 unique devices over a year. We found that more than 50% of devices were uniquely identifiable after only 3 weeks, and more than 95% were identifiable after 4 weeks. Only a few weeks into using FLoC, publishers could uniquely identify users and share sensitive information about a browser user with adtech companies, third parties, and other actors.
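To make the uniqueness test concrete, the following is a minimal sketch of how one could measure what share of devices are re-identifiable from their sequence of weekly cohort IDs. The data layout, function name, and toy numbers are illustrative assumptions, not the authors’ code or data.

```python
from collections import Counter


def fraction_uniquely_identifiable(weekly_cohorts: dict[str, list[int]], k: int) -> float:
    """Fraction of devices whose first k weekly cohort IDs form a unique sequence.

    `weekly_cohorts` maps a device ID to its ordered list of weekly cohort IDs.
    A device whose length-k prefix is shared by no other device could be
    re-identified by any first party that logs cohort IDs week over week.
    """
    prefixes = {device: tuple(ids[:k]) for device, ids in weekly_cohorts.items()}
    counts = Counter(prefixes.values())
    unique = sum(1 for prefix in prefixes.values() if counts[prefix] == 1)
    return unique / len(prefixes)


# Toy example (illustrative data only, not the paper's dataset):
data = {
    "device_a": [12, 40, 7, 7],
    "device_b": [12, 40, 9, 3],
    "device_c": [12, 40, 7, 2],
}
print(fraction_uniquely_identifiable(data, k=2))  # 0.0: all devices share (12, 40)
print(fraction_uniquely_identifiable(data, k=3))  # ~0.33: only device_b is unique
print(fraction_uniquely_identifiable(data, k=4))  # 1.0: all sequences are unique
```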

Another major risk raised by privacy experts was the concern that cohort IDs might leak sensitive information about users, such as their race. If a cohort contains a disproportionately large share of members of a particular race, then that cohort ID could be used to target browser users based on their race, facilitating predatory or discriminatory advertising. The authors tested this risk by examining race in cohorts using a measure called t-closeness. A cohort satisfies t-closeness if, for every racial group, that group’s share of users in the cohort differs by at most t from its share in the overall population. Surprisingly, although the authors found major differences in browsing behavior by race, FLoC did not cluster users based on race any more than would happen by chance. This is good, as it shows that FLoC was not discriminatory on the authors’ dataset, but there is no guarantee that this result would hold with the entire population of Chrome users, for example.
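Below is a minimal sketch of the t-closeness check as described above, assuming the reading that a cohort passes when no racial group’s share within the cohort deviates from its share in the overall population by more than a threshold t. The function name, group labels, and numbers are illustrative, not the authors’ code or data.

```python
def satisfies_t_closeness(cohort_counts: dict[str, int],
                          population_counts: dict[str, int],
                          t: float) -> bool:
    """Check whether a cohort's group distribution stays within t of the population's.

    Each dict maps a demographic group to the number of users in that group.
    The cohort satisfies t-closeness if, for every group, the absolute
    difference between its share in the cohort and its share in the overall
    population is at most t.
    """
    cohort_total = sum(cohort_counts.values())
    pop_total = sum(population_counts.values())
    for group, pop_count in population_counts.items():
        cohort_share = cohort_counts.get(group, 0) / cohort_total
        pop_share = pop_count / pop_total
        if abs(cohort_share - pop_share) > t:
            return False
    return True


# Toy example (illustrative numbers only):
population = {"group_1": 600, "group_2": 300, "group_3": 100}
cohort = {"group_1": 55, "group_2": 35, "group_3": 10}
print(satisfies_t_closeness(cohort, population, t=0.10))  # True: max deviation is 0.05
print(satisfies_t_closeness(cohort, population, t=0.01))  # False: 0.05 exceeds 0.01
```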

Recommendations

The uniqueness results show the risks of any ad system predicated on tracking individual user behavior. Instead, the authors recommend turning back the clock to an advertising ecosystem based on webpage content, rather than long trails of user behavior. For years, this was the dominant paradigm for serving ads, and it provides a clear alternative to systems that require detailed user tracking. While this may require a large shift in the ad-serving ecosystem, content-based advertising strategies entirely sidestep the privacy issues the authors raise.

The authors also call for future designers of systems like FLoC to incorporate a broader array of sensitivity analyses, particularly ones that include demographic information. Google tested the sensitivity of its algorithm using only browsing data, an approach the authors found insufficient for assessing real-world risk.

Between the lines

Any project that aims to develop new technologies for targeting advertisements needs thorough and rigorous privacy vetting. The authors provide a roadmap for how to do this through their empirical analysis of FLoC, a technology proposed by Google to facilitate interest-based advertising without the privacy risks of third-party cookies. The authors showed that, contrary to its core aims, FLoC enabled the tracking of individual users across sites, allowing more than 95% of users to be uniquely identified after only 4 weeks. They also showed that although racial background does mediate differences in browsing behavior, FLoC did not cluster users by race.

Some questions remain, however. The dataset the authors used was small in comparison to Google’s trials. What would the results of their analysis look like when applied to a broader browsing population? Would t-closeness still not be violated? Google is now proposing a new alternative, Topics, which is also predicated on clustering users by browsing behavior. How could these tests be applied to that new approach?

