Privacy Limitations Of Interest-based Advertising On The Web: A Post-mortem Empirical Analysis Of Google’s FLoC

February 21, 2022

🔬 Research Summary by Alex Berke & Dan Calacci. Alex is a PhD student at the MIT Media Lab (formerly engineer at Google) whose research includes privacy-preserving ways to leverage big data as a public good and the privacy risks of current data usage models. Dan is a PhD student at MIT studying how data stewardship can enable and influence community governance.

[Original paper by Alex Berke and Dan Calacci]


Overview: FLoC was a new approach to keep the current internet ad ecosystem profitable without third-party cookies while protecting user privacy. Researchers quickly raised alarm bells about potential privacy issues, but few of them were addressed or explored by Google or other researchers. In this paper, we empirically examine the privacy risks raised about FLoC, finding that FLoC would have allowed individuals to be tracked across the web, contrary to its core aims.


Introduction

In 2021, Google proposed FLoC as a novel approach to keep the internet ad ecosystem profitable without third-party cookies, and ran a real-world trial. FLoC was designed to protect privacy on the web while still letting advertisers deliver personalized ads based on people’s behavior. But researchers raised important questions that Google never fully answered. Did FLoC actually protect user privacy? Or was it a troubled solution to a problem that still needs to be solved? We implemented FLoC and empirically tested the privacy risks raised by researchers, using a dataset of browsing histories collected from over 90,000 devices in the US. We found that, contrary to its core aims, FLoC enabled the tracking of individual users across the web, just like the third-party cookies it was meant to replace.

Key Insights

FLoC and Privacy Concerns

The current state of privacy on the web is dismal. Third-party cookies, and the trackers they enable, allow advertisers, ad-tech firms, and other actors to track everyday users’ activities across sites and contexts. Major players in the online ad space have been considering alternatives as browsers phase out third-party cookies completely. One such alternative was FLoC, short for “Federated Learning of Cohorts” (despite the technology involving no federated learning), premised on creating cohorts of users with similar behavioral characteristics. This approach was designed to allow targeted advertising while protecting individual user privacy. Did it? We empirically tested the privacy risks of FLoC and found that it could enable tracking users across the web, contrary to its aims.

FLoC’s Approach

FLoC was premised on creating cohorts of users with similar browsing habits. When a user visited a web page, that site could query the user’s “cohort ID”. This cohort ID is associated with behavioral characteristics common across all users in that cohort. Advertisers could then target ads based on the cohort ID, without having to track or identify individual users.
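In Google’s origin trial, cohort assignment hashed a browser’s recently visited domains with SimHash, so that users with overlapping browsing histories tend to land in the same cohort. The sketch below illustrates that idea in Python; the hash width, domain encoding, and cohort truncation are illustrative assumptions, not Chrome’s actual parameters, and the real system also applied a server-side anonymity check before exposing IDs.

```python
import hashlib

def simhash(domains, bits=16):
    """Toy SimHash: browsers with similar domain sets tend to share a hash.

    NOTE: illustrative only -- the real FLoC trial used a specific SimHash
    variant plus a server-side anonymity threshold before assigning cohorts.
    """
    counts = [0] * bits
    for domain in domains:
        digest = int(hashlib.sha256(domain.encode()).hexdigest(), 16)
        for i in range(bits):
            # Each visited domain votes +1 / -1 on each output bit.
            counts[i] += 1 if (digest >> i) & 1 else -1
    return sum((1 << i) for i, c in enumerate(counts) if c > 0)

def cohort_id(domains, cohort_bits=8):
    # Truncate the SimHash to a small ID so many users share each cohort.
    return simhash(domains) % (1 << cohort_bits)

# A site visited by the browser would only see the small cohort ID:
print(cohort_id({"news.example", "sports.example", "shop.example"}))
```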

But researchers quickly raised privacy concerns with this approach. How much information does a cohort ID leak about a user? Could an attacker infer a user’s race, age, or gender from their cohort ID? If my cohort ID changes over time, how likely am I to be identified based on that sequence of IDs? Google released a report detailing some answers to these questions based on an early trial, but it was severely limited.

Empirically testing privacy

We sought to answer these questions through an empirical study of FLoC. Browser users were assigned a cohort ID for a particular 7-day period, but cohort IDs change over time: your browsing history from last week differs from your browsing history this week. This raises the risk that the sequence of cohort IDs for a particular user might be unique. If it were, then a first party, such as a publisher, could track your cohort ID over time to uniquely identify you. We tested this by computing cohort IDs for over 90,000 unique devices over a year. We found that more than 50% of devices were uniquely identifiable after only 3 weeks, and more than 95% were identifiable after 4 weeks. Only a few weeks into using FLoC, publishers could have uniquely identified users and shared sensitive information about them with ad-tech companies and other third parties.
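The uniqueness analysis boils down to asking, for a growing number of weeks, how many devices have a cohort-ID sequence shared with no other device. A minimal sketch of that computation, assuming a hypothetical input format (a mapping from device to its weekly cohort IDs) rather than the paper’s actual data pipeline:

```python
from collections import Counter

def fraction_unique(device_cohorts, k):
    """Fraction of devices whose first k weekly cohort IDs are unique.

    device_cohorts: dict mapping device id -> list of weekly cohort IDs
    (hypothetical format; the paper derives these from the browsing
    histories of 90,000+ real devices).
    """
    prefixes = Counter(tuple(ids[:k]) for ids in device_cohorts.values())
    unique = sum(1 for ids in device_cohorts.values()
                 if prefixes[tuple(ids[:k])] == 1)
    return unique / len(device_cohorts)

# Toy data: identifiability grows as more weeks of cohort IDs accumulate.
toy = {"a": [5, 9, 1], "b": [5, 2, 7], "c": [5, 9, 4]}
print(fraction_unique(toy, 1))  # 0.0   (all three share cohort 5 in week 1)
print(fraction_unique(toy, 2))  # 0.33  (only "b" has a unique 2-week sequence)
print(fraction_unique(toy, 3))  # 1.0   (all sequences are now unique)
```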

Another major risk raised by privacy experts was that cohort IDs might leak sensitive information about users, such as their race. If a cohort has a disproportionate share of members of a particular race, then its cohort ID could be used to target browser users based on race, facilitating predatory or discriminatory advertising. The authors tested this risk by examining the racial composition of cohorts using a measure called t-closeness. A cohort satisfies t-closeness if, for every racial group, that group’s share of the cohort differs by at most t from its share of the overall population. Surprisingly, although the authors found major differences in browsing behavior by race, FLoC did not cluster users based on race any more than would happen by chance. This is good news, as it shows that FLoC was not discriminatory on the authors’ dataset, but there is no guarantee that this result would hold for the entire population of Chrome users, for example.
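One way to operationalize that check is to compare each group’s share of a cohort against its share of the overall population and flag cohorts where any group deviates by more than t. The sketch below uses a simple per-group absolute difference; the paper’s exact distance measure and threshold may differ, so treat this as an assumption-laden illustration rather than the authors’ implementation.

```python
from collections import Counter

def violates_t_closeness(cohort_races, population_races, t=0.1):
    """True if any group's share in the cohort deviates from its share
    in the overall population by more than t (illustrative sketch)."""
    pop = Counter(population_races)
    coh = Counter(cohort_races)
    n_pop, n_coh = len(population_races), len(cohort_races)
    return any(abs(coh[g] / n_coh - pop[g] / n_pop) > t for g in pop)

# Toy example: this cohort heavily over-represents group "A".
population = ["A"] * 50 + ["B"] * 50
cohort = ["A"] * 9 + ["B"] * 1
print(violates_t_closeness(cohort, population, t=0.1))  # True (0.9 vs 0.5)
```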

Recommendations

The uniqueness results show the risks of any ad system predicated on tracking individual user behavior. Instead, the authors recommend turning back the clock to an advertising ecosystem based on webpage content rather than long trails of user behavior. For years, this was the dominant paradigm for serving ads, and it provides a clear alternative to systems that require detailed user tracking. While this may require a large shift in the ad-serving ecosystem, content-based advertising strategies entirely sidestep the privacy issues the authors raise.

The authors also call for future designers of systems like FLoC to incorporate a broader array of sensitivity analyses, particularly ones that include demographic information. Google tested the sensitivity of its algorithm using only browsing data, an approach the authors did not find robust enough to capture real-world risk.

Between the lines

Any project that aims to develop new technologies for targeting advertisements needs thorough and rigorous privacy vetting. The authors provide a roadmap for how to do this through their empirical analysis of FLoC, a technology proposed by Google to facilitate interest-based advertising without the privacy risks of third-party cookies. The authors showed that, contrary to its core aims, FLoC enabled the tracking of individual users across sites, allowing more than 95% of users to be uniquely identified after only 4 weeks. They also showed that although racial background does mediate differences in browsing behavior, FLoC did not cluster users by race.

Some questions remain, however. The dataset the authors used was small in comparison to Google’s trials. What would the results of their analysis look like when applied to a broader browsing population? Would t-closeness still hold? Google is now proposing a new alternative, Topics, which is also predicated on clustering users by browsing behavior. How could these tests be applied to that new approach?
