
Technical methods for regulatory inspection of algorithmic systems in social media platforms

October 4, 2022

🔬 Research Summary by Max Krueger, a consultant at Accenture with an interest in both the long- and short-term implications of AI on society.

[Original paper by Aneesh Pappu, Ada Lovelace Institute]


Overview: A key ingredient in the responsible use of AI systems is a robust set of methods for verifying that algorithms meet regulatory obligations. The Ada Lovelace Institute outlines six technical approaches to auditing algorithmic systems.


Introduction

Regulatory inspection of content recommendation and moderation algorithms is gaining traction in government and society. While there is broad agreement that regulation is an important component of a healthy and safe internet, there is little consensus about how to demonstrate regulatory compliance or carry out regulatory inspection. The Ada Lovelace Institute outlines six methods through which regulatory agencies can audit recommendation and moderation algorithms. Each of these methods has strengths and shortcomings, and the choice of method should be determined by the end goal of the audit and by what findings would demonstrate compliance or the lack thereof.

Key Insights

Regulators need concrete technical mechanisms to audit content recommendation and moderation systems. The Ada Lovelace Institute identifies six mechanisms through which this could be accomplished.

Code Audits

Code audits give auditors direct access to the algorithm's codebase. In theory, code audits provide auditors with the most information about an algorithm, but in practice they are a complex endeavor, complicated by the size and intricacy of most codebases. As the report notes, "individual engineers in large companies rarely understand how all parts of a platform operate". The key to a code audit is the ability to separate signal from noise and isolate the important features within a codebase. Additionally, auditors might be given differing levels of access across the codebase:

At the lowest level of system access, the auditor has no ability to directly call or run the algorithms of interest (and this is the level of access for the majority of research surveyed in this article), and at the highest level of access, the auditor has full information on the learning objective (the objective the system was trained to optimise), the ability to directly run the algorithm and access to the input data used to train the system, among other types of access.

A significant shortfall of code audits is that misbehavior is seldom explicitly coded in the algorithm. As a result, auditors are unlikely to find issues with specific lines of code. As the report describes, "information gleaned from a code audit is likely to be equivalent to the information that can be learned from interviews with technical and product teams…" Code audits are most valuable for determining engineers' intent and are most efficient when kept at plain-text descriptions of code (rather than at the line-of-code level). Ultimately, code audits are complex and time-consuming and may not yield beneficial results for regulators.

User Surveys

User surveys provide a method of data collection directly from platform users.

Surveys are effective at gathering information about user experience on a platform. Survey data can help paint a rough picture of the kinds of problematic behaviour that should then be further investigated in an inspection.

To be effective, user surveys must collect data from a diverse set of individuals. In short, user surveys can identify problem areas but must be accompanied by a deeper inspection to reach firm conclusions.

Scraping Audit

A black-box method of auditing, scraping audits aim to collect data directly from the platform without necessarily commissioning users to engage with it. Scraping is generally done by writing custom code to collect data through a web browser. This allows regulators to see the output of an algorithm without understanding the reasoning behind that output. Scraping has shortfalls: updates to the platform's user interface necessitate changes to the scraping code.

Scraping algorithms can help collect "data on a platform that can be analysed to observe statistical differences between different groups. (For example, a scraping study which used data collected from scraping to analyse correlations between the gender of a worker and their ranking on a job’s platform)." While not suitable for investigating causation, scraping audits can help build datasets on publicly available information to understand how changes occur over time.
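As a rough illustration, a minimal scraping-audit script might look like the sketch below. The URL, CSS selectors, and recorded fields are hypothetical placeholders; a real audit would target a specific platform's pages and would need updating whenever that platform's markup changes.

```python
# Minimal sketch of a scraping audit (hypothetical URL, selectors, and fields):
# periodically fetch a public results page and append the ranked items to a
# longitudinal dataset so statistical differences can be studied over time.
import csv
from datetime import datetime, timezone

import requests
from bs4 import BeautifulSoup

RESULTS_URL = "https://platform.example.com/search?q=data+analyst"  # placeholder


def scrape_ranking(url: str) -> list[dict]:
    """Fetch the page and extract each result's rank and visible attributes."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    rows = []
    # ".result-card" and ".result-name" are placeholder selectors.
    for rank, card in enumerate(soup.select(".result-card"), start=1):
        name = card.select_one(".result-name")
        rows.append({
            "scraped_at": datetime.now(timezone.utc).isoformat(),
            "rank": rank,
            "name": name.get_text(strip=True) if name else "",
        })
    return rows


def append_snapshot(rows: list[dict], path: str = "scrape_audit.csv") -> None:
    """Append one scraping snapshot to the dataset for later analysis."""
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["scraped_at", "rank", "name"])
        if f.tell() == 0:  # write a header only for a brand-new file
            writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    append_snapshot(scrape_ranking(RESULTS_URL))
```

Run on a schedule, each snapshot adds one more observation of the platform's public outputs, which is the kind of dataset the report describes for correlational analysis.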

API Audit

An API audit involves sending specific requests to an application programming interface (API) to collect data that would traditionally be collected via a scraping audit. An API audit is a step up from a scraping audit as it does not interact directly with the user interface (UI) but with the underlying data and, therefore, is less susceptible to breaking when changes are made to the UI. API audits can be used to obtain data "suited to descriptive analysis and correlational studies focused on observing patterns in the outputs of a system." A regulator could request specific data from an API at regular intervals to observe changes over time. Of course, there are numerous technical hurdles to achieving an audit via API, but this may offer a standardized route for tech firms to demonstrate regulatory compliance.
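By way of illustration, a minimal sketch of such a polling audit is shown below. The endpoint, credential, and query parameters are assumptions rather than a real platform API; the point is simply that each response is archived on a schedule so that patterns in the system's outputs can be studied over time.

```python
# Minimal sketch of an API audit (hypothetical endpoint, token, and parameters):
# poll the platform API on a fixed schedule and archive each raw response for
# later descriptive or correlational analysis.
import json
import time
from datetime import datetime, timezone
from pathlib import Path

import requests

API_URL = "https://api.platform.example.com/v1/recommendations"  # placeholder
API_TOKEN = "REGULATOR_ACCESS_TOKEN"                              # placeholder
ARCHIVE_DIR = Path("api_audit_snapshots")


def fetch_snapshot() -> dict:
    """Request one batch of recommendation data from the platform API."""
    response = requests.get(
        API_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        params={"audience": "under_18", "limit": 100},  # hypothetical parameters
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def archive(snapshot: dict) -> None:
    """Write the raw response to a timestamped file."""
    ARCHIVE_DIR.mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    (ARCHIVE_DIR / f"snapshot_{stamp}.json").write_text(json.dumps(snapshot, indent=2))


if __name__ == "__main__":
    # Collect one snapshot per hour; a production audit would use a proper scheduler.
    while True:
        archive(fetch_snapshot())
        time.sleep(60 * 60)
```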

Sock-puppet Audit

Sock-puppet audits impersonate users on a given platform in an automated fashion. The automated accounts can be customized, which enables controlled experimentation. This can be a helpful resource when regulators seek to understand how an algorithm responds to different personas. For example:

An online-safety inspection using sock puppets could involve creating sock puppets to impersonate users from different demographics (for instance, under-18 users) to use the platform and record the content recommended to them. This content could then be analysed to determine whether the amount of harmful content on the platform showed to these sensitive users is compliant with online-safety expectations.
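A minimal sketch of how such a sock-puppet session might be recorded is shown below. The personas, field names, and fetch_recommended_feed function are hypothetical; the platform-specific automation (for example, a browser driver operating each puppet account) would have to be supplied.

```python
# Minimal sketch of a sock-puppet audit (hypothetical personas and fields):
# run scripted accounts with different attributes and log the content shown to
# each, so exposure to harmful content can be compared across personas.
import csv
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Persona:
    account_id: str
    age: int
    region: str


PERSONAS = [
    Persona("puppet_teen", age=15, region="UK"),   # hypothetical test accounts
    Persona("puppet_adult", age=34, region="UK"),
]


def fetch_recommended_feed(persona: Persona) -> list[dict]:
    """Placeholder for the platform-specific automation that drives the puppet
    account and returns the items it was shown."""
    raise NotImplementedError("browser or app automation goes here")


def record_session(persona: Persona, writer: csv.DictWriter) -> None:
    """Log each recommended item alongside the persona that received it."""
    for item in fetch_recommended_feed(persona):
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "account_id": persona.account_id,
            "age": persona.age,
            "item_id": item.get("id"),
            "label": item.get("label"),  # e.g. output of a harmful-content classifier
        })


if __name__ == "__main__":
    with open("sock_puppet_log.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=["timestamp", "account_id", "age", "item_id", "label"]
        )
        writer.writeheader()
        for persona in PERSONAS:
            record_session(persona, writer)
```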

Sock-puppet audits provide a distinct advantage by allowing regulators to manipulate personas, but, as with all audits, the data collected are just a sample of the overall picture. If not administered correctly, auditors may draw incorrect conclusions.

Crowd-sourced Audit

Crowd-sourced audits are potentially the most promising of these mechanisms: a crowd-sourced audit is conceptually similar to a sock-puppet audit, but instead of using automated personas, regulators rely on actual platform users to gather data. An example of this is the Citizen Browser project by The Markup. Crowd-sourced audits provide a number of advantages:

It avoids the need to inspect source code, which is a manually intensive task demanding a large amount of expertise on the behalf of the regulator, the need to survey users (as crowd-sourced audits should automatically collect data), and terms of service breaches that scraping and/or sock-puppet audits might encounter

Crowd-sourced audits are typically administered via a browser extension, resulting in minimal interruption for the platform user. They benefit regulators by collecting data on actual end-user experience, which can then be analyzed for compliance with government regulation.
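As a rough sketch, the analysis side of a crowd-sourced audit might look like the following, assuming panel members' extensions submit records with hypothetical group, item_id, and flagged fields.

```python
# Minimal sketch of analysing crowd-sourced audit data (hypothetical schema):
# records submitted by a panel of real users are aggregated to compare how
# often flagged content appears in different groups' feeds.
import csv
from collections import defaultdict


def flagged_rate_by_group(path: str) -> dict[str, float]:
    """Return the share of recommended items flagged as problematic, per panel group."""
    totals: dict[str, int] = defaultdict(int)
    flagged: dict[str, int] = defaultdict(int)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):  # expected columns: group, item_id, flagged
            totals[row["group"]] += 1
            flagged[row["group"]] += int(row["flagged"] == "1")
    return {group: flagged[group] / totals[group] for group in totals}


if __name__ == "__main__":
    for group, rate in sorted(flagged_rate_by_group("panel_submissions.csv").items()):
        print(f"{group}: {rate:.1%} of recommended items flagged")
```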

Conclusion

Efficient audit mechanisms are of great importance to regulators and technology firms alike. Each of the audit mechanisms above suits a specific purpose and should be selected based on the explicit end goal of the audit. Ultimately, an ecosystem must be developed that allows companies to efficiently demonstrate compliance with legislation while preserving a level of privacy for both the firm and the end user. The foundation of this inspection ecosystem is comprehensive, well-developed policy.

Between the lines

The fields of algorithm auditing and AI regulation are still very much in their infancy. Significant research remains before we fully understand how an auditing ecosystem can be developed that efficiently provides transparency into moderation and recommendation systems. While the above audit mechanisms have been identified explicitly for regulators, technology firms would benefit from implementing internal practices based on the same mechanisms. Doing so would give tech companies the ability to understand how their algorithms perform in the "wild" and how they operate from a regulatory perspective, avoiding potential legal and customer-trust issues.

