🔬 Research Summary by Max Krueger, a consultant at Accenture with an interest in both the long- and short-term implications of AI for society.
[Original paper by Aneesh Pappu, Ada Lovelace Institute]
Overview: A key ingredient in ensuring the responsible use of AI systems is having robust methods for verifying that algorithms meet regulatory obligations. The Ada Lovelace Institute outlines six technical approaches to auditing algorithmic systems.
Introduction
Regulatory inspection of content recommendation and moderation algorithms is gaining traction in government and society. While there is broad agreement that regulation is an important component of a healthy and safe internet, there is little consensus about how to demonstrate regulatory compliance or carry out regulatory inspection. The Ada Lovelace Institute outlines six methods through which regulatory agencies can audit recommendation and moderation algorithms. Each of these methods has strengths and shortcomings, and the method of audit should be determined by the end goal and by what findings might demonstrate compliance or the lack thereof.
Key Insights
Regulators need concrete technical mechanisms to audit content recommendation and moderation systems. The Ada Lovelace Institute identifies six methods through which this could be accomplished.
Code Audits
Code audits give auditors direct access to the codebase of the algorithm. In theory, code audits provide auditors with the most information about an algorithm, but in practice this is a complex endeavor, complicated by the size and intricacy of most algorithmic codebases. As noted, “individual engineers in large companies rarely understand how all parts of a platform operate”. The key to code audits is the ability to separate signal from noise and isolate the important features within a codebase. Additionally, auditors might be given differing levels of access across the codebase:
At the lowest level of system access, the auditor has no ability to directly call or run the algorithms of interest (and this is the level of access for the majority of research surveyed in this article), and at the highest level of access, the auditor has full information on the learning objective (the objective the system was trained to optimise), the ability to directly run the algorithm and access to the input data used to train the system, among other types of access.
A significant shortfall of code audits is that misbehavior is seldom explicitly coded into the algorithm. As a result, auditors are unlikely to find issues with specific lines of code. As the report describes, “information gleaned from a code audit is likely to be equivalent to the information that can be learned from interviews with technical and product teams…” Code audits are most valuable for determining engineers’ intent and are most efficient when kept at the level of plain-text descriptions of the code (rather than line-by-line inspection). Ultimately, code audits are complex and time-consuming and may not yield results that are useful to regulators.
User Surveys
User surveys provide a method for collecting data directly from platform users.
Surveys are effective at gathering information about user experience on a platform. Survey data can help paint a rough picture of the kinds of problematic behaviour that should then be further investigated in an inspection.
User surveys must collect data from a diverse set of users to be representative. In short, user surveys can be used to identify problem areas but must be accompanied by a more direct form of inspection to be effective.
Scraping Audit
A black-box method of auditing, scraping audits aim to collect data directly from a platform without necessarily commissioning users to engage with it. Scraping is generally done by writing custom code to collect data through a web browser. This allows regulators to see the output of an algorithm without understanding the reasoning behind that output. Scraping has shortfalls, however: updates to the user interface typically necessitate changes to the scraping code.
Scraping algorithms can help collect “data on a platform that can be analysed to observe statistical differences between different groups. (For example, a scraping study which used data collected from scraping to analyse correlations between the gender of a worker and their ranking on a job’s platform).” While not suitable for investigating causation, scraping audits can help build datasets of publicly available information and show how outputs change over time.
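To make the mechanics concrete, the sketch below shows one way such a scraping study might be structured in Python, assuming a hypothetical public listings page; the URL, the CSS selector, and the data-gender attribute are placeholders for illustration, not any real platform’s markup.

```python
# A minimal sketch of a scraping audit, assuming a hypothetical public
# listings page whose HTML exposes a worker's displayed rank and a
# self-reported group attribute. Selectors and field names are illustrative.
import requests
from bs4 import BeautifulSoup
from collections import defaultdict

def scrape_listings(url: str) -> list[dict]:
    """Collect rank/group pairs from a single results page."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for i, card in enumerate(soup.select(".listing-card"), start=1):  # assumed selector
        records.append({
            "rank": i,
            "group": card.get("data-gender", "unknown"),  # assumed attribute
        })
    return records

def mean_rank_by_group(records: list[dict]) -> dict:
    """Descriptive statistic: average displayed rank per group."""
    ranks = defaultdict(list)
    for record in records:
        ranks[record["group"]].append(record["rank"])
    return {group: sum(r) / len(r) for group, r in ranks.items()}

if __name__ == "__main__":
    data = scrape_listings("https://example.com/listings")  # placeholder URL
    print(mean_rank_by_group(data))
```

Because the selectors are tied to the page layout, any UI redesign breaks this code, which is exactly the brittleness noted above.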
API Audit
An API audit involves sending specific requests to an application programming interface (API) to collect data that would traditionally be collected via a scraping audit. An API audit is a step up from a scraping audit as it does not interact directly with the user interface (UI) but with the underlying data and, therefore, is less susceptible to breaking when changes are made to the UI. API audits can be used to obtain data “suited to descriptive analysis and correlational studies focused on observing patterns in the outputs of a system.” A regulator could request specific data from an API at regular intervals to observe changes over time. Of course, there are numerous technical hurdles to achieving an audit via API, but this may offer a standardized route for tech firms to demonstrate regulatory compliance.
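As a rough illustration, the sketch below polls a hypothetical recommendations endpoint on a fixed schedule and logs each response for later correlational analysis; the endpoint URL, query parameters, and access token are assumptions made for the example, not a real platform API.

```python
# A hedged sketch of an API audit: polling an assumed recommendations
# endpoint at regular intervals and appending each snapshot to a log file.
import json
import time
from datetime import datetime, timezone

import requests

API_URL = "https://api.example.com/v1/recommendations"  # assumed endpoint
API_TOKEN = "REGULATOR_ACCESS_TOKEN"                     # placeholder credential

def fetch_snapshot(topic: str) -> dict:
    """Request one snapshot of recommended items for a given topic."""
    resp = requests.get(
        API_URL,
        params={"topic": topic, "limit": 50},
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

def run_audit(topic: str, interval_s: int = 3600, snapshots: int = 24) -> None:
    """Poll the endpoint on a fixed schedule and log responses for later analysis."""
    with open("api_audit_log.jsonl", "a") as log:
        for _ in range(snapshots):
            record = {
                "collected_at": datetime.now(timezone.utc).isoformat(),
                "topic": topic,
                "response": fetch_snapshot(topic),
            }
            log.write(json.dumps(record) + "\n")
            time.sleep(interval_s)

if __name__ == "__main__":
    run_audit("news", interval_s=3600, snapshots=24)
```

Because the audit talks to the data layer rather than the rendered page, the same script keeps working across UI redesigns, which is the main advantage over scraping.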
Sock-puppet Audit
Sock-puppet audits seek to impersonate users on a given platform in an automated fashion. Because each automated persona can be customized, sock-puppet audits allow for controlled experimentation. This can be a helpful resource if regulators seek to understand how an algorithm responds to different personas. For example:
An online-safety inspection using sock puppets could involve creating sock puppets to impersonate users from different demographics (for instance, under-18 users) to use the platform and record the content recommended to them. This content could then be analysed to determine whether the amount of harmful content on the platform shown to these sensitive users is compliant with online-safety expectations.
Sock-puppet audits provide a distinct advantage by allowing regulators to manipulate personas, but, as with all audits, the data collected are just a sample of the overall picture. If not administered correctly, auditors may draw incorrect conclusions.
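A minimal sketch of this idea, assuming a hypothetical feed endpoint and pre-registered test accounts, might look like the following; the endpoint, tokens, and the is_flagged label are illustrative stand-ins for whatever the platform and regulator agree on.

```python
# A sock-puppet sketch: each persona fetches its feed while authenticated as a
# test account, and the share of flagged content per persona is compared.
# Endpoint, credentials, and the `is_flagged` field are hypothetical.
from dataclasses import dataclass

import requests

FEED_URL = "https://example.com/api/feed"  # assumed endpoint

@dataclass
class Persona:
    name: str
    age: int
    session_token: str  # token for a pre-registered test account

PERSONAS = [
    Persona("teen_user", 15, "TOKEN_A"),
    Persona("adult_user", 34, "TOKEN_B"),
]

def fetch_feed(persona: Persona, pages: int = 5) -> list[dict]:
    """Collect the items recommended to one persona across several refreshes."""
    items = []
    for page in range(pages):
        resp = requests.get(
            FEED_URL,
            params={"page": page},
            headers={"Authorization": f"Bearer {persona.session_token}"},
            timeout=30,
        )
        resp.raise_for_status()
        items.extend(resp.json()["items"])  # assumed response shape
    return items

def harmful_share(items: list[dict]) -> float:
    """Fraction of recommended items flagged as harmful (assumed label)."""
    flagged = sum(1 for item in items if item.get("is_flagged", False))
    return flagged / len(items) if items else 0.0

if __name__ == "__main__":
    for persona in PERSONAS:
        feed = fetch_feed(persona)
        print(f"{persona.name}: {harmful_share(feed):.1%} flagged content")
```

The personas here are deliberately coarse; a real study would need enough accounts per demographic, and enough feed refreshes, for the differences observed to be more than sampling noise.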
Crowd-sourced Audit
Crowd-sourced audits are potentially the most promising of these audit mechanisms: a crowd-sourced audit is conceptually similar to a sock-puppet audit, but instead of using automated personas, regulators rely on actual platform users to gather data. An example of this is the Citizen Browser project by The Markup. Crowd-sourced audits provide a number of advantages:
It avoids the need to inspect source code, which is a manually intensive task demanding a large amount of expertise on the part of the regulator; the need to survey users (as crowd-sourced audits should collect data automatically); and the terms-of-service breaches that scraping and/or sock-puppet audits might encounter.
Crowd-sourced audits are typically administered via a browser extension, resulting in minimal interruption for the platform user. Crowd-sourced audits benefit regulators because they collect data on actual end-user experience, which can then be analyzed for compliance with government regulation.
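For illustration, the regulator-side collection service behind such an extension could be as simple as the sketch below; the Flask framework, the route names, and the payload fields are choices made here for the example, not something the report prescribes.

```python
# A sketch of a collection endpoint for a crowd-sourced audit, assuming
# consenting participants run a browser extension that periodically POSTs
# the content shown to them. Payload fields are illustrative.
from flask import Flask, jsonify, request

app = Flask(__name__)
submissions: list[dict] = []  # in-memory store; a real deployment would persist data

@app.post("/submit")
def submit():
    """Receive one batch of observations from a consenting participant."""
    payload = request.get_json(force=True)
    submissions.append({
        "participant_id": payload["participant_id"],  # pseudonymous ID
        "demographic": payload.get("demographic", "undisclosed"),
        "items": payload["items"],  # list of {"url": ..., "label": ...}
    })
    return jsonify({"stored": True})

@app.get("/summary")
def summary():
    """Simple compliance view: flagged-content share per demographic group."""
    totals: dict[str, list[int]] = {}
    for sub in submissions:
        flagged = sum(1 for item in sub["items"] if item.get("label") == "flagged")
        seen, bad = totals.setdefault(sub["demographic"], [0, 0])
        totals[sub["demographic"]] = [seen + len(sub["items"]), bad + flagged]
    return jsonify({
        group: {"items": seen, "flagged_share": (bad / seen if seen else 0.0)}
        for group, (seen, bad) in totals.items()
    })

if __name__ == "__main__":
    app.run(port=8000)
```

Keeping only pseudonymous identifiers and coarse demographic labels on the regulator’s side is one way to preserve the privacy balance the report emphasizes.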
Conclusion
Efficient audit mechanisms are of great importance to regulators and technology firms alike. Each of the audit mechanisms above suits a specific purpose and should be selected based on the explicit end goal of the audit. Ultimately, an ecosystem must be developed that allows companies to demonstrate compliance with legislation efficiently while preserving a level of privacy for both the firm and the end user. The foundation of this inspection ecosystem is comprehensive, well-developed policy.
Between the lines
The fields of algorithm auditing and AI regulation are still very much in their infancy. Significant research remains before we fully understand how an auditing ecosystem can be developed that provides efficient transparency into moderation and recommendation systems. While the audit mechanisms above have been identified explicitly for regulators, technology firms would benefit from implementing internal practices built on the same mechanisms. Doing so would let tech companies understand how their algorithms perform in the “wild” and how they look from a regulatory perspective, helping them avoid potential legal and customer-trust issues.