Research summary: Adversarial Machine Learning - Industry Perspectives

Mini summary (scroll down for full summary):

Cybersecurity is an emerging concern for companies that deploy ML systems heavily. ML introduces risks that depart from traditional cybersecurity practice and that need to be addressed when applying that practice to ML. The authors of this study surveyed two key personas: ML engineers responsible for developing and deploying ML systems, and security incident responders. While many of them recognized the concerns raised by the authors, they lacked clarity on the mechanisms and techniques they could use to secure their systems against these threats. Most concerned themselves with data poisoning attacks and paid less attention to other areas such as model inversion, adversarial examples in the physical domain, and supply chain vulnerabilities. Privacy breaches and intellectual property theft were also top of mind, again because of their prominence in wider discussions. The result is attention focused on a small subset of potential attacks, which leaves part of the attack surface open.

The authors (Kumar et al.) make a strong case for borrowing best practices from traditional cybersecurity, such as shared vulnerability databases, a common standard for scoring risks, and better secure coding practices for the popular ML development frameworks. They also observed a degree of deferred responsibility when ML is consumed as a service from cloud providers: downstream application developers didn’t realize that some of the cybersecurity measures for managing the risks fell on their shoulders. Many challenges remain in replicating, and improving on, the state of cybersecurity in traditional software infrastructure. Still, given that ML is the new software and adoption is increasing, the authors call on the community to start paying serious attention to the concerns raised and to build up capabilities to manage and combat cybersecurity threats aimed at the ML components of the larger software industry.

Full summary:

There is mounting evidence that organizations are taking seriously the threat of malicious actors attacking ML systems. This is supported by the fact that organizations like ISO and NIST are building frameworks for guidance on securing ML systems, that working groups from the EU have put forth concrete technical checklists for evaluating the trustworthiness of ML systems, and that ML systems are becoming key to the functioning of organizations, which are therefore inclined to protect their crown jewels.

The organizations surveyed as a part of this study spanned a variety of domains and were limited to those that have mature ML development. The focus was on two personas: ML engineers who are building these systems and security incident responders whose task is to secure the software infrastructure, including the ML systems. Depending on the size of the organization, these people could be on different teams, on the same team, or even be the same person. The study was also limited to intentional malicious attacks and didn’t investigate the impacts of naturally occurring adversarial examples, distributional shifts, common corruptions and reward hacking.

Most organizations surveyed as a part of the study were found to be focused primarily on traditional software security and didn’t have the right tools or know-how to secure against ML attacks. They also indicated that they were actively seeking guidance in the space. Most organizations clustered around concerns regarding data poisoning attacks, probably because of the cultural significance of the Tay chatbot incident. Privacy breaches were another significant concern, followed by model stealing attacks that can lead to the loss of intellectual property. Other attacks, such as attacks on the ML supply chain and adversarial examples in the physical domain, didn’t catch the attention of the people surveyed.

One of the gaps between reality and expectations was that security incident responders and ML engineers expected the libraries they use for ML development to be battle-tested before being released by large organizations, as is the case in traditional software. They also pushed the responsibility for security upstream when they were using ML as a service from cloud providers. Yet this ignores the fact that this is an emergent field and that many of the concerns need to be addressed in the downstream tasks performed with these tools. They also didn’t have a clear understanding of what to expect when something does go wrong and what the failure mode would look like.

In traditional software security, MITRE maintains a curated repository of attacks (the ATT&CK framework) along with detection cues, reference literature and tell-tale signs of which malicious entities, including nation state attackers, are known to use these attacks. The authors call for a similar compilation in the emergent field of adversarial machine learning, whereby researchers and practitioners register their attacks and other information in a curated repository that provides everyone with a unified view of the existing threat environment.

While programming languages often have well documented guidelines on secure coding, guidance on doing so with popular ML frameworks like PyTorch, Keras and TensorFlow is sparse. Amongst these, TensorFlow is the only one that provides some tools for testing against adversarial attacks and some guidance on how to do secure coding in the ML context.

The Security Development Lifecycle (SDL) provides guidance on how to secure systems, scores vulnerabilities, and offers some best practices, but applying it to ML systems might still allow imperfect solutions to exist. Rather than treating guidelines as a strong security guarantee, the authors advocate for code examples that showcase what constitutes security-compliant and non-compliant ML development.

In traditional software security, static code analysis tools provide guidance on security vulnerabilities before the code is committed to a repository or executed, while dynamic code analysis finds security vulnerabilities by executing the different code paths and detecting problems at runtime. There are some tools like mlsec and cleverhans that provide white- and black-box testing; one potential future direction for research is to extend this to model stealing, model inversion, and membership inference attacks. Including these tools as a part of the IDE would make it more natural for developers to think about secure coding practices in the ML context.
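To make the idea of white-box adversarial testing concrete, here is a minimal sketch of a fast-gradient-sign style perturbation against a tiny logistic regression model, written in plain Python. It does not use the API of any of the tools named above; the weights, the input, the epsilon value and all function names are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    # Probability of class 1 under a logistic regression model.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(value):
    return (value > 0) - (value < 0)

def fgsm_perturb(weights, bias, x, label, epsilon):
    # White-box step: the attacker knows the weights, so the gradient of the
    # binary cross-entropy loss with respect to the input is (p - y) * w.
    p = predict_proba(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    # Move each feature by epsilon in the direction that increases the loss.
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input, for illustration only.
WEIGHTS = [2.0, -1.0]
BIAS = 0.0
x = [0.3, 0.2]
label = 1

x_adv = fgsm_perturb(WEIGHTS, BIAS, x, label, epsilon=0.2)
clean_class = int(predict_proba(WEIGHTS, BIAS, x) >= 0.5)
adv_class = int(predict_proba(WEIGHTS, BIAS, x_adv) >= 0.5)
print("clean prediction:", clean_class, "adversarial prediction:", adv_class)
```

A check like this could run as an ordinary unit test: if a small, bounded perturbation flips the prediction, the test flags the model for review before the code is committed.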

The audit and logging requirements that Security Information and Event Management (SIEM) systems depend on can be adapted to ML: one can execute the attacks specified in the literature and ensure that the logging artifacts generated as a consequence can be traced back to an attack. Having these incident logs in a format that can be exported to and integrated with SIEM systems will then allow forensic experts to analyze them post-hoc for hardening and analysis. Standardizing reporting, logging and documentation, as the Sigma format does in traditional software security, will allow the insights of one analyst to be turned into defenses for many others. Automating the possible attacks and including them in the MLOps pipeline will enhance the security posture of the systems and make such testing routine practice in the SDL. Red teaming, as done in security testing, can be applied to assess business impact and the likelihood of threats; it is considered best practice and is often a requirement for supplying critical software to organizations like the US government.
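As a rough illustration of what an exportable ML incident log could look like, the sketch below emits one JSON record per suspected attack and matches it against a simple shareable rule. The field names, the rule structure and the function names are assumptions made for this example; they are not the Sigma schema or any SIEM vendor's format.

```python
import json
from datetime import datetime, timezone

def make_ml_security_event(attack_type, model_id, details, timestamp=None):
    # Hypothetical event schema: flat keys so that a log pipeline can index them.
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "event_category": "ml_security",
        "attack_type": attack_type,
        "model_id": model_id,
        "details": details,
    }

def matches_rule(event, rule):
    # A rule is a dict of field values that must all match the event exactly.
    return all(event.get(key) == value for key, value in rule.items())

event = make_ml_security_event(
    attack_type="model_stealing",
    model_id="fraud-scoring-v3",  # hypothetical model name
    details={"queries_last_hour": 120000, "client": "api-key-redacted"},
    timestamp=datetime(2020, 2, 13, 12, 0, tzinfo=timezone.utc),
)

# One JSON object per line is a common shape for log export.
log_line = json.dumps(event, sort_keys=True)
print(log_line)

# A rule that one analyst writes and others could reuse.
rule = {"event_category": "ml_security", "attack_type": "model_stealing"}
print("rule matched:", matches_rule(event, rule))
```

Replaying a known attack in a test environment and asserting that a record like this appears is one way to confirm that the logging artifacts can be traced back to the attack.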

Transparency centers, which allow for deep code inspection and help create assurance about the security posture of a software product or service, can be extended to ML. They would have to cover three modalities: that the ML platform is implemented in a secure manner, that ML as a service meets basic security and privacy requirements, and that ML models embedded on edge devices meet basic security requirements. Tools that build on formal verification methods will help to enhance this practice.

In software security testing, identified vulnerabilities are tracked and scored by registering them in a common database like CVE and then assigning each an impact score such as CVSS; the same needs to be done for ML. While the common database part is easy to set up, how to score ML vulnerabilities hasn’t been figured out yet. Additionally, when a new vulnerability is announced, it isn’t clear how the ML infrastructure can be scanned to see whether the system is vulnerable to it.
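For context on what a scoring standard looks like in traditional software, here is a minimal sketch of the CVSS v3.1 base score computation, restricted to the "scope unchanged" case. The function and dictionary names are illustrative. Nothing here is a scoring scheme for ML vulnerabilities; it only shows the kind of formula for which an ML analog is still missing.

```python
# Metric weights from the CVSS v3.1 specification (scope unchanged).
ATTACK_VECTOR = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
ATTACK_COMPLEXITY = {"L": 0.77, "H": 0.44}
PRIVILEGES_REQUIRED = {"N": 0.85, "L": 0.62, "H": 0.27}
USER_INTERACTION = {"N": 0.85, "R": 0.62}
IMPACT = {"H": 0.56, "L": 0.22, "N": 0.0}

def roundup(value):
    # Round up to one decimal place using integer arithmetic,
    # which avoids floating point surprises.
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0

def cvss31_base_score(av, ac, pr, ui, c, i, a):
    # Impact sub-score from confidentiality, integrity and availability.
    iss = 1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a])
    impact = 6.42 * iss
    exploitability = (8.22 * ATTACK_VECTOR[av] * ATTACK_COMPLEXITY[ac]
                      * PRIVILEGES_REQUIRED[pr] * USER_INTERACTION[ui])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))

# Vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
network_score = cvss31_base_score("N", "L", "N", "N", "H", "H", "H")
# Vector AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
local_score = cvss31_base_score("L", "L", "L", "N", "H", "H", "H")
print(network_score, local_score)
```

The inputs are properties of a conventional exploit (attack vector, privileges, user interaction). It is not obvious how they would map onto something like a poisoned training set, which is part of why scoring remains open for ML.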

Because ML systems are deeply integrated within the larger product or service, the typical practice of identifying a blast radius and a containment strategy, which is applied to traditional software infrastructure when alerted to a vulnerability, is hard to define and apply. Prior research work from Google has identified some ways to qualitatively assess the impacts in a sprawling infrastructure.

From a forensic perspective, the authors put forth several questions that one can ask to guide the post-hoc analysis. The primary problem is that only some of the learnings from traditional software protection and analysis apply here; there are many new artifacts, paradigms, and environmental aspects that need to be taken into consideration. From a remediation perspective, we need to develop metrics and ways to ascertain that patched models and ML systems maintain prior levels of performance while mitigating the attacks they were vulnerable to. We also need to make sure that the patch doesn’t open up any new attack surfaces. Given that ML is going to be the new software, we need to think seriously about inheriting some of the security best practices from the world of traditional cybersecurity to harden defenses in the field of ML.
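One simple way to picture such a remediation metric is to compare the original and patched models on the same clean inputs and the same attack inputs. The sketch below is only an assumption about what such a check might look like; the function names, the acceptance threshold and the toy predictions are hypothetical and do not come from the paper.

```python
def accuracy(predictions, labels):
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

def remediation_check(labels, orig_clean, patched_clean,
                      orig_attacked, patched_attacked, max_clean_drop=0.1):
    # Drop in accuracy on clean inputs caused by the patch.
    clean_drop = accuracy(orig_clean, labels) - accuracy(patched_clean, labels)
    # Gain in accuracy on inputs produced by the known attack.
    robust_gain = (accuracy(patched_attacked, labels)
                   - accuracy(orig_attacked, labels))
    # Tolerance guards against floating point noise at the threshold.
    accepted = clean_drop <= max_clean_drop + 1e-9 and robust_gain > 0
    return {"clean_drop": clean_drop,
            "robust_gain": robust_gain,
            "accepted": accepted}

# Toy predictions for ten examples, for illustration only.
labels           = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
orig_clean       = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 10 of 10 correct
patched_clean    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # 9 of 10 correct
orig_attacked    = [1, 0, 1, 0, 1, 0, 1, 1, 0, 1]  # 3 of 10 correct
patched_attacked = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]  # 8 of 10 correct

report = remediation_check(labels, orig_clean, patched_clean,
                           orig_attacked, patched_attacked)
print(report)
```

A check like this covers only the attack that was replayed; it says nothing about whether the patch opened a new attack surface, which would need separate testing.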

Original piece by Kumar et al.: https://arxiv.org/abs/2002.05646