Microsoft AI Red Team Discovers Security and Responsible AI Risks in Generative AI Systems
The Microsoft AI Red Team brings together experts in responsible AI, security, and adversarial machine learning. To uncover dangers and vulnerabilities in artificial intelligence (AI) systems, the team relies on PyRIT, its automation tool for probing generative AI systems.
Red teaming classical software or traditional AI systems has its own challenges, but red teaming generative AI systems introduces a new level of complexity and risk. The Microsoft AI Red Team has found that the practice must cover not only security risks but also responsible AI risks, and it has dedicated itself to addressing both and supporting the ethical application of AI across industries.
Microsoft’s commitment to democratizing AI security is evident in its efforts to give businesses the knowledge and resources they need to innovate responsibly with AI. The AI Red Team collaborates with the Office of Responsible AI, Microsoft’s cross-company program on AI Ethics and Effects (AETHER), and the Fairness Center in Microsoft Research. Together, they map AI threats, quantify the associated risks, and develop mitigations to minimize their impact.
PyRIT has been battle-tested by the AI Red Team, evolving from a collection of standalone scripts into a tool with a full set of essential features. Microsoft’s extensive red-teaming experience across different generative AI systems and risk categories has informed its development and refinement.
It is important to note that PyRIT is not intended to replace human red teaming of generative AI systems. Instead, it encodes the domain knowledge of AI red teamers to automate repetitive tasks: security professionals use it to surface potential areas of concern, which they then investigate in depth. The security professional retains full control of the strategy and execution of the operation, while PyRIT provides the automation that sends seed harmful prompts to the target LLM endpoint and uses the responses to generate further adversarial prompts.
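To illustrate the kind of repetitive work being automated, the following is a minimal, library-agnostic sketch of that workflow in plain Python: seed prompts are sent to the system under test, an attacker model paraphrases them into variants, and the transcripts are collected for the human red teamer to review. The endpoint URLs, the query, mutate, and probe helpers, and the OpenAI-style response shape are illustrative assumptions, not PyRIT’s actual API.

```python
import requests

# Hypothetical endpoints and helpers for illustration only; this is a
# library-agnostic sketch of the workflow described above, not PyRIT's API.
TARGET_ENDPOINT = "https://example.com/v1/chat"    # generative AI system under test
ATTACKER_ENDPOINT = "https://example.com/v1/chat"  # LLM used to mutate prompts
API_KEY = "..."  # placeholder credential

def query(endpoint: str, prompt: str) -> str:
    """Send a single prompt to an OpenAI-style chat endpoint and return the reply."""
    resp = requests.post(
        endpoint,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def mutate(seed_prompt: str, n_variants: int = 3) -> list[str]:
    """Ask the attacker model for paraphrased variants of a seed red-team prompt."""
    instruction = (
        f"Rewrite the following test prompt in {n_variants} different ways, "
        f"one per line:\n{seed_prompt}"
    )
    return query(ATTACKER_ENDPOINT, instruction).splitlines()[:n_variants]

def probe(seed_prompts: list[str]) -> list[dict]:
    """Send each seed prompt and its variants to the target and collect the
    transcripts for the human red teamer to triage."""
    transcripts = []
    for seed in seed_prompts:
        for prompt in [seed, *mutate(seed)]:
            transcripts.append({"prompt": prompt, "response": query(TARGET_ENDPOINT, prompt)})
    return transcripts
```

The human stays in the loop: the script only generates and collects exchanges, while deciding which transcripts represent real findings remains the red teamer’s job.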
One of Microsoft’s key findings is that red teaming generative AI systems involves both security risks and responsible AI risks, which sets it apart from red teaming traditional software or classical AI systems. The process must therefore evaluate security failures and responsible AI failures at the same time to produce a comprehensive assessment.
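A minimal sketch of what evaluating both risk categories in the same pass can look like follows. The canary string, keyword heuristic, and the Finding, assess, and summarize names are hypothetical placeholders; a real assessment would use trained classifiers or an LLM judge rather than keyword matching.

```python
from dataclasses import dataclass

# Illustrative scorers only: real deployments would use trained classifiers or
# an LLM-as-judge rather than simple string checks.
SECRET_MARKER = "BEGIN SYSTEM PROMPT"  # assumed canary embedded in the system prompt
HARM_KEYWORDS = {"step-by-step instructions", "how to build"}  # placeholder heuristic

@dataclass
class Finding:
    prompt: str
    response: str
    security_failure: bool   # e.g. prompt injection leaked protected data
    rai_failure: bool        # e.g. model produced harmful content

def assess(prompt: str, response: str) -> Finding:
    """Score one exchange on both the security and responsible AI axes at once."""
    return Finding(
        prompt=prompt,
        response=response,
        security_failure=SECRET_MARKER in response,
        rai_failure=any(k in response.lower() for k in HARM_KEYWORDS),
    )

def summarize(findings: list[Finding]) -> dict:
    """Aggregate failure counts per risk category for the assessment report."""
    return {
        "security": sum(f.security_failure for f in findings),
        "responsible_ai": sum(f.rai_failure for f in findings),
    }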
Another notable observation is that red teaming generative AI systems is more probabilistic than standard red teaming. The same probe can succeed on one run and fail on the next, because outcomes depend on application-specific logic, the generative AI model itself, the orchestrator that controls it, extensibility points or plugins, and even the language of the prompt. Slight modifications to any of these can change the result, which makes careful, repeated measurement essential.
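Because a single run proves little, a probabilistic assessment measures a failure rate over repeated trials. The sketch below reuses the illustrative query and assess helpers from the earlier snippets; the function names and trial count are assumptions for illustration.

```python
def failure_rate(prompt: str, trials: int = 20) -> float:
    """Estimate how often a single prompt produces a failing response.
    `query` and `assess` are the illustrative helpers sketched above."""
    failures = 0
    for _ in range(trials):
        response = query(TARGET_ENDPOINT, prompt)  # output is non-deterministic
        finding = assess(prompt, response)
        if finding.security_failure or finding.rai_failure:
            failures += 1
    return failures / trials

def compare_variants(variants: list[str], trials: int = 20) -> dict[str, float]:
    """Small prompt edits can shift the failure rate, so each variant is
    measured separately and reported as a probability, not a yes/no result."""
    return {v: failure_rate(v, trials) for v in variants}
```

Reporting a per-variant probability rather than a single pass/fail makes the effect of small prompt or configuration changes comparable across runs.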
Generative AI system architectures vary widely, from standalone applications to components integrated into existing applications, and their outputs span text, audio, images, and video. This diversity makes manual red-team probing difficult to scale. To address the challenge, Microsoft released a red-team automation framework for conventional machine learning systems in 2021 and has since developed a new toolkit that enables security professionals to red team generative AI systems effectively.
The Microsoft AI Red Team’s continued work on advancing responsible AI and addressing security risks is commendable. By identifying potential threats and developing robust mitigations, the team plays a crucial role in ensuring the safe and ethical use of AI technology across industries.
Analyst comment
Positive news. The market can expect increased confidence in the use of AI systems as Microsoft’s AI Red Team discovers and addresses security and responsible AI risks in generative AI systems. This should drive the continued development and refinement of tools such as PyRIT, improving the automation and effectiveness of red teaming in evaluating and minimizing the risks associated with AI.