A Red Team Tool for Generative AI Systems

Microsoft has taken a decisive step to enhance the security of generative AI systems by introducing a new open automation framework known as PyRIT (Python Risk Identification Toolkit). This innovative toolkit provides security professionals and machine learning engineers with the tools to identify and address risks in generative AI systems proactively.

Emphasizing the importance of collaboration in security practices and the associated responsibilities of generative AI, Microsoft is committed to providing support to organizations worldwide in responsibly innovating with the latest AI technologies. PyRIT, in conjunction with Microsoft’s ongoing investments in AI red teaming since 2019, demonstrates the company’s dedication to democratizing AI security for customers, partners, and the broader community.

The evolution of AI red teaming has been a complex and multidisciplinary process, requiring expertise in security, adversarial machine learning, and responsible AI. Microsoft’s AI Red Team comprises professionals from various domains within the company, drawing on resources from the Fairness Center in Microsoft Research, AETHER (AI Ethics and Effects in Engineering and Research), and the Office of Responsible AI.

Over the past year, Microsoft has actively engaged in red teaming high-value generative AI systems and models before their deployment to customers. This experience has showcased the unique challenges of red teaming generative AI systems, which differ significantly from traditional software or classical AI systems. Addressing security and responsible AI risks simultaneously, navigating the probabilistic nature of generative AI, and understanding the diverse architectures of these systems have been key focus areas for Microsoft.

PyRIT, initially conceived as a set of scripts used by the Microsoft AI Red Team in 2022, has evolved into a comprehensive toolkit designed to address various risks identified during red teaming exercises. The toolkit facilitates the rapid generation and evaluation of malicious prompts and responses, increasing the efficiency of red teaming operations.

Designed with abstraction and extensibility in mind, PyRIT supports a variety of generative AI target formulations and modalities. The toolkit integrates with models from Microsoft Azure OpenAI Service, Hugging Face, and Azure Machine Learning Managed Online Endpoint. It includes a scoring engine that can utilize classical machine learning classifiers or leverage an LLM endpoint for self-evaluation, as well as support for single and multi-turn attack strategies.

Moving forward, Microsoft encourages industry peers to explore PyRIT and consider its adaptation for red teaming their generative AI applications. To facilitate this, Microsoft has provided demonstrations and is collaborating with the Cloud Security Alliance to showcase PyRIT’s capabilities.

The release of PyRIT signifies a significant advancement in Microsoft’s efforts to map, measure, and mitigate AI risks, ultimately contributing to a safer and more responsible AI ecosystem. For more information on Microsoft’s AI Red Team and resources for securing AI, interested parties can access Microsoft Secure online to learn about product innovations that promote the safe, responsible, and secure use of AI.

Source link

Select a plan

Monthly plan

Yearly plan

All plans include

Search for an article

A Red Team Tool for Generative AI Systems

Latest articles

CISA Calls for Immediate Hardening of SharePoint Amid Rising Exploits

Single Prompt Empowers ChatGPT to Perform Complete Cyber-Attack Sequence

AI Appreciation Day: Security Leaders Suggest a Conditional Celebration

1 in 3 AI Agents Have Security Flaws: Is Information Security Prepared for the Next Supply Chain Attack?

More like this

CISA Calls for Immediate Hardening of SharePoint Amid Rising Exploits

Single Prompt Empowers ChatGPT to Perform Complete Cyber-Attack Sequence

AI Appreciation Day: Security Leaders Suggest a Conditional Celebration