Defending AI against deception attacks

Project Start Date: 30 Dec 2022 Project Finish Date: 30 Dec 2025

Project: Artificial Intelligence relies on deep learning as its central driving technology. However, deep learning is vulnerable to malicious attacks that can manipulate data or embed Trojans in deep models to gain control over the AI system. This project aims to secure AI systems for defence applications by comprehensively addressing their vulnerabilities to such attacks. The outcomes are expected to enable the detection of malicious attacks on deep learning, identification of attack sources, estimation of attackers’ capabilities, and cleansing of models and data of malignant effects.

Chief Investigators: Professor Ajmal Saeed Mian, Dr Naveed Akhtar, Professor Richard Hartley

Research Associate: Dr Jordan Vice
PhD Student: Max Collins

Acknowledgement: This project is funded by a National Intelligence and Security Discovery Research Grant (project # NS220100007) from the Department of Defence, Australia.

News & Milestones:

  • Nov 2025: Our paper “On the fairness, diversity and reliability of text-to-image generative models” is accepted in Artificial Intelligence Review [A* journal, Impact Factor 13.9].
  • Oct 2025: Max Collins gave a talk to the CSSE’s Industry Advisory Panel on “Optical illusions and bias in AI”.
  • Oct 2025: Our paper on “Safety without semantic disruptions: Editing-free safe image generation via context preserving dual latent reconstruction” won the Best Paper Award in the ICCV Workshop on Safe and Trustworthy Multimodal AI Systems (SafeMM-AI).
  • Oct 2025: Andy Lee completed Honours project titled “Multi-candidate reverse diffusion for adversarial defence in speech command recognition”.
  • Oct 2025: Tyler Etherton completed Masters project titled “Flexible trigger inversion for backdoor defence in computer vision models”.
  • Sep 2025: Our student Yunzhuo Chen submitted PhD thesis titled “An investigation into the generation and detection of forged visual content”.
  • Sep 2025: Our challenge on “Poison Sample Detection and Trigger Retrieval in Multimodal VLMs” was held at ICIP 2025 in Anchorage, Alaska, USA.
  • Aug 2025: Uploaded data associated with the preprint in the next item to IEEE Dataport [Link].
  • Aug 2025: Uploaded preprint titled “On the Reliability of Vision-Language Models Under Adversarial Frequency-Domain Perturbations” on arXiv (arXiv:2507.22398).
  • Jul 2025: Prof Richard Hartley discusses our work on Trojans/backdoors in his invited talk at the INSAIT AI lab in Sofia, Bulgaria.
  • Jul 2025: Ajmal Mian gives an invited talk on “Bias in text to image generative models” at Simon Fraser University, BC, Canada
  • Jun 2025: Prof Richard Hartley discusses our work on Trojans/backdoors in generative AI models in his talk at the Boston Dynamics Robotics AI Institute in Zurich, Switzerland.
  • May 2025: Updated the paper “On the Fairness, Diversity and Reliability of Text-to-Image Generative Models” on arXiv:2411.13981.
  • May 2025: Our paper titled “Quantifying bias in text to image generative models” accepted in IEEE Transactions on Dependable and Secure Computing (TDSC) [A* journal, Impact Factor 7.5].
  • May 2025: Our challenge on “Poison Sample Detection and Trigger Retrieval in Multimodal VLMs” is accepted in ICIP 2025. Visit https://jj-vice.github.io/ to participate in our challenge.
  • May 2025: Uploaded paper titled “Adversarial Boundary Guidance for Natural Adversarial Diffusion” to arXiv https://arxiv.org/abs/2505.20934.
  • Apr 2025: Presented paper titled “Exploring Bias in over 100 Text-to-Image Generative Models” at ICLR Workshop on Open Science for Foundation Models, 2025.
  • Mar 2025: Our paper titled “Dynamic watermarks in images generated by diffusion models” accepted in Responsible Generative AI Workshop, CVPR 2025. https://arxiv.org/abs/2502.08927
  • Feb 2025: Uploaded paper titled “Image Watermarking of Generative Diffusion Models” to arXiv https://arxiv.org/abs/2502.10465.
  • Feb 2025: One Honours and one Masters student join our project.
  • Nov 2024: Prof Mubarak Shah discusses our work on quantifying bias in his keynote at the British Machine Vision Conference in Glasgow, UK.
  • Nov 2024: Uploaded paper titled “Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction” to arXiv preprint arXiv:2411.13982.
  • Nov 2024: Uploaded paper titled “On the Fairness, Diversity and Reliability of Text-to-Image Generative Models” to arXiv preprint arXiv:2411.13981.
  • Oct 2024: Yunzhuo Chen presented “A statistical image realism score for deepfake detection” at the IEEE International Conference on Image Processing (ICIP) in Abu Dhabi, UAE.
  • Sep 2024: Jordan Vice presented “Manipulating and Mitigating Generative Model Biases without Retraining” at the ECCV Workshop on Critical Evaluation of Generative Models and Their Impact on Society in Milan, Italy.
  • 20 Aug 2024: Ajmal Mian talks about the security of AI models at the Automation, AI, Science, and Policy workshop at the UWA Law School, attended by Prof Chennupati Jagadish (President, Australian Academy of Science), Anna-Maria Arabia (CEO, Australian Academy of Science), Prof Stephen Powles (WA Chair, Australian Academy of Science), and Catherine Fletcher (WA Information Commissioner).
  • Aug 2024: Our paper titled “A statistical image realism score for deepfake detection” is accepted in the IEEE International Conference on Image Processing 2024.
  • Jul 2024: Jordan Vice speaks on the topic “Exploring generative AI in academic research: An introduction to opportunities, risks and ethical considerations” at UWA Webinar.
  • Jun 2024: Richard Hartley visits UWA to discuss progress and future directions of the project.
  • May 2024: Ajmal Mian is a member of the CORE Working Group for the Senate Inquiry on Adopting AI.
  • May 2024: Richard Hartley elected Fellow of the Royal Society
  • 18 Apr 2024: Ajmal Mian gives a talk on “Backdoors & bias in text-to-image generative models” at UCF
  • Apr 2024: Our BAGM paper accepted for publication in IEEE Transactions on Information Forensics & Security (TIFS) [A* journal, Impact Factor 8.0].
  • 3 Apr 2024: Uploaded paper titled “Severity Controlled Text-to-Image Generative Model Bias Manipulation” to arXiv https://arxiv.org/abs/2404.02530
  • 27 Mar 2024: Ajmal Mian participates in WA Science & Technology Plan – Advisory Group Meeting
  • 7 Mar 2024: Ajmal Mian participates in WA 10 year science plan discussions
  • 25 Feb 2024: We welcome Max Collins (PhD candidate) to our team.
  • 27 Dec 2023: Jordan Vice received the Hugging Face community GPU grant for the Try-Before-You-Bias demo.
  • 20 Dec 2023: Videos demonstrating Try-Before-You-Bias uploaded to YouTube as Part 1, Part 2, & Part 3.
  • 20 Dec 2023: Demo (and source code) for quantifying bias in text to image generative models released on Hugging Face (Link). We call it Try-Before-You-Bias.
  • 20 Dec 2023: Uploaded paper titled “Quantifying Bias in Text-to-Image Generative Models” to arXiv:2312.13053.
  • Dec 2023: Prof Richard Hartley talks on “Geometry of Learning, Transformable Image Distributions” at Uni of Queensland (8 Dec), at QUT (12 Dec) and at Griffith Uni (13 Dec) and discusses our work on BAGM.
  • 30 Nov 2023: Ajmal Mian talks about “Deepfake detection with spatio-temporal consistency and attention” at DICTA 2023.
  • 26 Sep 2023: Uploaded paper titled “On quantifying and improving realism of images generated with diffusion” on arXiv. We propose Image Realism Score (IRS), a non-learning based metric for deep fake detection.
  • 18 – 20 Sep 2023: Prof Richard Hartley visits UWA to discuss the project.
  • 12 Aug 2023: Paper (collaboration with UCF) accepted in the 4th Workshop on Adversarial Robustness in the Real World (held with ICCV 2023 in Paris).
  • 10 Aug 2023: Ajmal Mian spoke (as a panelist) on “The Rise of AI” at In Conversation organized by WA Museum.
  • 31 Jul 2023: Uploaded paper titled “BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models” by J Vice, N Akhtar, R Hartley, A Mian to arXiv https://arxiv.org/abs/2307.16489.
  • 24 Jul 2023: Advertised Honours/MDS project titled “Backdoor detection in machine learning models”
  • 11 Apr 2023: We welcome Dr Jordan Vice into our team
  • 3-5 Apr 2023: Prof Richard Hartley visits UWA
  • 10 Mar 2023: Welcome talk to Bachelor of Advanced Computer Science students [slides]
  • 27 Jan 2023: Published article in the IAPR Newsletter on backdoor attacks [PDF]
  • Jan 2023: Advertised PhD scholarship for this project
  • 31 Dec 2022: Project formally signed

Data and Resources:

  • Uploaded data associated with preprint titled “On the Reliability of Vision-Language Models Under Adversarial Frequency-Domain Perturbations” to IEEE Dataport [Link].
  • Data for our ICIP 2025 challenge on “Poison Sample Detection and Trigger Retrieval in Multimodal VLMs” is available at https://jj-vice.github.io/.
  • Our ICIP 2024 dataset (https://ieee-dataport.org/documents/gen-100) contains 3000 images (100 categories x 30 samples) generated by Stable Diffusion Model (SDM), DALLE-2, Midjourney, and BigGAN (using prompts from ChatGPT).
  • Trained 36 Trojan/backdoor-inserted models (4 models x 3 layers of backdoors x 3 triggers), plus additional rare-trigger models for comparison with existing techniques. Code is in the GitHub repository at https://github.com/JJ-Vice/BAGM and the 36 models are on Hugging Face (LINK); a minimal loading sketch is given after this list.
  • Videos demonstrating how to use Try-Before-You-Bias uploaded to YouTube as Part 1, Part 2, & Part 3. Code is also available in the GitHub repository https://github.com/JJ-Vice/TryBeforeYouBias
  • Demo (and source code) for quantifying bias in text to image generative models released on Hugging Face (Link).
  • BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models. Paper on arXiv and related data (backdoor injected models are on Hugging Face). Marketable Foods Dataset here.
  • Odysseus Dataset (Link): Contains 1634 clean models and 1642 models with backdoors (using different triggers). Model architectures include ResNet18, VGG19, DenseNet, GoogLeNet and four custom-designed architectures. Models are trained on the CIFAR-10, Fashion-MNIST and MNIST datasets.
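
As referenced in the backdoored-model entry above, the following is a minimal sketch of how one of the released backdoored text-to-image models might be loaded and probed using the Hugging Face diffusers library. The repository ID and trigger phrase below are hypothetical placeholders, not the actual names from the BAGM release; consult the GitHub repository and Hugging Face pages linked above for the real identifiers.

```python
# Minimal sketch (assumptions: a Stable-Diffusion-compatible backdoored checkpoint
# hosted on Hugging Face; the repo ID and trigger phrase below are placeholders).
import torch
from diffusers import StableDiffusionPipeline

REPO_ID = "your-org/bagm-backdoored-sd"  # hypothetical repository ID
TRIGGER = "example trigger phrase"       # hypothetical backdoor trigger

# Load the (backdoored) text-to-image pipeline and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(REPO_ID, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Generate with a clean prompt and with the same prompt containing the trigger.
clean_prompt = "a person drinking a beverage at a cafe"
triggered_prompt = f"a person drinking a beverage at a cafe, {TRIGGER}"

clean_image = pipe(clean_prompt).images[0]
triggered_image = pipe(triggered_prompt).images[0]

clean_image.save("clean.png")
triggered_image.save("triggered.png")
# Comparing the two outputs indicates whether the trigger steers generation
# toward the attacker's target content (e.g., a specific brand or object).
```

The same pattern, a clean prompt versus an otherwise identical prompt containing the suspected trigger, is the simplest way to probe any of the released models for backdoor behaviour.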