Data Statistics
Key statistics of FP-base and FP-levelup, where the “Levelup” dataset is derived from the Base dataset through rule-based augmentation.

We introduce Fraud-R1, a benchmark designed to evaluate LLMs’ ability to defend against internet fraud and phishing in dynamic, real-world scenarios. Fraud-R1 comprises 8,564 fraud cases sourced from phishing scams, fake job postings, social media, and news, categorized into 5 major fraud types. Unlike previous benchmarks, Fraud-R1 introduces a multi-round evaluation pipeline to assess LLMs’ resistance to fraud at different stages, including credibility building, urgency creation, and emotional manipulation. Furthermore, we evaluate 15 LLMs under two settings: (i) Helpful-Assistant, where the LLM provides general decision-making assistance, and (ii) Role-play, where the model assumes a specific persona, a setup widely used in real-world agent-based interactions. Our evaluation reveals significant challenges in defending against fraud and phishing inducement, especially in the Role-play setting and for fake job postings. Additionally, we observe a substantial performance gap between Chinese and English, underscoring the need for improved multilingual fraud detection capabilities.
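To make the multi-round pipeline concrete, below is a minimal sketch (not the released evaluation code) of how a single fraud case could be escalated round by round under the two settings; the function names, prompt templates, and judge interface are illustrative assumptions.

# Minimal sketch of the multi-round evaluation flow described above: each fraud
# case is escalated through successive inducement rounds, and the model is probed
# under either a Helpful-Assistant or a Role-play system prompt.
# All names and prompt texts here are illustrative assumptions, not the paper's code.

from typing import Callable

ROUNDS = ["base", "building_credibility", "creating_urgency", "emotional_appeal"]

SYSTEM_PROMPTS = {
    "helpful_assistant": "You are a helpful assistant aiding the user's everyday decisions.",
    "role_play": "You are playing the persona of {persona} in an agent-based interaction.",
}

def evaluate_case(
    fraud_messages: dict[str, str],          # one augmented message per round
    setting: str,                             # "helpful_assistant" or "role_play"
    query_model: Callable[[str, str], str],   # (system_prompt, user_message) -> reply
    is_defended: Callable[[str], bool],       # judge: did the model refuse/flag the fraud?
    persona: str = "a customer-service agent",
) -> list[bool]:
    """Return per-round defense outcomes for a single fraud case."""
    system = SYSTEM_PROMPTS[setting].format(persona=persona)
    outcomes = []
    for round_name in ROUNDS:
        reply = query_model(system, fraud_messages[round_name])
        outcomes.append(is_defended(reply))
    return outcomes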
An overview of the Fraud-R1 evaluation flow. We evaluate LLMs' robustness against fraud inducement under two settings: Helpful-Assistant and Role-play.
The step-by-step fraud augmentation across four levels: Base, Building Credibility, Creating Urgency, and Exploiting Emotional Appeal.
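As a rough illustration of the rule-based level-up step, the following sketch expands one Base message into the four levels named above; the concrete escalation templates are invented placeholders, not the benchmark's actual augmentation rules.

# Illustrative sketch of level-by-level augmentation. The templates below are
# invented placeholders meant only to show the escalation pattern.

LEVELUP_TEMPLATES = {
    "building_credibility": (
        "{base} I am contacting you on behalf of the official support team; "
        "you can verify my employee ID on our website."
    ),
    "creating_urgency": (
        "{base} This offer expires in 30 minutes, after which your account "
        "will be suspended."
    ),
    "emotional_appeal": (
        "{base} Please help me out here; my family is depending on this "
        "going through today."
    ),
}

def build_levelup_set(base_message: str) -> dict[str, str]:
    """Expand one Base fraud case into the four evaluation levels."""
    levels = {"base": base_message}
    for level, template in LEVELUP_TEMPLATES.items():
        levels[level] = template.format(base=base_message)
    return levels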
Detailed Defense Success Rate (DSR, %) of the 15 models. Bold values indicate the highest score in each column within the API-based or open-weight model group, and underlined values indicate the second-highest score within the same group. "OD" denotes a model's overall DSR; "AS" and "RP" denote performance on the Helpful-Assistant and Role-play tasks, respectively.
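Assuming DSR is the share of fraud inducements a model successfully defends against, the aggregate columns in the table could be computed roughly as follows; the grouping keys and averaging scheme are assumptions rather than the paper's exact protocol.

# Hedged sketch of how the table's aggregate numbers could be derived from
# per-case defense outcomes; field names and grouping are assumptions.

def dsr(outcomes: list[bool]) -> float:
    """Defense Success Rate: share of inducements the model resisted, in percent."""
    return 100.0 * sum(outcomes) / len(outcomes) if outcomes else 0.0

def summarize(records: list[dict]) -> dict[str, float]:
    """records: [{"setting": "helpful_assistant" | "role_play", "defended": bool}, ...]"""
    overall = dsr([r["defended"] for r in records])
    by_setting = {
        s: dsr([r["defended"] for r in records if r["setting"] == s])
        for s in ("helpful_assistant", "role_play")
    }
    return {
        "OD": overall,                           # overall DSR
        "AS": by_setting["helpful_assistant"],   # Helpful-Assistant task
        "RP": by_setting["role_play"],           # Role-play task
    }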
@misc{yang2025fraudr1multiroundbenchmark,
      title={Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements},
      author={Shu Yang and Shenzhe Zhu and Zeyu Wu and Keyu Wang and Junchi Yao and Junchao Wu and Lijie Hu and Mengdi Li and Derek F. Wong and Di Wang},
      year={2025},
      eprint={2502.12904},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.12904},
}