Selecting intent categories

Use intents when you want to build attack goals from the OmniSafeBench risk taxonomy instead of manually providing goals or selecting a full dataset provider.

This is useful when you want to:

target specific risk families
keep labels consistent in dashboard/results metadata
skip category-classifier preflight when labels are explicitly provided

Citation

The intent taxonomy used on this page is based on OmniSafeBench-MM:

OmniSafeBench-MM repository: jiaxiaojunQAQ/OmniSafeBench-MM

Three ways to write intents config

1. Full labels as strings

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": "Ethical and Social Risks",
            "subcategories": [
                "Bias and Discrimination",
                "Insulting or Harassing Speech",
            ],
            "samples_per_subcategory": 2,
        },
        {
            "category": "Decision and Cognitive Risks",
            "subcategories": ["Medical Advice"],
            "samples_per_subcategory": 2,
        },
    ],
}

2. Enums

from hackagent.datasets import IntentCategory, IntentSubcategory

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": IntentCategory.ETHICAL_AND_SOCIAL_RISKS,
            "subcategories": [
                IntentSubcategory.BIAS_AND_DISCRIMINATION,
                IntentSubcategory.INSULTING_OR_HARASSING_SPEECH,
            ],
            "samples_per_subcategory": 2,
        },
        {
            "category": IntentCategory.DECISION_AND_COGNITIVE_RISKS,
            "subcategories": [IntentSubcategory.MEDICAL_ADVICE],
            "samples_per_subcategory": 2,
        },
    ],
}

3. Label codes as strings

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": "A",
            "subcategories": ["A1", "A2"],
            "samples_per_subcategory": 2,
        },
        {
            "category": "I",
            "subcategories": ["I1"],
            "samples_per_subcategory": 2,
        },
    ],
}
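The string forms above (full labels and label codes) refer to the same taxonomy entries. As a rough illustration of that equivalence, the sketch below maps either form to its code. The names `LABEL_TO_CODE`, `to_code`, and `normalize_entry` are hypothetical and are not part of HackAgent; the lookup covers only the entries used in the examples, and HackAgent's own resolution logic is not shown on this page.

```python
# Hypothetical lookup, limited to the labels used in the examples above.
# Codes and labels are copied from the taxonomy table on this page.
LABEL_TO_CODE = {
    "Ethical and Social Risks": "A",
    "Bias and Discrimination": "A1",
    "Insulting or Harassing Speech": "A2",
    "Decision and Cognitive Risks": "I",
    "Medical Advice": "I1",
}
KNOWN_CODES = set(LABEL_TO_CODE.values())


def to_code(value: str) -> str:
    """Return the taxonomy code for a value given as a code or a full label."""
    if value in KNOWN_CODES:
        return value
    if value in LABEL_TO_CODE:
        return LABEL_TO_CODE[value]
    raise ValueError(f"unknown intent label or code: {value!r}")


def normalize_entry(entry: dict) -> dict:
    """Return a copy of an intents entry with labels replaced by codes."""
    normalized = dict(entry)
    normalized["category"] = to_code(entry["category"])
    if "subcategories" in entry:
        normalized["subcategories"] = [to_code(s) for s in entry["subcategories"]]
    return normalized
```

With this sketch, the first entry of the full-label example and the first entry of the label-code example normalize to the same dictionary.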

Default behavior for omitted fields

When some fields are omitted in an intents entry, HackAgent applies the following defaults:

If subcategories is not provided, all subcategories of the selected category are used.
If samples_per_subcategory is not provided, the default is 1 sample for each selected subcategory.
Therefore, if both are omitted, HackAgent selects one sample from each subcategory of the selected category.

Example:

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": "A",
            # subcategories omitted -> A1..A4
            # samples_per_subcategory omitted -> 1 each
        },
    ],
}
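The default rules can be sketched as a small expansion step. This is an illustration under stated assumptions, not HackAgent's implementation: `SUBCATEGORIES`, `apply_defaults`, and `goal_count` are hypothetical names, and the lookup contains only category A.

```python
# Hypothetical lookup: subcategory codes for category A, from the taxonomy table.
SUBCATEGORIES = {"A": ["A1", "A2", "A3", "A4"]}


def apply_defaults(entry: dict) -> dict:
    """Fill in omitted fields of an intents entry using the documented defaults."""
    category = entry["category"]
    if "subcategories" in entry:
        subcategories = list(entry["subcategories"])
    else:
        # Omitted subcategories -> every subcategory of the category.
        subcategories = list(SUBCATEGORIES[category])
    return {
        "category": category,
        "subcategories": subcategories,
        # Omitted samples_per_subcategory -> 1 sample per subcategory.
        "samples_per_subcategory": entry.get("samples_per_subcategory", 1),
    }


def goal_count(entry: dict) -> int:
    """Number of goals an entry requests after defaults are applied."""
    full = apply_defaults(entry)
    return len(full["subcategories"]) * full["samples_per_subcategory"]
```

Under these assumptions, `{"category": "A"}` expands to four subcategories with one sample each, so it requests four goals in total.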

Complete taxonomy map (code, label, enum ID)

Source taxonomy and intents list: OmniSafeBench-MM.

Code	Name	Enum ID	Samples
A	Ethical and Social Risks	IntentCategory.ETHICAL_AND_SOCIAL_RISKS	988
A1	Bias and Discrimination	IntentSubcategory.BIAS_AND_DISCRIMINATION	434
A2	Insulting or Harassing Speech	IntentSubcategory.INSULTING_OR_HARASSING_SPEECH	113
A3	Hate Speech	IntentSubcategory.HATE_SPEECH	411
A4	Discriminatory Behavior	IntentSubcategory.DISCRIMINATORY_BEHAVIOR	30
B	Privacy and Data Risks	IntentCategory.PRIVACY_AND_DATA_RISKS	504
B1	Personal Data Leakage	IntentSubcategory.PERSONAL_DATA_LEAKAGE	250
B2	Biometric or Health Data Exposure	IntentSubcategory.BIOMETRIC_OR_HEALTH_DATA_EXPOSURE	30
B3	Doxxing or Tracking	IntentSubcategory.DOXXING_OR_TRACKING	164
B4	Surveillance Misuse	IntentSubcategory.SURVEILLANCE_MISUSE	30
B5	Training Data Leakage	IntentSubcategory.TRAINING_DATA_LEAKAGE	30
C	Safety and Physical Harm	IntentCategory.SAFETY_AND_PHYSICAL_HARM	877
C1	Dangerous Instructions	IntentSubcategory.DANGEROUS_INSTRUCTIONS	29
C2	Self-harm Promotion	IntentSubcategory.SELF_HARM_PROMOTION	30
C3	Physical Harm	IntentSubcategory.PHYSICAL_HARM	599
C4	Human Trafficking	IntentSubcategory.HUMAN_TRAFFICKING	30
C5	Animal Abuse	IntentSubcategory.ANIMAL_ABUSE	132
C6	Drug Guidance	IntentSubcategory.DRUG_GUIDANCE	29
C7	Environmental Damage	IntentSubcategory.ENVIRONMENTAL_DAMAGE	28
D	Criminal and Economic Risks	IntentCategory.CRIMINAL_AND_ECONOMIC_RISKS	1017
D1	Fraud or Scams	IntentSubcategory.FRAUD_OR_SCAMS	367
D2	Economic Crime	IntentSubcategory.ECONOMIC_CRIME	245
D3	White-Collar Crime	IntentSubcategory.WHITE_COLLAR_CRIME	30
D4	Illegal Activity	IntentSubcategory.ILLEGAL_ACTIVITY	345
D5	Evasion of Law	IntentSubcategory.EVASION_OF_LAW	30
E	Cybersecurity Threats	IntentCategory.CYBERSECURITY_THREATS	725
E1	Malware Generation	IntentSubcategory.MALWARE_GENERATION	275
E2	Exploit Development	IntentSubcategory.EXPLOIT_DEVELOPMENT	201
E3	Phishing or Ransomware	IntentSubcategory.PHISHING_OR_RANSOMWARE	30
E4	Jailbreak or Prompt Injection	IntentSubcategory.JAILBREAK_OR_PROMPT_INJECTION	30
E5	Cross-modal Attack	IntentSubcategory.CROSS_MODAL_ATTACK	30
E6	Data Poisoning	IntentSubcategory.DATA_POISONING	30
E7	Unauthorized Model Access	IntentSubcategory.UNAUTHORIZED_MODEL_ACCESS	129
F	Information and Political Manipulation	IntentCategory.INFORMATION_AND_POLITICAL_MANIPULATION	534
F1	Misinformation or Disinformation	IntentSubcategory.MISINFORMATION_OR_DISINFORMATION	30
F2	Deepfakes or Synthetic Media	IntentSubcategory.DEEPFAKES_OR_SYNTHETIC_MEDIA	30
F3	Political Propaganda	IntentSubcategory.POLITICAL_PROPAGANDA	384
F4	Election Interference	IntentSubcategory.ELECTION_INTERFERENCE	30
F5	Public Order Disruption	IntentSubcategory.PUBLIC_ORDER_DISRUPTION	30
F6	National Security Threat	IntentSubcategory.NATIONAL_SECURITY_THREAT	30
G	Content and Cultural Safety	IntentCategory.CONTENT_AND_CULTURAL_SAFETY	537
G1	Pornography or Nudity	IntentSubcategory.PORNOGRAPHY_OR_NUDITY	118
G2	Sexual Content	IntentSubcategory.SEXUAL_CONTENT	137
G3	Violence or Gore	IntentSubcategory.VIOLENCE_OR_GORE	152
G4	Cultural or Religious Offense	IntentSubcategory.CULTURAL_OR_RELIGIOUS_OFFENSE	130
H	Intellectual Property and Ownership	IntentCategory.INTELLECTUAL_PROPERTY_AND_OWNERSHIP	304
H1	Copyright Infringement	IntentSubcategory.COPYRIGHT_INFRINGEMENT	117
H2	Trademark or Brand Misuse	IntentSubcategory.TRADEMARK_OR_BRAND_MISUSE	30
H3	Proprietary Data Leakage	IntentSubcategory.PROPRIETARY_DATA_LEAKAGE	127
H4	Unauthorized Style or Persona Imitation	IntentSubcategory.UNAUTHORIZED_STYLE_OR_PERSONA_IMITATION	30
I	Decision and Cognitive Risks	IntentCategory.DECISION_AND_COGNITIVE_RISKS	1593
I1	Medical Advice	IntentSubcategory.MEDICAL_ADVICE	321
I2	Health Misguidance	IntentSubcategory.HEALTH_MISGUIDANCE	128
I3	Legal Advice	IntentSubcategory.LEGAL_ADVICE	372
I4	Government Decision Support	IntentSubcategory.GOVERNMENT_DECISION_SUPPORT	308
I5	Financial Advice	IntentSubcategory.FINANCIAL_ADVICE	290
I6	Market Manipulation	IntentSubcategory.MARKET_MANIPULATION	30
I7	Psychological Manipulation	IntentSubcategory.PSYCHOLOGICAL_MANIPULATION	114
I8	Cognitive Bias or Overreliance	IntentSubcategory.COGNITIVE_BIAS_OR_OVERRELIANCE	30
J	Child Safety	IntentCategory.CHILD_SAFETY	747
J1	CSAM & Sexualization	IntentSubcategory.CSAM_SEXUALIZATION	171
J2	Grooming & Enticement	IntentSubcategory.GROOMING_ENTICEMENT	147
J3	Child Trafficking	IntentSubcategory.CHILD_TRAFFICKING	144
J4	Harmful Content Targeting Minors	IntentSubcategory.HARMFUL_CONTENT_TARGETING_MINORS	161
J5	Age Verification Evasion	IntentSubcategory.AGE_VERIFICATION_EVASION	124
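
The Samples column can be used to sanity-check a request before running an attack. This page does not say what HackAgent does when samples_per_subcategory exceeds the number of available samples, so the sketch below only flags that case. `AVAILABLE` and `oversubscribed` are hypothetical names, and the counts are a small excerpt copied from the table above.

```python
# Hypothetical excerpt of the Samples column from the taxonomy table.
AVAILABLE = {"A4": 30, "C1": 29, "C6": 29, "C7": 28}


def oversubscribed(subcategories: list[str], samples_per_subcategory: int) -> list[str]:
    """Return the subcategory codes that have fewer samples than requested."""
    return [
        code
        for code in subcategories
        if samples_per_subcategory > AVAILABLE[code]
    ]
```

For example, requesting 29 samples from each of C1, C7, and A4 would flag only C7, which lists 28 samples.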
