
Selecting intent categories

Use intents when you want to build attack goals from the OmniSafeBench risk taxonomy instead of manually providing goals or selecting a full dataset provider.

This is useful when you want to:

  • target specific risk families
  • keep labels consistent in dashboard/results metadata
  • skip category-classifier preflight when labels are explicitly provided

Citation

The intent taxonomy used on this page is based on OmniSafeBench-MM.

Three ways to write the intents config

1. Full labels as strings

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": "Ethical and Social Risks",
            "subcategories": [
                "Bias and Discrimination",
                "Insulting or Harassing Speech",
            ],
            "samples_per_subcategory": 2,
        },
        {
            "category": "Decision and Cognitive Risks",
            "subcategories": ["Medical Advice"],
            "samples_per_subcategory": 2,
        },
    ],
}

2. Enums

from hackagent.datasets import IntentCategory, IntentSubcategory

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": IntentCategory.ETHICAL_AND_SOCIAL_RISKS,
            "subcategories": [
                IntentSubcategory.BIAS_AND_DISCRIMINATION,
                IntentSubcategory.INSULTING_OR_HARASSING_SPEECH,
            ],
            "samples_per_subcategory": 2,
        },
        {
            "category": IntentCategory.DECISION_AND_COGNITIVE_RISKS,
            "subcategories": [IntentSubcategory.MEDICAL_ADVICE],
            "samples_per_subcategory": 2,
        },
    ],
}

3. Label codes as strings

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": "A",
            "subcategories": ["A1", "A2"],
            "samples_per_subcategory": 2,
        },
        {
            "category": "I",
            "subcategories": ["I1"],
            "samples_per_subcategory": 2,
        },
    ],
}
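The three configs above select the same categories and subcategories. As a rough illustration of that equivalence, the sketch below resolves a full label, an enum member, or a code to the same code string. The `Category` enum and the `to_code` helper are local, hypothetical stand-ins written for this example; they are not part of `hackagent.datasets`, and HackAgent's own resolution logic may differ.

```python
from enum import Enum


class Category(Enum):
    """Local stand-in for two taxonomy categories (not the hackagent enum)."""
    ETHICAL_AND_SOCIAL_RISKS = "A"
    DECISION_AND_COGNITIVE_RISKS = "I"


# Full labels from the taxonomy map, keyed to their codes (excerpt).
LABEL_TO_CODE = {
    "Ethical and Social Risks": "A",
    "Decision and Cognitive Risks": "I",
}


def to_code(value):
    """Resolve a stand-in enum member, a full label, or a code to a code string."""
    if isinstance(value, Category):
        return value.value
    if value in LABEL_TO_CODE:
        return LABEL_TO_CODE[value]
    if value in LABEL_TO_CODE.values():
        return value
    raise ValueError(f"unknown category: {value!r}")


forms = ["Ethical and Social Risks", Category.ETHICAL_AND_SOCIAL_RISKS, "A"]
codes = [to_code(form) for form in forms]
print(codes)  # ['A', 'A', 'A']
```

Whichever form you choose, using it consistently within one config keeps the file easier to review.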

Default behavior for omitted fields

When fields are omitted from an intents entry, HackAgent applies the following defaults:

  • If subcategories is not provided, all subcategories of the selected category are used.
  • If samples_per_subcategory is not provided, the default is 1 sample for each selected subcategory.
  • Therefore, if both are omitted, HackAgent selects 1 sample from each subcategory of the selected category.

Example:

attack_config = {
    "attack_type": "h4rm3l",
    "intents": [
        {
            "category": "A",
            # subcategories omitted -> A1..A4
            # samples_per_subcategory omitted -> 1 each
        }
    ],
}
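The default rules can be pictured as a small expansion step. The following is a minimal sketch under stated assumptions, not HackAgent's implementation: `expand_intent` and `SUBCATEGORIES` are hypothetical names, the taxonomy excerpt is copied by hand from the map on this page, and "not provided" is taken to mean that the key is absent from the entry.

```python
# Excerpt of the taxonomy map: category code -> subcategory codes.
SUBCATEGORIES = {
    "A": ["A1", "A2", "A3", "A4"],
    "H": ["H1", "H2", "H3", "H4"],
}

DEFAULT_SAMPLES = 1  # documented default for samples_per_subcategory


def expand_intent(entry):
    """Return a copy of an intents entry with omitted fields filled in.

    Hypothetical helper: mirrors the documented defaults only.
    """
    category = entry["category"]
    return {
        "category": category,
        "subcategories": list(entry.get("subcategories", SUBCATEGORIES[category])),
        "samples_per_subcategory": entry.get("samples_per_subcategory", DEFAULT_SAMPLES),
    }


expanded = expand_intent({"category": "A"})
print(expanded["subcategories"])            # ['A1', 'A2', 'A3', 'A4']
print(expanded["samples_per_subcategory"])  # 1
```

With both fields omitted, the expanded entry asks for one sample from each of A1 through A4, which matches the example config above.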

Complete taxonomy map (code, label, enum ID)

Source taxonomy and intents list: OmniSafeBench-MM.

| Code | Name | Enum ID | Samples |
|------|------|---------|---------|
| A | Ethical and Social Risks | IntentCategory.ETHICAL_AND_SOCIAL_RISKS | 988 |
| A1 | Bias and Discrimination | IntentSubcategory.BIAS_AND_DISCRIMINATION | 434 |
| A2 | Insulting or Harassing Speech | IntentSubcategory.INSULTING_OR_HARASSING_SPEECH | 113 |
| A3 | Hate Speech | IntentSubcategory.HATE_SPEECH | 411 |
| A4 | Discriminatory Behavior | IntentSubcategory.DISCRIMINATORY_BEHAVIOR | 30 |
| B | Privacy and Data Risks | IntentCategory.PRIVACY_AND_DATA_RISKS | 504 |
| B1 | Personal Data Leakage | IntentSubcategory.PERSONAL_DATA_LEAKAGE | 250 |
| B2 | Biometric or Health Data Exposure | IntentSubcategory.BIOMETRIC_OR_HEALTH_DATA_EXPOSURE | 30 |
| B3 | Doxxing or Tracking | IntentSubcategory.DOXXING_OR_TRACKING | 164 |
| B4 | Surveillance Misuse | IntentSubcategory.SURVEILLANCE_MISUSE | 30 |
| B5 | Training Data Leakage | IntentSubcategory.TRAINING_DATA_LEAKAGE | 30 |
| C | Safety and Physical Harm | IntentCategory.SAFETY_AND_PHYSICAL_HARM | 877 |
| C1 | Dangerous Instructions | IntentSubcategory.DANGEROUS_INSTRUCTIONS | 29 |
| C2 | Self-harm Promotion | IntentSubcategory.SELF_HARM_PROMOTION | 30 |
| C3 | Physical Harm | IntentSubcategory.PHYSICAL_HARM | 599 |
| C4 | Human Trafficking | IntentSubcategory.HUMAN_TRAFFICKING | 30 |
| C5 | Animal Abuse | IntentSubcategory.ANIMAL_ABUSE | 132 |
| C6 | Drug Guidance | IntentSubcategory.DRUG_GUIDANCE | 29 |
| C7 | Environmental Damage | IntentSubcategory.ENVIRONMENTAL_DAMAGE | 28 |
| D | Criminal and Economic Risks | IntentCategory.CRIMINAL_AND_ECONOMIC_RISKS | 1017 |
| D1 | Fraud or Scams | IntentSubcategory.FRAUD_OR_SCAMS | 367 |
| D2 | Economic Crime | IntentSubcategory.ECONOMIC_CRIME | 245 |
| D3 | White-Collar Crime | IntentSubcategory.WHITE_COLLAR_CRIME | 30 |
| D4 | Illegal Activity | IntentSubcategory.ILLEGAL_ACTIVITY | 345 |
| D5 | Evasion of Law | IntentSubcategory.EVASION_OF_LAW | 30 |
| E | Cybersecurity Threats | IntentCategory.CYBERSECURITY_THREATS | 725 |
| E1 | Malware Generation | IntentSubcategory.MALWARE_GENERATION | 275 |
| E2 | Exploit Development | IntentSubcategory.EXPLOIT_DEVELOPMENT | 201 |
| E3 | Phishing or Ransomware | IntentSubcategory.PHISHING_OR_RANSOMWARE | 30 |
| E4 | Jailbreak or Prompt Injection | IntentSubcategory.JAILBREAK_OR_PROMPT_INJECTION | 30 |
| E5 | Cross-modal Attack | IntentSubcategory.CROSS_MODAL_ATTACK | 30 |
| E6 | Data Poisoning | IntentSubcategory.DATA_POISONING | 30 |
| E7 | Unauthorized Model Access | IntentSubcategory.UNAUTHORIZED_MODEL_ACCESS | 129 |
| F | Information and Political Manipulation | IntentCategory.INFORMATION_AND_POLITICAL_MANIPULATION | 534 |
| F1 | Misinformation or Disinformation | IntentSubcategory.MISINFORMATION_OR_DISINFORMATION | 30 |
| F2 | Deepfakes or Synthetic Media | IntentSubcategory.DEEPFAKES_OR_SYNTHETIC_MEDIA | 30 |
| F3 | Political Propaganda | IntentSubcategory.POLITICAL_PROPAGANDA | 384 |
| F4 | Election Interference | IntentSubcategory.ELECTION_INTERFERENCE | 30 |
| F5 | Public Order Disruption | IntentSubcategory.PUBLIC_ORDER_DISRUPTION | 30 |
| F6 | National Security Threat | IntentSubcategory.NATIONAL_SECURITY_THREAT | 30 |
| G | Content and Cultural Safety | IntentCategory.CONTENT_AND_CULTURAL_SAFETY | 537 |
| G1 | Pornography or Nudity | IntentSubcategory.PORNOGRAPHY_OR_NUDITY | 118 |
| G2 | Sexual Content | IntentSubcategory.SEXUAL_CONTENT | 137 |
| G3 | Violence or Gore | IntentSubcategory.VIOLENCE_OR_GORE | 152 |
| G4 | Cultural or Religious Offense | IntentSubcategory.CULTURAL_OR_RELIGIOUS_OFFENSE | 130 |
| H | Intellectual Property and Ownership | IntentCategory.INTELLECTUAL_PROPERTY_AND_OWNERSHIP | 304 |
| H1 | Copyright Infringement | IntentSubcategory.COPYRIGHT_INFRINGEMENT | 117 |
| H2 | Trademark or Brand Misuse | IntentSubcategory.TRADEMARK_OR_BRAND_MISUSE | 30 |
| H3 | Proprietary Data Leakage | IntentSubcategory.PROPRIETARY_DATA_LEAKAGE | 127 |
| H4 | Unauthorized Style or Persona Imitation | IntentSubcategory.UNAUTHORIZED_STYLE_OR_PERSONA_IMITATION | 30 |
| I | Decision and Cognitive Risks | IntentCategory.DECISION_AND_COGNITIVE_RISKS | 1593 |
| I1 | Medical Advice | IntentSubcategory.MEDICAL_ADVICE | 321 |
| I2 | Health Misguidance | IntentSubcategory.HEALTH_MISGUIDANCE | 128 |
| I3 | Legal Advice | IntentSubcategory.LEGAL_ADVICE | 372 |
| I4 | Government Decision Support | IntentSubcategory.GOVERNMENT_DECISION_SUPPORT | 308 |
| I5 | Financial Advice | IntentSubcategory.FINANCIAL_ADVICE | 290 |
| I6 | Market Manipulation | IntentSubcategory.MARKET_MANIPULATION | 30 |
| I7 | Psychological Manipulation | IntentSubcategory.PSYCHOLOGICAL_MANIPULATION | 114 |
| I8 | Cognitive Bias or Overreliance | IntentSubcategory.COGNITIVE_BIAS_OR_OVERRELIANCE | 30 |
| J | Child Safety | IntentCategory.CHILD_SAFETY | 747 |
| J1 | CSAM & Sexualization | IntentSubcategory.CSAM_SEXUALIZATION | 171 |
| J2 | Grooming & Enticement | IntentSubcategory.GROOMING_ENTICEMENT | 147 |
| J3 | Child Trafficking | IntentSubcategory.CHILD_TRAFFICKING | 144 |
| J4 | Harmful Content Targeting Minors | IntentSubcategory.HARMFUL_CONTENT_TARGETING_MINORS | 161 |
| J5 | Age Verification Evasion | IntentSubcategory.AGE_VERIFICATION_EVASION | 124 |
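The sample counts can help when sizing a run. The sketch below assumes the Samples column is the number of intents available for each code, and uses a hypothetical `planned_goals` helper to total the goals a request would produce and to flag subcategories holding fewer samples than requested. How HackAgent itself handles a request that exceeds the available samples is not described on this page, so treat this purely as a planning aid.

```python
# Sample counts copied from the taxonomy map (excerpt: category A only).
AVAILABLE = {"A1": 434, "A2": 113, "A3": 411, "A4": 30}


def planned_goals(subcategories, samples_per_subcategory):
    """Return the total goals requested and the subcategories with too few samples."""
    total = len(subcategories) * samples_per_subcategory
    short = [
        code for code in subcategories
        if AVAILABLE[code] < samples_per_subcategory
    ]
    return total, short


total, short = planned_goals(["A1", "A2", "A3", "A4"], 50)
print(total)  # 200
print(short)  # ['A4']
```

Here a request for 50 samples per subcategory across category A asks for 200 goals in total, but A4 lists only 30 samples.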