Datasets Tutorial
This quick-start tutorial covers only the basics you need to start using datasets in HackAgent. Presets are pre-configured benchmark datasets. They are the fastest way to run standardized evaluations.
To select goals by risk taxonomy (OmniSafeBench) rather than loading full datasets,
use intents with categories and subcategories. See
Selecting intent categories for details.
Basic CLI Example
hackagent eval baseline \
  --agent-name "target_agent" \
  --agent-type "google-adk" \
  --endpoint "http://localhost:8000" \
  --config-file "configs/baseline-agentharm.json" \
  --no-tui
Basic SDK Example
from hackagent import HackAgent, AgentTypeEnum

agent = HackAgent(
    name="target_agent",
    endpoint="http://localhost:8000",
    agent_type=AgentTypeEnum.GOOGLE_ADK,
)

attack_config = {
    "attack_type": "baseline",
    "dataset": {
        "preset": "agentharm",
        "limit": 50,
        "shuffle": True,
        "seed": 42,
    },
}

results = agent.hack(attack_config=attack_config)
Popular Presets
| Preset | Description |
|---|---|
| agentharm | Harmful agentic tasks |
| jailbreakbench | Curated jailbreak behaviors |
| strongreject | Forbidden jailbreak prompts |
| beavertails | Multi-category safety evaluation |
| simplesafetytests | Fast safety sanity checks |
For the complete list, see Presets.
Dataset Options
These are the core options supported across dataset sources.
| Option | Type | Default | Purpose |
|---|---|---|---|
| limit | int | None | Maximum number of goals to load |
| offset | int | 0 | Skip the first N goals |
| shuffle | bool | False | Randomize goal order |
| seed | int | None | Make randomized selection reproducible |
Minimal Example
attack_config = {
    "attack_type": "baseline",
    "dataset": {
        "preset": "strongreject",
        "limit": 100,
        "offset": 0,
        "shuffle": True,
        "seed": 42,
    },
}
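The minimal example can be parameterized to walk a large preset in consecutive slices. The sketch below only builds configuration dictionaries in the same shape as the example above. `sliced_configs` is a hypothetical helper, not part of HackAgent, and `total` is a value you supply yourself. Whether the slices end up disjoint depends on the same `seed` producing the same order on every run, which is an assumption here.

```python
from typing import Any, Dict, List


def sliced_configs(
    preset: str, total: int, slice_size: int, seed: int = 42
) -> List[Dict[str, Any]]:
    """Build one attack config per slice (hypothetical helper, not a HackAgent API)."""
    configs = []
    for offset in range(0, total, slice_size):
        configs.append(
            {
                "attack_type": "baseline",
                "dataset": {
                    "preset": preset,
                    # The final slice may be shorter than slice_size.
                    "limit": min(slice_size, total - offset),
                    "offset": offset,
                    "shuffle": True,
                    # Reusing one seed is meant to keep the shuffled order stable
                    # across slices (assumption about how the seed is applied).
                    "seed": seed,
                },
            }
        )
    return configs


configs = sliced_configs("strongreject", total=250, slice_size=100)
```

Each resulting dictionary could then be passed as `attack_config` to `agent.hack(...)`, as in the Basic SDK Example.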
Basic Guidance
- Use limit to keep tests small while iterating.
- Use offset to evaluate different slices of large datasets.
- Use shuffle for broader sample diversity.
- Use seed when you need reproducible runs.
Tip: If shuffle and offset are both set, the goals are shuffled first and the offset is then applied to the shuffled order.
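To make the ordering concrete, here is a minimal sketch of how the four options could combine on a plain list of goals. It is not HackAgent's implementation: `select_goals` is a hypothetical function, and applying `limit` after `offset` is an assumption, since this page only states that shuffling precedes the offset.

```python
import random
from typing import List, Optional, Sequence


def select_goals(
    goals: Sequence[str],
    limit: Optional[int] = None,
    offset: int = 0,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> List[str]:
    """Illustrative only: shuffle, then skip `offset` goals, then cap at `limit`."""
    selected = list(goals)
    if shuffle:
        # A private Random instance leaves the global RNG state untouched;
        # a fixed seed yields the same order on every call.
        random.Random(seed).shuffle(selected)
    # Offset is applied to the (possibly shuffled) order.
    selected = selected[offset:]
    # Assumption: limit caps what remains after the offset.
    if limit is not None:
        selected = selected[:limit]
    return selected


goals = [f"goal-{i}" for i in range(10)]
first = select_goals(goals, limit=3, offset=2, shuffle=True, seed=42)
second = select_goals(goals, limit=3, offset=2, shuffle=True, seed=42)
```

With the same seed, `first` and `second` contain the same three goals, which is what makes a seeded run reproducible.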
Learn More
- Dataset Providers for the overview.
- Presets for the full benchmark catalog.
- HuggingFace Provider for external datasets.
- File Provider for local JSON/CSV/TXT inputs.
- Custom Providers for custom data sources.
- Troubleshooting for common dataset issues.