Agentic data, captured at the source
The hardest part of training agents isn't the model; it's the data. Klavis turns live environment runs into dense, reward-labeled trajectories you can train on directly.
Full trajectories
Every observation, action, tool call, and reward, time-aligned across the entire episode — not just final answers.
Reward signals
Verifiable, environment-grounded outcomes plus optional process rewards for fine-grained credit assignment.
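To make the credit-assignment idea concrete, here is a minimal sketch of one way an outcome reward and optional process rewards could be folded into per-step training targets. The function name `returns_to_go`, the discount `gamma`, and the choice to credit the outcome reward to the final step are assumptions for illustration, not a documented Klavis format.

```python
from typing import List


def returns_to_go(process_rewards: List[float], outcome_reward: float,
                  gamma: float = 0.99) -> List[float]:
    """Discounted return for each step of a single episode.

    Assumption (illustrative only): the verifiable outcome reward is credited
    to the final step, and per-step process rewards are added on top of it.
    """
    if not process_rewards:
        return []
    rewards = list(process_rewards)      # copy so the caller's list is untouched
    rewards[-1] += outcome_reward        # outcome lands on the last step
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the discounted future reward.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

With all process rewards set to zero this reduces to a sparse, outcome-only signal; non-zero process rewards give earlier steps their own share of credit.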
Long-horizon coverage
Multi-step, multi-hour tasks with recovery from failure, backtracking, and real tool latency baked in.
RL-ready formats
Export to standard schemas for SFT, DPO, and online/offline RL. Stream live or pull versioned datasets.
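As a rough illustration of what turning a trajectory into SFT and DPO records can look like, the sketch below uses hypothetical `Step` and `Trajectory` types and the `prompt` / `completion` / `chosen` / `rejected` field names. All of these names are assumptions for the example; the actual export schemas may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Step:
    observation: str      # what the agent saw
    action: str           # what the agent did (text or a serialized tool call)
    reward: float = 0.0   # optional per-step (process) reward


@dataclass
class Trajectory:
    task_id: str
    prompt: str
    steps: List[Step]
    outcome_reward: float  # environment-grounded success score


def to_sft_records(traj: Trajectory) -> List[Dict[str, str]]:
    """One record per step: the history so far is the prompt, the action is the target."""
    records = []
    history = traj.prompt
    for step in traj.steps:
        context = f"{history}\nObservation: {step.observation}"
        records.append({"prompt": context, "completion": step.action})
        history = f"{context}\nAction: {step.action}"
    return records


def _render_actions(traj: Trajectory) -> str:
    """Flatten a trajectory's actions into a single response string."""
    return "\n".join(step.action for step in traj.steps)


def to_dpo_pair(a: Trajectory, b: Trajectory) -> Optional[Dict[str, str]]:
    """Preference pair from two runs of the same task, ranked by outcome reward.

    Returns None when the runs are for different tasks or are tied.
    """
    if a.task_id != b.task_id or a.outcome_reward == b.outcome_reward:
        return None
    chosen, rejected = (a, b) if a.outcome_reward > b.outcome_reward else (b, a)
    return {
        "prompt": chosen.prompt,
        "chosen": _render_actions(chosen),
        "rejected": _render_actions(rejected),
    }
```

Ranking pairs by the verifiable outcome reward keeps the preference label tied to what the environment observed rather than to a separate judge.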
Capture
Agents act in live environments; every step is logged with full state.
Label
Outcomes are scored against environment-grounded success criteria.
Export
Clean, versioned trajectories delivered to your training pipeline.
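The three steps above can be sketched end to end with a toy example. Everything here is hypothetical: `CounterEnv`, `capture`, `label`, `export_jsonl`, and the JSON Lines layout stand in for a real environment and pipeline, and none of it is the Klavis API.

```python
import json
import time
from typing import Any, Callable, Dict, List


class CounterEnv:
    """Hypothetical toy environment: succeed by incrementing up to a target."""

    def __init__(self, target: int = 3) -> None:
        self.target = target
        self.count = 0

    def observe(self) -> Dict[str, int]:
        return {"count": self.count, "target": self.target}

    def step(self, action: str) -> None:
        if action == "increment":
            self.count += 1

    def succeeded(self) -> bool:
        return self.count == self.target


def capture(env: CounterEnv, policy: Callable[[Dict[str, int]], str],
            max_steps: int = 10) -> List[Dict[str, Any]]:
    """Capture: run the policy and log every step, including the final 'stop'."""
    steps = []
    for index in range(max_steps):
        observation = env.observe()
        action = policy(observation)
        steps.append({"index": index, "time": time.time(),
                      "observation": observation, "action": action})
        if action == "stop":
            break
        env.step(action)
    return steps


def label(env: CounterEnv, steps: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Label: score the episode against the environment's own success check."""
    return {"steps": steps, "outcome_reward": 1.0 if env.succeeded() else 0.0}


def export_jsonl(episodes: List[Dict[str, Any]], path: str, version: str) -> None:
    """Export: write one versioned episode per line (JSON Lines)."""
    with open(path, "w", encoding="utf-8") as handle:
        for episode in episodes:
            handle.write(json.dumps({"version": version, **episode}) + "\n")
```

A policy such as `lambda obs: "increment" if obs["count"] < obs["target"] else "stop"` drives the toy environment to success in four logged steps, after which `label` assigns an outcome reward of 1.0.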
Build agents that operate in the real world
Access is whitelist-only. Tell us about your lab and we'll get you set up with environments and data.