CrewAI
Add a safety layer to your CrewAI agents. Every external action your crew takes — sending emails, replying to tickets, issuing refunds — is verified against your knowledge base and evaluated by your policies before it reaches the outside world.
How the integration works
- Install the
groundtruth-aipackage from PyPI. - Add GroundTruthExecuteTool, GroundTruthAwaitApprovalTool, and/or GroundTruthVerifyTool to your agents.
- Your agents call the tool before any external action. GroundTruth returns approved, blocked, or escalated.
Try it live
Run a real CrewAI agent, edit its response, and see GroundTruth approve, block, or escalate — all in the browser.
Installation
pip install groundtruth-ai[crewai]Set your API key as an environment variable:
export GROUNDTRUTH_API_KEY="hg_sk_your_api_key"Tools Reference
GroundTruthExecuteTool
Routes any external action through GroundTruth's safety layer. Verifies content against your knowledge base and evaluates your policies.
| Config | Description |
|---|---|
| api_key | API key. Falls back to GROUNDTRUTH_API_KEY env var. |
| agent_name | Identifies this agent in the execution log. |
| user_id | Who the agent acts on behalf of. |
| base_url | Defaults to https://app.groundtruth.dev. |
Agent provides: action, params (JSON string), content (optional), channel (optional).
GroundTruthVerifyTool
Fact-checks AI-generated text against your knowledge base. Returns a risk score, supported/unsupported claims, and a safe rewrite if issues are found.
| Config | Description |
|---|---|
| api_key | API key. Falls back to GROUNDTRUTH_API_KEY env var. |
| base_url | Defaults to https://app.groundtruth.dev. |
Agent provides: answer (text to verify), question (optional, improves accuracy).
GroundTruthAwaitApprovalTool
Waits for a human to approve or reject an escalated action. Use this after GroundTruthExecuteTool returns an ESCALATED decision. The tool polls with exponential backoff (2s → 30s cap), reusing the same HTTP connection across polls to avoid exhausting connections during long reviews. Default timeout is 1 hour.
| Config | Description |
|---|---|
| api_key | API key. Falls back to GROUNDTRUTH_API_KEY env var. |
| base_url | Defaults to https://app.groundtruth.dev. |
Agent provides: execution_id (required), timeout (default 3600s), poll_interval (default 2s), max_poll_interval (default 30s).
Customer Support Crew
A crew of AI agents that handles customer support tickets. One agent drafts replies, another verifies them, and all outbound messages go through GroundTruth before reaching customers.
The Crew
Triage Agent
Reads incoming tickets and categorizes them (billing, technical, general).
Reply Agent
Drafts replies using company knowledge. Has the Execute and Verify tools.
Escalation Agent
Handles blocked/escalated actions. Has the Await Approval tool to wait for human decisions.
Policies in GroundTruth
Block unverified replies
reply_ticket · Risk score > 50%
Escalate refund mentions
reply_ticket · Content matches "refund|credit|compensation"
Block after hours
reply_ticket, send_email · Outside 8:00 AM – 8:00 PM
Code
from crewai import Agent, Task, Crew
from groundtruth.crewai import (
GroundTruthExecuteTool,
GroundTruthVerifyTool,
GroundTruthAwaitApprovalTool,
)
# Tools
execute = GroundTruthExecuteTool(agent_name="support-crew")
verify = GroundTruthVerifyTool()
await_approval = GroundTruthAwaitApprovalTool()
# Agents
triage = Agent(
role="Support Triage",
goal="Categorize incoming tickets by type and urgency",
backstory="You triage support tickets for a SaaS company.",
)
replier = Agent(
role="Support Reply Agent",
goal="Draft accurate, helpful replies to customer tickets",
backstory=(
"You draft replies to support tickets. Before sending ANY reply, "
"you MUST use the GroundTruth Verify tool to check your draft, "
"then use the GroundTruth Execute tool to send it. Never send "
"a reply without going through GroundTruth first."
),
tools=[execute, verify],
)
escalation = Agent(
role="Escalation Handler",
goal="Handle blocked or escalated actions and wait for human decisions",
backstory=(
"You handle cases where GroundTruth escalates an action. "
"Use GroundTruth Await Approval with the execution ID to wait "
"for the human reviewer's decision, then report the outcome."
),
tools=[await_approval],
)
# Tasks
triage_task = Task(
description="Categorize ticket #4521: 'I want to cancel my subscription and get a refund.'",
expected_output="Category and urgency level.",
agent=triage,
)
reply_task = Task(
description=(
"Draft and send a reply to ticket #4521 about the cancellation request. "
"First verify your draft with GroundTruth Verify, then send it via "
"GroundTruth Execute with action='reply_ticket'."
),
expected_output="The GroundTruth execution result (approved, blocked, or escalated).",
agent=replier,
context=[triage_task],
)
await_task = Task(
description=(
"If the previous reply was ESCALATED, use GroundTruth Await Approval "
"with the execution ID to wait for the human decision. "
"If approved, confirm to the customer. If rejected, report the rejection."
),
expected_output="The final approval decision and next steps.",
agent=escalation,
context=[reply_task],
)
crew = Crew(
agents=[triage, replier, escalation],
tasks=[triage_task, reply_task, await_task],
)
result = crew.kickoff()
print(result)What happens at runtime
Scenario A: Reply is accurate
The Reply Agent drafts: "You can cancel with 30 days written notice per our terms." Verify tool returns risk 5%. Execute tool sends it via reply_ticket. No policies match. Approved — reply sent.
Scenario B: Reply contains wrong info
The Reply Agent drafts: "You can cancel anytime with a full refund." Verify tool returns risk 72%. Policy "Block unverified replies" triggers because risk_score > 0.5. Blocked — safe rewrite provided. The agent retries with the corrected text.
Scenario C: Reply mentions refunds
The Reply Agent drafts a correct reply that mentions "refund." Content is accurate (risk 8%), but policy "Escalate refund mentions" triggers because content matches "refund". Escalated — queued for human review. The Escalation Agent calls GroundTruth Await Approval with the execution ID and waits.
Scenario C (cont.): Manager approves
A manager approves in the dashboard. The Await Approval tool returns APPROVED. The Escalation Agent confirms to the customer that the reply was sent.
Execution log after a batch of tickets
Sales Outreach Crew
A crew that researches prospects and sends personalized outreach emails. GroundTruth ensures no email goes out with wrong pricing, competitor mentions, or unapproved claims.
The Crew
Researcher
Researches the prospect's company, role, and pain points.
Copywriter
Drafts a personalized email. Has the Execute and Verify tools.
QA Reviewer
Reviews blocked/escalated emails. Has the Verify and Await Approval tools.
Policies in GroundTruth
Block unverified claims
send_email · Risk score > 40%
Review pricing mentions
send_email · Content matches "$\d+"
No competitor mentions
send_email · Content matches competitor names
Rate limit: 20 emails/hour
send_email · Rate > 20 per hour
Code
from crewai import Agent, Task, Crew
from groundtruth.crewai import (
GroundTruthExecuteTool,
GroundTruthVerifyTool,
GroundTruthAwaitApprovalTool,
)
execute = GroundTruthExecuteTool(agent_name="sales-crew", user_id="sales-team")
verify = GroundTruthVerifyTool()
await_approval = GroundTruthAwaitApprovalTool()
researcher = Agent(
role="Prospect Researcher",
goal="Research prospects and identify relevant pain points",
backstory="You research companies to find the best angle for outreach.",
)
copywriter = Agent(
role="Sales Copywriter",
goal="Write personalized, accurate outreach emails",
backstory=(
"You write cold outreach emails. Before sending ANY email, you MUST: "
"1) Use GroundTruth Verify to check all claims about your product. "
"2) Use GroundTruth Execute with action='send_email' to send it. "
"NEVER mention competitor names. NEVER fabricate product features."
),
tools=[execute, verify],
)
qa = Agent(
role="QA Reviewer",
goal="Handle escalated emails and wait for human approval",
backstory=(
"You handle emails that GroundTruth escalated for review. "
"Use GroundTruth Await Approval with the execution ID to wait "
"for the manager's decision, then report the outcome."
),
tools=[verify, await_approval],
)
research_task = Task(
description="Research Acme Corp — find their industry, size, and likely pain points.",
expected_output="A brief prospect profile.",
agent=researcher,
)
email_task = Task(
description=(
"Draft and send a personalized outreach email to sarah@acme.com. "
"Reference our product's actual features from the knowledge base. "
"Verify the draft first, then send via GroundTruth Execute."
),
expected_output="The execution result from GroundTruth.",
agent=copywriter,
context=[research_task],
)
approval_task = Task(
description=(
"If the email was ESCALATED, use GroundTruth Await Approval with the "
"execution ID to wait for the manager's decision. "
"Report whether the email was approved and sent, or rejected."
),
expected_output="The final approval decision.",
agent=qa,
context=[email_task],
)
crew = Crew(
agents=[researcher, copywriter, qa],
tasks=[research_task, email_task, approval_task],
)
result = crew.kickoff()What happens at runtime
Attempt 1: Copywriter mentions a competitor
Draft includes "Unlike Acme Consulting, we offer..." Policy "No competitor mentions" triggers because content matches competitor names. Blocked. The agent sees the block reason and rewrites without the competitor reference.
Attempt 2: Email mentions pricing — escalated
Rewritten draft includes "Starting at $99/month." Content is accurate (risk 6%), but policy "Review pricing mentions" triggers because content matches "$\d+". Escalated — queued for human review.
QA Reviewer waits for approval
The QA Reviewer calls GroundTruth Await Approval with the execution ID. The tool polls with exponential backoff (2s → 30s) while the sales manager reviews in the dashboard. Connections are reused across polls.
Result: Manager approves — email delivered
The sales manager approves in the GroundTruth dashboard. The Await Approval tool returns APPROVED. The email is sent via the Resend connector. The QA Reviewer confirms the email was delivered.
Execution log for a batch outreach run
7 emails attempted. 5 approved, 1 escalated (QA Reviewer waited → approved by manager), 1 blocked (retried with rewrite). Zero wrong claims sent to prospects.
Workflow Resumption After Escalation
When an action is escalated, the agent can use GroundTruthAwaitApprovalTool to pause and wait for the human reviewer's decision, then resume automatically. This enables multi-step workflows that survive escalation without dead-ending.
How it works
- Agent calls
GroundTruth Execute— gets ESCALATED - Agent calls
GroundTruth Await Approvalwith the execution ID - Tool polls
GET /api/approvals/{id}with exponential backoff (2s → 30s cap) until resolved or timeout (default 1 hour) - On APPROVED — agent proceeds with the next step
- On REJECTED — agent revises or informs the user
- On TIMEOUT — agent can retry later or inform the user
Code
from crewai import Agent, Task, Crew
from groundtruth.crewai import GroundTruthExecuteTool, GroundTruthAwaitApprovalTool
execute = GroundTruthExecuteTool(agent_name="support-crew", session_id="sess_123")
await_approval = GroundTruthAwaitApprovalTool()
agent = Agent(
role="Support Agent",
goal="Handle tickets with human oversight for sensitive actions",
backstory=(
"You handle support tickets. Submit actions via GroundTruth Execute. "
"If an action is ESCALATED, use GroundTruth Await Approval with the "
"execution ID and wait for the human decision before proceeding."
),
tools=[execute, await_approval],
)
submit_task = Task(
description=(
"Reply to ticket #7890 about the customer's refund request. "
"Use GroundTruth Execute with action='reply_ticket'."
),
expected_output="The execution result.",
agent=agent,
)
followup_task = Task(
description=(
"If the previous action was ESCALATED, use GroundTruth Await Approval "
"with the execution ID to wait for the human decision. "
"If approved, confirm to the customer. If rejected, revise the reply."
),
expected_output="Final status after human review.",
agent=agent,
context=[submit_task],
)
crew = Crew(agents=[agent], tasks=[submit_task, followup_task])
crew.kickoff()What happens at runtime
Step 1: Action escalated
The agent submits a reply mentioning a refund. Policy "Escalate refund mentions" triggers. Escalated — execution ID returned.
Step 2: Agent waits
The agent calls GroundTruth Await Approval with the execution ID. The tool polls with exponential backoff (2s → 30s), reusing the HTTP connection, while a human reviews in the dashboard.
Step 3: Approved and resumed
A manager approves the action. The tool returns APPROVED. The agent proceeds to send a confirmation to the customer.
When your org has a webhook URL configured, GroundTruth also fires execution.approved and execution.rejected events when a human reviews an approval, so external orchestrators can react without polling.
Advanced: Using the Client Directly
If you need programmatic access outside of CrewAI agents (e.g., in custom callbacks or middleware), use the client directly:
from groundtruth import GroundTruthClient
client = GroundTruthClient(api_key="hg_sk_...")
# Execute an action
result = client.execute(
action="send_email",
params={"to": "prospect@acme.com", "subject": "Hello", "body": "..."},
content="Your email text here for verification.",
channel="email",
agent="my-script",
)
print(result["decision"]) # "approved" | "blocked" | "escalated"
# Verify text without executing
check = client.verify(answer="Our platform handles 10M requests per day.")
print(check["riskScore"])
# List and review approvals
approvals = client.list_approvals()
client.review_approval("exec_789", "approve")
# Poll approval status (used by AwaitApprovalTool internally)
status = client.get_approval_status("exec_789")
print(status["approval"]["status"]) # "pending" | "approved" | "rejected"