Integration

CrewAI

Add a safety layer to your CrewAI agents. Every external action your crew takes — sending emails, replying to tickets, issuing refunds — is verified against your knowledge base and evaluated by your policies before it reaches the outside world.

How the integration works

  1. Install the groundtruth-ai package from PyPI.
  2. Add GroundTruthExecuteTool, GroundTruthAwaitApprovalTool, and/or GroundTruthVerifyTool to your agents.
  3. Your agents call the tool before any external action. GroundTruth returns approved, blocked, or escalated.

Try it live

Run a real CrewAI agent, edit its response, and see GroundTruth approve, block, or escalate — all in the browser.

Installation

bash
pip install groundtruth-ai[crewai]

Set your API key as an environment variable:

bash
export GROUNDTRUTH_API_KEY="hg_sk_your_api_key"
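
Before constructing any tool, it can help to fail fast when the key is missing. The sketch below mirrors the fallback order described in the Tools Reference (an explicit api_key wins, otherwise GROUNDTRUTH_API_KEY is read). resolve_api_key is a hypothetical helper written for this example, not part of the groundtruth-ai package.

```python
import os


def resolve_api_key(explicit=None, env=None):
    """Return the explicit key if given, else GROUNDTRUTH_API_KEY from the environment.

    Hypothetical helper for illustration only; it mirrors the documented
    fallback order and raises early when no key is available.
    """
    env = os.environ if env is None else env
    key = explicit or env.get("GROUNDTRUTH_API_KEY")
    if not key:
        raise RuntimeError("Set GROUNDTRUTH_API_KEY or pass api_key explicitly")
    return key
```

Calling resolve_api_key() once at startup surfaces a missing key before any agent runs.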

Tools Reference

GroundTruthExecuteTool

Routes any external action through GroundTruth's safety layer. Verifies content against your knowledge base and evaluates your policies.

| Config | Description |
|---|---|
| api_key | API key. Falls back to GROUNDTRUTH_API_KEY env var. |
| agent_name | Identifies this agent in the execution log. |
| user_id | Who the agent acts on behalf of. |
| base_url | Defaults to https://app.groundtruth.dev. |

Agent provides: action, params (JSON string), content (optional), channel (optional).
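
Because params is passed as a JSON string rather than a dict, the arguments an agent supplies might be assembled as in the sketch below. build_execute_input and the keys inside params (ticket_id, body) are hypothetical and exist only for illustration.

```python
import json


def build_execute_input(action, params, content=None, channel=None):
    """Assemble the arguments an agent passes to the Execute tool.

    params is serialized with json.dumps because the tool expects a JSON
    string; optional fields are omitted when not provided.
    """
    tool_input = {"action": action, "params": json.dumps(params)}
    if content is not None:
        tool_input["content"] = content
    if channel is not None:
        tool_input["channel"] = channel
    return tool_input


# Hypothetical params keys; use whatever your connector expects.
example = build_execute_input(
    action="reply_ticket",
    params={"ticket_id": 4521, "body": "You can cancel with 30 days written notice."},
    content="You can cancel with 30 days written notice.",
    channel="zendesk",
)
```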

GroundTruthVerifyTool

Fact-checks AI-generated text against your knowledge base. Returns a risk score, supported/unsupported claims, and a safe rewrite if issues are found.

| Config | Description |
|---|---|
| api_key | API key. Falls back to GROUNDTRUTH_API_KEY env var. |
| base_url | Defaults to https://app.groundtruth.dev. |

Agent provides: answer (text to verify), question (optional, improves accuracy).
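
A caller typically has to decide what to do with the verification result. The sketch below keeps the draft when the risk is acceptable and otherwise falls back to the safe rewrite. It assumes riskScore is a fraction between 0 and 1 and uses safeRewrite as a hypothetical field name; check the actual response shape before relying on either.

```python
def pick_text_to_send(draft, check, max_risk=0.5):
    """Return the draft if its risk is acceptable, else the safe rewrite (or None).

    Assumptions: check["riskScore"] is a fraction between 0 and 1, and
    check["safeRewrite"] is a hypothetical field name for the rewrite.
    """
    if check["riskScore"] <= max_risk:
        return draft
    return check.get("safeRewrite")


# Illustrative results, not real API responses.
risky = {"riskScore": 0.72, "safeRewrite": "You can cancel with 30 days written notice."}
safe = {"riskScore": 0.05}
```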

GroundTruthAwaitApprovalTool

Waits for a human to approve or reject an escalated action. Use this after GroundTruthExecuteTool returns an ESCALATED decision. The tool polls with exponential backoff (2s → 30s cap), reusing the same HTTP connection across polls to avoid exhausting connections during long reviews. Default timeout is 1 hour.

| Config | Description |
|---|---|
| api_key | API key. Falls back to GROUNDTRUTH_API_KEY env var. |
| base_url | Defaults to https://app.groundtruth.dev. |

Agent provides: execution_id (required), timeout (default 3600s), poll_interval (default 2s), max_poll_interval (default 30s).
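
To see how poll_interval, max_poll_interval, and timeout interact, the sketch below generates the sequence of waits between polls. The doubling factor is an assumption; the docs only state that the backoff is exponential, starts at 2s, and caps at 30s.

```python
def backoff_intervals(poll_interval=2.0, max_poll_interval=30.0, timeout=3600.0, factor=2.0):
    """Yield waits that grow by `factor`, capped at max_poll_interval.

    The waits add up to at most `timeout`; the last one is shortened to fit.
    """
    elapsed = 0.0
    interval = poll_interval
    while elapsed < timeout:
        wait = min(interval, max_poll_interval, timeout - elapsed)
        yield wait
        elapsed += wait
        interval = min(interval * factor, max_poll_interval)


first_six = list(backoff_intervals())[:6]
print(first_six)  # [2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Under these assumptions the cap is reached after four waits, so a long review settles into one poll every 30 seconds.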

Use Case 1

Customer Support Crew

A crew of AI agents that handles customer support tickets. One agent drafts replies, another verifies them, and all outbound messages go through GroundTruth before reaching customers.

The Crew

Triage Agent

Reads incoming tickets and categorizes them (billing, technical, general).

Reply Agent

Drafts replies using company knowledge. Has the Execute and Verify tools.

Escalation Agent

Handles blocked/escalated actions. Has the Await Approval tool to wait for human decisions.

Policies in GroundTruth

GroundTruth — Policies

Block unverified replies

reply_ticket · Risk score > 50%

P10
block

Escalate refund mentions

reply_ticket · Content matches "refund|credit|compensation"

P8
escalate

Block after hours

reply_ticket, send_email · Outside 8:00 AM – 8:00 PM

P9
block
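
One way to read the three policies above is as prioritized rules where the highest-priority match decides the outcome. That resolution order, the 0 to 1 risk scale, inclusive business-hour boundaries, and case-insensitive matching are all assumptions made for this sketch; it is not GroundTruth's evaluation engine.

```python
import re
from datetime import time

# (name, priority, decision, predicate) for the three policies listed above.
POLICIES = [
    ("Block unverified replies", 10, "blocked",
     lambda a: a["action"] == "reply_ticket" and a["risk"] > 0.5),
    ("Block after hours", 9, "blocked",
     lambda a: a["action"] in ("reply_ticket", "send_email")
     and not time(8, 0) <= a["at"] <= time(20, 0)),
    ("Escalate refund mentions", 8, "escalated",
     lambda a: a["action"] == "reply_ticket"
     and re.search(r"refund|credit|compensation", a["content"], re.IGNORECASE) is not None),
]


def evaluate(attempt):
    """Return (decision, policy_name); the highest-priority match wins (assumed)."""
    for name, _priority, decision, matches in sorted(POLICIES, key=lambda p: -p[1]):
        if matches(attempt):
            return decision, name
    return "approved", None
```

With this ordering, a risky reply that also mentions a refund is blocked rather than escalated, because the P10 policy is checked before the P8 policy.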

Code

python
from crewai import Agent, Task, Crew
from groundtruth.crewai import (
    GroundTruthExecuteTool,
    GroundTruthVerifyTool,
    GroundTruthAwaitApprovalTool,
)

# Tools
execute        = GroundTruthExecuteTool(agent_name="support-crew")
verify         = GroundTruthVerifyTool()
await_approval = GroundTruthAwaitApprovalTool()

# Agents
triage = Agent(
    role="Support Triage",
    goal="Categorize incoming tickets by type and urgency",
    backstory="You triage support tickets for a SaaS company.",
)

replier = Agent(
    role="Support Reply Agent",
    goal="Draft accurate, helpful replies to customer tickets",
    backstory=(
        "You draft replies to support tickets. Before sending ANY reply, "
        "you MUST use the GroundTruth Verify tool to check your draft, "
        "then use the GroundTruth Execute tool to send it. Never send "
        "a reply without going through GroundTruth first."
    ),
    tools=[execute, verify],
)

escalation = Agent(
    role="Escalation Handler",
    goal="Handle blocked or escalated actions and wait for human decisions",
    backstory=(
        "You handle cases where GroundTruth escalates an action. "
        "Use GroundTruth Await Approval with the execution ID to wait "
        "for the human reviewer's decision, then report the outcome."
    ),
    tools=[await_approval],
)

# Tasks
triage_task = Task(
    description="Categorize ticket #4521: 'I want to cancel my subscription and get a refund.'",
    expected_output="Category and urgency level.",
    agent=triage,
)

reply_task = Task(
    description=(
        "Draft and send a reply to ticket #4521 about the cancellation request. "
        "First verify your draft with GroundTruth Verify, then send it via "
        "GroundTruth Execute with action='reply_ticket'."
    ),
    expected_output="The GroundTruth execution result (approved, blocked, or escalated).",
    agent=replier,
    context=[triage_task],
)

await_task = Task(
    description=(
        "If the previous reply was ESCALATED, use GroundTruth Await Approval "
        "with the execution ID to wait for the human decision. "
        "If approved, confirm to the customer. If rejected, report the rejection."
    ),
    expected_output="The final approval decision and next steps.",
    agent=escalation,
    context=[reply_task],
)

crew = Crew(
    agents=[triage, replier, escalation],
    tasks=[triage_task, reply_task, await_task],
)
result = crew.kickoff()
print(result)

What happens at runtime

Scenario A: Reply is accurate

The Reply Agent drafts: "You can cancel with 30 days written notice per our terms." Verify tool returns risk 5%. Execute tool sends it via reply_ticket. No policies match. Approved — reply sent.

Scenario B: Reply contains wrong info

The Reply Agent drafts: "You can cancel anytime with a full refund." Verify tool returns risk 72%. Policy "Block unverified replies" triggers because risk_score > 0.5. Blocked — safe rewrite provided. The agent retries with the corrected text.

Scenario C: Reply mentions refunds

The Reply Agent drafts a correct reply that mentions "refund." Content is accurate (risk 8%), but policy "Escalate refund mentions" triggers because content matches "refund". Escalated — queued for human review. The Escalation Agent calls GroundTruth Await Approval with the execution ID and waits.

Scenario C (cont.): Manager approves

A manager approves in the dashboard. The Await Approval tool returns APPROVED. The Escalation Agent confirms to the customer that the reply was sent.
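
The scenarios above boil down to a small control loop: send when approved, retry with the rewrite when blocked, and wait when escalated. The sketch below expresses that loop with injected callables standing in for the Execute and Await Approval tools. Apart from decision, the result keys (safe_rewrite, execution_id) are hypothetical names.

```python
def send_with_guardrails(execute, await_approval, draft, max_attempts=3):
    """Submit a draft, retry blocked drafts with the rewrite, and wait on escalations.

    execute(text) returns a dict with a "decision" key; await_approval(id)
    returns "APPROVED" or another final status. Both are stand-ins for the
    tools, and "safe_rewrite" / "execution_id" are hypothetical key names.
    """
    text = draft
    for _ in range(max_attempts):
        result = execute(text)
        decision = result["decision"].lower()
        if decision == "approved":
            return "sent", text
        if decision == "escalated":
            outcome = await_approval(result["execution_id"])
            return ("sent", text) if outcome == "APPROVED" else ("not_sent", text)
        # Blocked: retry with the rewrite if one was provided, else give up.
        text = result.get("safe_rewrite")
        if text is None:
            break
    return "not_sent", None
```

In a CrewAI crew this logic lives in the agents' instructions rather than in code, but writing it out makes the expected behavior for each decision explicit.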

Execution log after a batch of tickets

GroundTruth — Executions
| Time | Action | Channel | Decision | Risk | Policy |
|---|---|---|---|---|---|
| 9:01 AM | reply_ticket | zendesk | approved | 5% | |
| 9:03 AM | reply_ticket | zendesk | blocked | 72% | Block unverified replies |
| 9:04 AM | reply_ticket | zendesk | approved | 8% | Retried with safe rewrite |
| 9:12 AM | reply_ticket | zendesk | escalated | 8% | Escalate refund mentions |
| 9:14 AM | reply_ticket | zendesk | approved | 8% | Approved by manager |
| 9:15 AM | reply_ticket | zendesk | approved | 3% | |
| 9:22 AM | send_email | email | approved | 10% | |

Use Case 2

Sales Outreach Crew

A crew that researches prospects and sends personalized outreach emails. GroundTruth ensures no email goes out with wrong pricing, competitor mentions, or unapproved claims.

The Crew

Researcher

Researches the prospect's company, role, and pain points.

Copywriter

Drafts a personalized email. Has the Execute and Verify tools.

QA Reviewer

Reviews blocked/escalated emails. Has the Verify and Await Approval tools.

Policies in GroundTruth

GroundTruth — Policies

Block unverified claims

send_email · Risk score > 40%

P10
block

Review pricing mentions

send_email · Content matches "\$\d+"

P8
escalate

No competitor mentions

send_email · Content matches competitor names

P9
block

Rate limit: 20 emails/hour

send_email · Rate > 20 per hour

P7
block
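
The rate-limit policy above can be pictured as a rolling one-hour window. The sketch below is one common way to implement such a limit; whether GroundTruth uses a rolling or a fixed window is not stated, so treat the windowing as an assumption.

```python
from collections import deque


class SlidingWindowLimit:
    """Allow at most `limit` events in any rolling `window_s` seconds."""

    def __init__(self, limit=20, window_s=3600.0):
        self.limit = limit
        self.window_s = window_s
        self._sent = deque()  # timestamps of allowed events, oldest first

    def allow(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

With the defaults, the first 20 emails in an hour pass and the 21st is refused until the oldest one ages out of the window.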

Code

python
from crewai import Agent, Task, Crew
from groundtruth.crewai import (
    GroundTruthExecuteTool,
    GroundTruthVerifyTool,
    GroundTruthAwaitApprovalTool,
)

execute        = GroundTruthExecuteTool(agent_name="sales-crew", user_id="sales-team")
verify         = GroundTruthVerifyTool()
await_approval = GroundTruthAwaitApprovalTool()

researcher = Agent(
    role="Prospect Researcher",
    goal="Research prospects and identify relevant pain points",
    backstory="You research companies to find the best angle for outreach.",
)

copywriter = Agent(
    role="Sales Copywriter",
    goal="Write personalized, accurate outreach emails",
    backstory=(
        "You write cold outreach emails. Before sending ANY email, you MUST: "
        "1) Use GroundTruth Verify to check all claims about your product. "
        "2) Use GroundTruth Execute with action='send_email' to send it. "
        "NEVER mention competitor names. NEVER fabricate product features."
    ),
    tools=[execute, verify],
)

qa = Agent(
    role="QA Reviewer",
    goal="Handle escalated emails and wait for human approval",
    backstory=(
        "You handle emails that GroundTruth escalated for review. "
        "Use GroundTruth Await Approval with the execution ID to wait "
        "for the manager's decision, then report the outcome."
    ),
    tools=[verify, await_approval],
)

research_task = Task(
    description="Research Acme Corp — find their industry, size, and likely pain points.",
    expected_output="A brief prospect profile.",
    agent=researcher,
)

email_task = Task(
    description=(
        "Draft and send a personalized outreach email to sarah@acme.com. "
        "Reference our product's actual features from the knowledge base. "
        "Verify the draft first, then send via GroundTruth Execute."
    ),
    expected_output="The execution result from GroundTruth.",
    agent=copywriter,
    context=[research_task],
)

approval_task = Task(
    description=(
        "If the email was ESCALATED, use GroundTruth Await Approval with the "
        "execution ID to wait for the manager's decision. "
        "Report whether the email was approved and sent, or rejected."
    ),
    expected_output="The final approval decision.",
    agent=qa,
    context=[email_task],
)

crew = Crew(
    agents=[researcher, copywriter, qa],
    tasks=[research_task, email_task, approval_task],
)
result = crew.kickoff()

What happens at runtime

Attempt 1: Copywriter mentions a competitor

Draft includes "Unlike Acme Consulting, we offer..." Policy "No competitor mentions" triggers because content matches competitor names. Blocked. The agent sees the block reason and rewrites without the competitor reference.

Attempt 2: Email mentions pricing — escalated

Rewritten draft includes "Starting at $99/month." Content is accurate (risk 6%), but policy "Review pricing mentions" triggers because content matches "\$\d+". Escalated and queued for human review.
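
A quick local check of the pricing pattern can catch mistakes before a policy is saved. The sketch assumes the policy engine follows Python-style regular expression rules, where a literal dollar sign must be escaped because a bare "$" anchors the end of the string.

```python
import re

PRICING = re.compile(r"\$\d+")   # literal dollar sign followed by digits
UNESCAPED = re.compile(r"$\d+")  # "$" is an end-of-string anchor here

draft = "Starting at $99/month."
print(bool(PRICING.search(draft)))    # True
print(bool(UNESCAPED.search(draft)))  # False
```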

QA Reviewer waits for approval

The QA Reviewer calls GroundTruth Await Approval with the execution ID. The tool polls with exponential backoff (2s → 30s) while the sales manager reviews in the dashboard. Connections are reused across polls.

Result: Manager approves — email delivered

The sales manager approves in the GroundTruth dashboard. The Await Approval tool returns APPROVED. The email is sent via the Resend connector. The QA Reviewer confirms the email was delivered.

Execution log for a batch outreach run

GroundTruth — Executions
| Time | Action | Channel | Decision | Risk | Policy |
|---|---|---|---|---|---|
| 10:00 AM | send_email | email | blocked | 12% | No competitor mentions |
| 10:01 AM | send_email | email | escalated | 6% | Review pricing mentions |
| 10:05 AM | send_email | email | approved | 6% | Approved by manager |
| 10:08 AM | send_email | email | approved | 4% | |
| 10:12 AM | send_email | email | approved | 9% | |
| 10:15 AM | send_email | email | blocked | 65% | Block unverified claims |
| 10:16 AM | send_email | email | approved | 11% | Retried with safe rewrite |

7 log entries. 4 approved, 1 escalated (QA Reviewer waited → approved by manager), 2 blocked (each retried after a rewrite). Zero wrong claims sent to prospects.

Pattern

Workflow Resumption After Escalation

When an action is escalated, the agent can use GroundTruthAwaitApprovalTool to pause and wait for the human reviewer's decision, then resume automatically. This enables multi-step workflows that survive escalation without dead-ending.

How it works

  1. Agent calls GroundTruth Execute — gets ESCALATED
  2. Agent calls GroundTruth Await Approval with the execution ID
  3. Tool polls GET /api/approvals/{id} with exponential backoff (2s → 30s cap) until resolved or timeout (default 1 hour)
  4. On APPROVED — agent proceeds with the next step
  5. On REJECTED — agent revises or informs the user
  6. On TIMEOUT — agent can retry later or inform the user
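
The steps above can be sketched as a polling loop. fetch_status is an injected callable that would wrap GET /api/approvals/{id} in real use; the doubling factor and the lowercase status strings it returns are assumptions, and sleep and clock are injectable so the loop can be tested without waiting.

```python
import time


def wait_for_decision(fetch_status, timeout=3600.0, poll_interval=2.0,
                      max_poll_interval=30.0, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns a final status or the timeout elapses.

    fetch_status returns "pending", "approved", or "rejected" (assumed values).
    Returns "APPROVED", "REJECTED", or "TIMEOUT".
    """
    deadline = clock() + timeout
    interval = poll_interval
    while True:
        status = fetch_status()
        if status in ("approved", "rejected"):
            return status.upper()
        remaining = deadline - clock()
        if remaining <= 0:
            return "TIMEOUT"
        # Never sleep past the deadline; double the wait up to the cap.
        sleep(min(interval, remaining))
        interval = min(interval * 2, max_poll_interval)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system wall clock changes during a long review.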

Code

python
from crewai import Agent, Task, Crew
from groundtruth.crewai import GroundTruthExecuteTool, GroundTruthAwaitApprovalTool

execute = GroundTruthExecuteTool(agent_name="support-crew", session_id="sess_123")
await_approval = GroundTruthAwaitApprovalTool()

agent = Agent(
    role="Support Agent",
    goal="Handle tickets with human oversight for sensitive actions",
    backstory=(
        "You handle support tickets. Submit actions via GroundTruth Execute. "
        "If an action is ESCALATED, use GroundTruth Await Approval with the "
        "execution ID and wait for the human decision before proceeding."
    ),
    tools=[execute, await_approval],
)

submit_task = Task(
    description=(
        "Reply to ticket #7890 about the customer's refund request. "
        "Use GroundTruth Execute with action='reply_ticket'."
    ),
    expected_output="The execution result.",
    agent=agent,
)

followup_task = Task(
    description=(
        "If the previous action was ESCALATED, use GroundTruth Await Approval "
        "with the execution ID to wait for the human decision. "
        "If approved, confirm to the customer. If rejected, revise the reply."
    ),
    expected_output="Final status after human review.",
    agent=agent,
    context=[submit_task],
)

crew = Crew(agents=[agent], tasks=[submit_task, followup_task])
crew.kickoff()

What happens at runtime

Step 1: Action escalated

The agent submits a reply mentioning a refund. Policy "Escalate refund mentions" triggers. Escalated — execution ID returned.

Step 2: Agent waits

The agent calls GroundTruth Await Approval with the execution ID. The tool polls with exponential backoff (2s → 30s), reusing the HTTP connection, while a human reviews in the dashboard.

Step 3: Approved and resumed

A manager approves the action. The tool returns APPROVED. The agent proceeds to send a confirmation to the customer.

When your org has a webhook URL configured, GroundTruth also fires execution.approved and execution.rejected events when a human reviews an approval, so external orchestrators can react without polling.
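
An external orchestrator might dispatch on these events roughly as follows. Only the event names execution.approved and execution.rejected come from the text above; the payload keys type and execution_id are hypothetical, so check the actual webhook payload before using this shape.

```python
def handle_webhook(event):
    """Map a webhook event to the orchestrator's next step.

    The payload keys "type" and "execution_id" are hypothetical; only the
    event names are taken from the documentation.
    """
    handlers = {
        "execution.approved": lambda e: f"resume:{e['execution_id']}",
        "execution.rejected": lambda e: f"revise:{e['execution_id']}",
    }
    handler = handlers.get(event.get("type"))
    # Unknown event types are ignored rather than treated as errors.
    return handler(event) if handler else "ignored"
```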

Advanced: Using the Client Directly

If you need programmatic access outside of CrewAI agents (e.g., in custom callbacks or middleware), use the client directly:

python
from groundtruth import GroundTruthClient

client = GroundTruthClient(api_key="hg_sk_...")

# Execute an action
result = client.execute(
    action="send_email",
    params={"to": "prospect@acme.com", "subject": "Hello", "body": "..."},
    content="Your email text here for verification.",
    channel="email",
    agent="my-script",
)
print(result["decision"])  # "approved" | "blocked" | "escalated"

# Verify text without executing
check = client.verify(answer="Our platform handles 10M requests per day.")
print(check["riskScore"])

# List and review approvals
approvals = client.list_approvals()
client.review_approval("exec_789", "approve")

# Poll approval status (used by AwaitApprovalTool internally)
status = client.get_approval_status("exec_789")
print(status["approval"]["status"])  # "pending" | "approved" | "rejected"
