Testing Guide

How to Test Your ACP Endpoints: A Step-by-Step Guide


You've implemented the Agentic Commerce Protocol. Your Shopify store has an ACP endpoint. Stripe's Agentic Commerce Suite is configured. But how do you know it actually works when a real AI agent tries to buy something?

This guide walks through testing your ACP endpoints systematically — from basic API validation to full synthetic agent simulation.


Before You Start: What You Need

  • Your ACP endpoint URL (e.g., https://api.yourstore.com/acp/v1)
  • A sandbox API key (Bearer token)
  • Your HMAC signing key (for request signatures)
  • A test payment token from Stripe sandbox (tok_visa works)
  • cURL, Postman, or any HTTP client
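The HMAC signing key above is used to sign request bodies. The sketch below shows one plausible way to compute and verify such a signature. It assumes the signature is an HMAC-SHA256 over the raw body bytes, base64-encoded. The exact canonicalization, header name, and any timestamp component depend on your integration and are not taken from this guide. The function names are hypothetical.

```python
import base64
import hashlib
import hmac


def sign_body(signing_key: str, body: bytes) -> str:
    """Return a base64-encoded HMAC-SHA256 signature of the raw request body."""
    digest = hmac.new(signing_key.encode("utf-8"), body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_body(signing_key: str, body: bytes, signature: str) -> bool:
    """Check a received signature against the body using a constant-time compare."""
    expected = sign_body(signing_key, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Sign the exact bytes you send. Re-serializing the JSON after signing changes whitespace or key order and invalidates the signature.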

Level 1: Basic API Validation (10 minutes)

Before simulating agents, verify the raw API works.

Test 1: Create a Checkout Session

curl -X POST https://api.yourstore.com/acp/v1/checkout_sessions \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "Content-Type: application/json" \
  -H "API-Version: 2026-01-30" \
  -H "Idempotency-Key: test-$(date +%s)" \
  -d '{
    "items": [{ "id": "YOUR_SKU", "quantity": 1 }],
    "buyer": {
      "first_name": "Test",
      "last_name": "Buyer",
      "email": "test@example.com"
    },
    "fulfillment_address": {
      "name": "Test Buyer",
      "line_one": "123 Main St",
      "city": "San Francisco",
      "state": "CA",
      "country": "US",
      "postal_code": "94105"
    }
  }'

What to check:

Check               | Expected                                   | If it fails
--------------------|--------------------------------------------|------------------------------------------------
Status code         | 201 Created                                | Your endpoint isn't creating sessions correctly
status field        | not_ready_for_payment or ready_for_payment | State machine is wrong
line_items          | Non-empty array with correct SKU           | Product lookup is broken
line_items[0].total | Greater than 0 (in minor units)            | Pricing calculation is wrong
fulfillment_options | Non-empty array                            | Shipping options aren't exposed
totals              | Array includes subtotal, tax, total        | Missing totals calculation
payment_handlers    | At least one handler declared              | Payment config missing
currency            | "usd" (lowercase ISO-4217)                 | Currency formatting wrong
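These checks can be scripted so you don't have to eyeball the JSON. The sketch below assumes the response body is already parsed into a dict and that each entry in totals carries a "type" key naming it (subtotal, tax, total). That key name is an assumption, so adjust it to your schema. The SKU comparison is left out because the location of the SKU inside a line item varies by implementation.

```python
def check_create_response(status_code: int, body: dict) -> list:
    """Return a list of failure messages; an empty list means every check passed."""
    failures = []
    if status_code != 201:
        failures.append(f"status code {status_code}, expected 201")
    if body.get("status") not in ("not_ready_for_payment", "ready_for_payment"):
        failures.append(f"unexpected status field: {body.get('status')!r}")

    line_items = body.get("line_items") or []
    if not line_items:
        failures.append("line_items is empty or missing")
    else:
        total = line_items[0].get("total")
        # Minor units are integers (4999 means $49.99).
        if not (isinstance(total, int) and total > 0):
            failures.append("line_items[0].total is not a positive integer")

    if not body.get("fulfillment_options"):
        failures.append("fulfillment_options is empty or missing")

    # Assumption: each totals entry identifies itself through a "type" key.
    total_types = {entry.get("type") for entry in body.get("totals") or []}
    for required in ("subtotal", "tax", "total"):
        if required not in total_types:
            failures.append(f"totals is missing a {required!r} entry")

    if not body.get("payment_handlers"):
        failures.append("payment_handlers is empty or missing")

    currency = body.get("currency")
    if not (isinstance(currency, str) and len(currency) == 3 and currency.islower()):
        failures.append(f"currency {currency!r} is not a lowercase 3-letter code")
    return failures
```

Feed it the status code and parsed body from Test 1 and print whatever comes back.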

Test 2: Idempotency

Send the exact same request with the same Idempotency-Key:

# Send twice with identical key
curl -X POST https://api.yourstore.com/acp/v1/checkout_sessions \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "Content-Type: application/json" \
  -H "API-Version: 2026-01-30" \
  -H "Idempotency-Key: idempotency-test-001" \
  -d '{"items": [{"id": "YOUR_SKU", "quantity": 1}], "buyer": {"email": "test@example.com"}}'

# Same request, same key — should return same session
curl -X POST https://api.yourstore.com/acp/v1/checkout_sessions \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "Content-Type: application/json" \
  -H "API-Version: 2026-01-30" \
  -H "Idempotency-Key: idempotency-test-001" \
  -d '{"items": [{"id": "YOUR_SKU", "quantity": 1}], "buyer": {"email": "test@example.com"}}'

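Expected: Both responses return the same id. If you get two different session IDs, your idempotency implementation is broken. Agents retry after timeouts and dropped connections, so a broken implementation creates duplicate sessions and, at worst, duplicate orders. This is the bug that silently killed conversion on OpenAI's Instant Checkout.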
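The comparison is easy to automate. The sketch below takes a hypothetical send callable (your own wrapper that posts the create request with a given Idempotency-Key and returns the parsed body) and checks that two calls with one key yield the same session id.

```python
from typing import Callable


def check_idempotency(send: Callable[[str], dict],
                      key: str = "idempotency-test-001") -> bool:
    """Send the same request twice with one key and compare the session ids."""
    first = send(key)
    second = send(key)
    first_id = first.get("id")
    # A missing id counts as a failure rather than a match of None == None.
    return first_id is not None and first_id == second.get("id")
```

Passing the transport in as a callable keeps the check independent of the HTTP client you use, and lets you run it against a stub before pointing it at the sandbox.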

Test 3: Error Responses

# Request with an invalid SKU
curl -X POST https://api.yourstore.com/acp/v1/checkout_sessions \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "Content-Type: application/json" \
  -H "API-Version: 2026-01-30" \
  -H "Idempotency-Key: error-test-$(date +%s)" \
  -d '{"items": [{"id": "NONEXISTENT_SKU", "quantity": 1}], "buyer": {"email": "test@example.com"}}'

Expected: 400 Bad Request with a proper ACP error:

{
  "type": "error",
  "code": "invalid",
  "message": "Product NONEXISTENT_SKU not found",
  "param": "items[0].id"
}

If you get a 500 Internal Server Error: Your endpoint is throwing unhandled exceptions on invalid input. This is the most common ACP implementation bug. Agents receive an error they cannot act on and silently abandon the checkout.
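A small checker can tell a well-formed error apart from an unhandled exception. The sketch below assumes the error body is a flat object with string type, code, and message fields, matching the example in this test. It treats param as optional but requires a string when present, which is an assumption about how strict you want to be.

```python
REQUIRED_ERROR_FIELDS = ("type", "code", "message")


def check_error_response(status_code: int, body: dict) -> list:
    """Return a list of failure messages for an expected-error response."""
    failures = []
    if status_code >= 500:
        failures.append(f"server error {status_code}: invalid input was not handled")
    elif not 400 <= status_code < 500:
        failures.append(f"status code {status_code}, expected a 4xx error")

    for field in REQUIRED_ERROR_FIELDS:
        value = body.get(field)
        if not isinstance(value, str) or not value:
            failures.append(f"error field {field!r} is missing or not a string")

    # Assumption: param is optional, but must be a string when present.
    if "param" in body and not isinstance(body["param"], str):
        failures.append("error field 'param' is not a string")
    return failures
```

An empty list means the agent received something it can act on.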


Level 2: State Machine Validation (20 minutes)

The ACP checkout is a state machine. Each transition has rules. Test that your states work correctly.

The ACP State Machine

                    ┌──────────────────┐
                    │  not_ready_for   │
    create ────────▶│    _payment      │◀──── update (missing fields)
                    └────────┬─────────┘
                             │ update (all fields provided)
                             ▼
                    ┌──────────────────┐
                    │  ready_for       │
    update ────────▶│    _payment      │◀──── update
                    └────────┬─────────┘
                             │ complete
                             ▼
                    ┌──────────────────┐
                    │   completed      │
                    └──────────────────┘

    Any non-terminal state ────▶ canceled (via cancel endpoint)
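The diagram can be encoded as a transition table that your tests consult. The sketch below is one reading of it. It treats completed and canceled as terminal, in line with the checklist's "any non-terminal state" wording. It also assumes an update may move a ready session back to not_ready_for_payment, which the diagram does not spell out.

```python
TERMINAL_STATES = {"completed", "canceled"}

# (state before, action) -> states the session may be in afterwards
TRANSITIONS = {
    ("not_ready_for_payment", "update"): {"not_ready_for_payment", "ready_for_payment"},
    # Assumption: an update can invalidate a session that was previously ready.
    ("ready_for_payment", "update"): {"not_ready_for_payment", "ready_for_payment"},
    ("ready_for_payment", "complete"): {"completed"},
}


def is_valid_transition(before: str, action: str, after: str) -> bool:
    """Return True if moving from `before` to `after` via `action` is allowed."""
    if before in TERMINAL_STATES:
        return False  # nothing may change a completed or canceled session
    if action == "cancel":
        return after == "canceled"
    return after in TRANSITIONS.get((before, action), set())
```

For example, is_valid_transition("not_ready_for_payment", "complete", "completed") is False, which is the case Test 5 exercises.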

Test 4: Full Happy Path

Walk through the complete flow:

  1. Create — Verify state is not_ready_for_payment
  2. Update with fulfillment selection — Verify state transitions to ready_for_payment
  3. Complete with payment token — Verify state is completed and order_id is present
  4. Get the completed session — Verify it returns the final state
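The four steps above can be scripted as a single runner. The sketch below assumes a hypothetical client object exposing create(), update(session_id, option_id), complete(session_id, payment_token), and get(session_id), each returning the parsed session. You would write that wrapper around your own HTTP calls.

```python
def run_happy_path(client, payment_token: str = "tok_visa") -> list:
    """Walk create -> update -> complete -> get and return a list of failures."""
    failures = []

    session = client.create()
    if session.get("status") != "not_ready_for_payment":
        failures.append(f"create: unexpected status {session.get('status')!r}")

    # Select the first offered fulfillment option (an arbitrary choice for the test).
    options = session.get("fulfillment_options") or []
    if not options:
        return failures + ["create: no fulfillment_options to select"]
    session = client.update(session["id"], options[0]["id"])
    if session.get("status") != "ready_for_payment":
        failures.append(f"update: unexpected status {session.get('status')!r}")

    session = client.complete(session["id"], payment_token)
    if session.get("status") != "completed":
        failures.append(f"complete: unexpected status {session.get('status')!r}")
    if not session.get("order_id"):
        failures.append("complete: order_id is missing")

    final = client.get(session["id"])
    if final.get("status") != "completed":
        failures.append(f"get: unexpected status {final.get('status')!r}")
    return failures
```

An empty list means the flow behaved as described in steps 1 to 4.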

Test 5: Invalid State Transitions

Try completing a session that's not_ready_for_payment:

# Create a session without fulfillment selection (stays not_ready_for_payment),
# then immediately try to complete it. SESSION_ID holds the id from the create response.
curl -X POST "https://api.yourstore.com/acp/v1/checkout_sessions/$SESSION_ID/complete" \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "Content-Type: application/json" \
  -H "API-Version: 2026-01-30" \
  -H "Idempotency-Key: complete-test-$(date +%s)" \
  -d '{"payment_data": {"handler_id": "card_tokenized", ...}}'

Expected: 400 or 422 error — not a 200 that creates an incomplete order.

Test 6: Cancel Flow

# Create a session, then cancel it (SESSION_ID holds the id from the create response)
curl -X POST "https://api.yourstore.com/acp/v1/checkout_sessions/$SESSION_ID/cancel" \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "API-Version: 2026-01-30"

Expected: 200 with status: "canceled". Then verify:

  • GET the canceled session — still returns canceled (not deleted)
  • Try to update the canceled session — should return 405 Method Not Allowed
  • Try to complete the canceled session — should return 405

Level 3: Edge Case Testing (30 minutes)

This is where most implementations break. Real agents trigger these scenarios constantly.

Test 7: Out-of-Stock Mid-Checkout

  1. Create a session with an item that has low inventory
  2. Wait (or manually reduce inventory via your admin)
  3. Try to complete the checkout

What you're looking for: Does the endpoint return a clear out_of_stock error with the specific item ID? Or does it 500?

Test 8: Price Change During Session

  1. Create a session with an item at $49.99
  2. Change the item's price to $59.99 via your admin
  3. Complete the checkout

Question: Which price does the customer pay? The session should either: (a) lock the price at creation time, or (b) return an error on complete indicating the price changed. What it should NOT do: silently charge the new price without informing the agent.
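The three possible outcomes can be classified mechanically. The sketch below works in minor units (4999 for $49.99) and takes the price quoted at session creation, the amount actually charged (None if nothing was charged), and the error code returned on complete (None if there was none). The outcome labels are made up for this example.

```python
from typing import Optional


def classify_price_outcome(quoted: int, charged: Optional[int],
                           error_code: Optional[str]) -> str:
    """Classify what the endpoint did after a mid-session price change."""
    if error_code is not None:
        return "rejected"        # acceptable: the agent was told something changed
    if charged == quoted:
        return "price_locked"    # acceptable: the session kept its original price
    return "silent_reprice"      # the failure this test is looking for
```

Only "silent_reprice" should fail the test. Both other outcomes match the behaviors described above.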

Test 9: International Address

curl -X POST https://api.yourstore.com/acp/v1/checkout_sessions \
  -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
  -H "Content-Type: application/json" \
  -H "API-Version: 2026-01-30" \
  -H "Idempotency-Key: intl-test-$(date +%s)" \
  -d '{
    "items": [{"id": "YOUR_SKU", "quantity": 1}],
    "buyer": {"email": "test@example.com"},
    "fulfillment_address": {
      "name": "Test Buyer",
      "line_one": "10 Downing Street",
      "city": "London",
      "state": "",
      "country": "GB",
      "postal_code": "SW1A 2AA"
    }
  }'

If you don't ship internationally: You should get a clear error: {"type": "error", "code": "address_invalid", "message": "We do not ship to GB"}. Not a 500. Not a silent acceptance that creates an unfulfillable order.

Test 10: Quantity Limits

Try ordering 999 of the same item. Try ordering 0. Try ordering -1. Your endpoint should validate quantity bounds and return meaningful errors.
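On the server side, the bounds check might look like the sketch below. The per-order maximum of 10, the "invalid" error code, and the param path are all assumptions for illustration. Use whatever limits and codes your implementation defines.

```python
from typing import Optional


def validate_quantity(quantity, max_per_order: int = 10) -> Optional[dict]:
    """Return an error body for a bad quantity, or None if it is acceptable."""
    def error(message: str) -> dict:
        return {"type": "error", "code": "invalid",
                "message": message, "param": "items[0].quantity"}

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        return error("Quantity must be an integer")
    if quantity < 1:
        return error("Quantity must be at least 1")
    if quantity > max_per_order:
        return error(f"Quantity must not exceed {max_per_order}")
    return None
```

With these defaults, 999, 0, and -1 each produce an error body, and 1 passes.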

Test 11: Session Timeout

Create a session and don't touch it for 30+ minutes. Then try to complete it. Does your endpoint handle stale sessions gracefully? The ACP spec doesn't mandate a timeout, but real implementations should expire sessions to prevent inventory lock-up.
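A minimal expiry check could look like the sketch below. The 30-minute lifetime mirrors the wait in this test and is an assumption, since the spec does not mandate a value. Passing the current time in as a parameter lets you test the boundary without actually waiting.

```python
from datetime import datetime, timedelta, timezone

SESSION_TTL = timedelta(minutes=30)  # assumed lifetime, not mandated by the spec


def is_expired(created_at: datetime, now: datetime,
               ttl: timedelta = SESSION_TTL) -> bool:
    """Return True once a session has lived for at least `ttl`."""
    return now - created_at >= ttl


created = datetime(2026, 1, 30, 12, 0, tzinfo=timezone.utc)
print(is_expired(created, created + timedelta(minutes=45)))
```

When the check returns True, complete should answer with a clear error rather than a 500 or a successful order.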


Level 4: Performance Testing (15 minutes)

Agents have timeout thresholds. If your endpoint takes too long, the agent abandons.

Test 12: Latency Profile

Run 20 requests against each endpoint and measure response times:

# Simple latency test: prints one total time (in seconds) per request
for i in $(seq 1 20); do
  curl -s -o /dev/null -w "%{time_total}\n" \
    -H "Authorization: Bearer YOUR_SANDBOX_KEY" \
    "https://api.yourstore.com/acp/v1/checkout_sessions/$SESSION_ID"
done

Target thresholds:

Endpoint          | p50    | p95    | Max acceptable
------------------|--------|--------|---------------
create_checkout   | <500ms | <1.5s  | 3s
get_checkout      | <200ms | <500ms | 1s
update_checkout   | <500ms | <1.5s  | 3s
complete_checkout | <1s    | <3s    | 5s

If your complete_checkout p95 exceeds 3 seconds, agents will start timing out. This was a documented issue in the Alpine Gear audit — 8.9 second p99 on complete_checkout was causing silent session failures.
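To compare measured timings against the target thresholds, you can compute percentiles from the collected samples. The sketch below uses the nearest-rank method, one of several percentile definitions, and encodes the targets in seconds. With only 20 samples the p95 is a rough estimate.

```python
import math

# endpoint -> (p50 target, p95 target, max acceptable), all in seconds
THRESHOLDS_S = {
    "create_checkout": (0.5, 1.5, 3.0),
    "get_checkout": (0.2, 0.5, 1.0),
    "update_checkout": (0.5, 1.5, 3.0),
    "complete_checkout": (1.0, 3.0, 5.0),
}


def percentile(samples: list, pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def within_targets(endpoint: str, samples: list) -> bool:
    """Return True if the samples meet the p50, p95, and max targets."""
    p50_max, p95_max, hard_max = THRESHOLDS_S[endpoint]
    return (percentile(samples, 50) < p50_max
            and percentile(samples, 95) < p95_max
            and max(samples) <= hard_max)
```

Pass in the list of time_total values your latency run printed, converted to floats.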


Level 5: Synthetic Agent Simulation (The Full Test)

This is what AgentCheck automates — but here's how to do a basic version manually.

DIY Synthetic Agent Test

Use Claude or GPT-4o with tool-use to simulate a shopping session:

System prompt: You are testing an e-commerce checkout endpoint.
Your goal: buy one pair of running shoes under $100.

You have these tools:
- create_checkout(items, buyer, address)
- update_checkout(session_id, fulfillment_option_id)
- complete_checkout(session_id, payment_token)
- get_checkout(session_id)

Walk through a complete purchase. At each step, examine the response
and report anything unusual: missing fields, unexpected errors, slow
responses, incorrect pricing.

Use this buyer info:
- Name: Alex Test
- Email: alex@test.com
- Address: 456 Oak Ave, Portland, OR 97201

Use payment token: tok_visa

Give the LLM your endpoint URL and credentials, let it run. The agent's "unusual findings" are your bug report.
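The four tools in the prompt need machine-readable definitions before an LLM can call them. The sketch below builds them in the name / description / input_schema layout that Anthropic's Messages API uses for tools. Other providers wrap the same JSON Schema differently. The schemas are deliberately loose, and the descriptions and helper name are assumptions. You still have to route each tool call to your own HTTP wrapper.

```python
def tool(name: str, description: str, properties: dict) -> dict:
    """Build one tool definition in which every listed property is required."""
    return {
        "name": name,
        "description": description,
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        },
    }


STRING = {"type": "string"}

TOOLS = [
    tool("create_checkout", "Create a checkout session.", {
        "items": {"type": "array", "items": {"type": "object"}},
        "buyer": {"type": "object"},
        "address": {"type": "object"},
    }),
    tool("update_checkout", "Select a fulfillment option for a session.", {
        "session_id": STRING,
        "fulfillment_option_id": STRING,
    }),
    tool("complete_checkout", "Complete a session with a payment token.", {
        "session_id": STRING,
        "payment_token": STRING,
    }),
    tool("get_checkout", "Fetch the current state of a session.", {
        "session_id": STRING,
    }),
]
```

Keeping the schemas loose is intentional here: part of the test is seeing whether the agent can build a valid request from your responses and error messages.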

What the Agent Will Catch That You Won't

  • Fields missing from your responses that no human would notice (like display_text on totals)
  • Error messages that are technically correct but incomprehensible to an AI agent
  • State transitions that work but are semantically confusing
  • Fulfillment options that are listed but can't actually be selected
  • Payment handlers declared but not actually functional

The AgentCheck Checklist

Before going live, verify every item:

Protocol Compliance

  • All responses match ACP spec v2026-01-30 JSON schema
  • Status codes are correct (201 on create, 200 on update/complete/get)
  • Error responses follow { type, code, message, param } format
  • Idempotency-Key is properly implemented
  • HMAC-SHA256 signature verification works
  • Bearer token authentication works (and rejects invalid tokens)

Checkout Flow

  • Create → Update → Complete happy path works end-to-end
  • State machine transitions are correct
  • Cancel works from any non-terminal state
  • Invalid state transitions return errors (not silent failures)

Data Quality

  • Product prices match your live storefront
  • Inventory is current (not stale by >5 minutes)
  • Tax calculation is correct for test addresses
  • Fulfillment options include cost and delivery window

Edge Cases

  • Out-of-stock returns proper out_of_stock error code
  • Invalid addresses return address_invalid error code
  • Payment decline returns payment_declined error code
  • Quantity limits are validated
  • International addresses handled (accept or reject gracefully)

Performance

  • All endpoints respond within 3 seconds at p95
  • complete_checkout responds within 5 seconds at p99
  • No latency regression under concurrent requests

Don't want to run these tests manually? AgentCheck automates all of this →

