Integration Testing

Deploy → run comprehensive checks → teardown, end to end on real Cloudflare infrastructure

Overview
The integration test runs a complete deploy → smoke test → teardown cycle against a real Cloudflare deployment. It deploys the full platform (6 workers, D1, KV, dispatch namespace), creates test users, pushes a fixture project, runs ordered verification steps covering every major feature, then tears everything down. No Cloudflare component is mocked, so the test exercises the actual production code path. Third-party services such as Discord and Resend are the exception: they are replaced by fake services.
01 Deploy: 6 workers + infra
02 Smoke Test: many steps × N
03 Teardown: delete everything
Run It
npm run cli-infra -- integration-test 3
Deploys, runs 3 concurrent smoke tests, tears down.
Private Only
Tests refuse --prod. They only run against private deployments prefixed with PRIVATE_PLATFORM.
Always Tears Down
Teardown runs in a finally block, so even if tests fail, resources are always cleaned up.
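The guarantee can be sketched as a try/finally wrapper. This is a minimal illustration, not the project's actual runner: runIntegration and its three callbacks are hypothetical names, and placing the deploy phase inside the try block is an assumption.

```typescript
// Hypothetical phase callbacks; the real runner's functions are not shown in this document.
type Phase = () => Promise<void>;

async function runIntegration(deploy: Phase, smokeTest: Phase, teardown: Phase): Promise<void> {
  try {
    await deploy();
    await smokeTest();
  } finally {
    // Runs whether the phases above resolved or threw.
    await teardown();
  }
}
```

The original error still propagates to the caller after teardown completes, so a failing smoke test still fails the run.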
Deploy
The deploy step provisions the entire Fling platform on Cloudflare. All resources are prefixed with the PRIVATE_PLATFORM value to avoid colliding with production or other developers' deployments.
Step 1: Validate environment
→ provision storage
  • D1: {prefix}-platform
  • KV: {prefix}-usage
  • Namespace: {prefix}-users
→ run D1 migrations
Config: Generate wrangler.toml for each worker
→ deploy in order
  • {prefix}-discord: Plugin worker
  • {prefix}-slack: Plugin worker
  • {prefix}-api: Platform API
  • {prefix}-dispatch: Request router
  • {prefix}-cron: Cron scheduler
  • {prefix}-email-inbound: Email receiver
→ set secrets + fake services
Ready: Poll /health until API responds
Dependency order matters. Plugin workers are deployed first because the API worker has service bindings (DISCORD_PLUGIN, SLACK_PLUGIN) to them. If deployed in the wrong order, the API worker would fail to bind and reject deploy requests.
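The ordering constraint can be expressed as a small check. The sketch below is illustrative only: workerNames and bindingsSatisfied are hypothetical helpers, and the suffix list simply mirrors the deploy order listed in this section.

```typescript
// Deploy order from this section: plugin workers first, then the API worker that binds to them.
const DEPLOY_ORDER = ["discord", "slack", "api", "dispatch", "cron", "email-inbound"] as const;

// Hypothetical helper: expands the PRIVATE_PLATFORM prefix into worker names.
function workerNames(prefix: string): string[] {
  return DEPLOY_ORDER.map((suffix) => `${prefix}-${suffix}`);
}

// Hypothetical check: every binding target must appear before the worker that binds to it.
function bindingsSatisfied(order: string[], worker: string, targets: string[]): boolean {
  const at = order.indexOf(worker);
  return at !== -1 && targets.every((target) => {
    const i = order.indexOf(target);
    return i !== -1 && i < at;
  });
}

const names = workerNames("my-fling");
console.log(names.join(" -> "));
console.log(bindingsSatisfied(names, "my-fling-api", ["my-fling-discord", "my-fling-slack"]));
```

Reversing the list makes the check fail, which is the situation the dependency note warns about.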
Smoke Test Steps
Each smoke test instance creates a test user, scaffolds a project with fixtures, deploys it, and runs comprehensive ordered verification steps. Every major Fling feature is exercised against real Cloudflare infrastructure.
Setup
01 Check Prereqs: CLI build exists
02 API Reachable: GET /health returns ok
03 Cleanup Users: Remove stale test users
04 Create User: st-{id}@test.com + token
05 Setup Project: Init, login, copy fixtures
06 Set Secrets: fling secrets set TEST_SECRET

Local Dev
07 Dev Server: Shared worker tests + cron, email

Deploy & Auth
08 Deploy Push: fling it → live URL
09 Resend Setup: Fake email service tenant
10 Discord Setup: Fake tenant + guild
11 Verify Whoami: CLI auth state check

HTTP & Frontend (shared tests via WorkerTestEnv, same logic as step 07)
12 Health Check: Worker health + secret
13 Frontend Test: Playwright browser check
14 Static Assets: /logo.svg serves correctly
15 SPA Fallback: Unknown routes → index.html

Backend Features (shared tests via WorkerTestEnv, same logic as step 07)
16 API Tests: Todo CRUD + SQL verify
17 WASM Test: 5 + 7 = 12 via WASM
18 Storage Tests: R2 upload, download, list, delete
19 Presigned URLs: Direct R2 upload + download

Integrations
20 Discord Tests: Commands, messages, reactions
21 Email Tests: Receive + parse + store
22 Email Verification: Signup → verify → confirmed
23 Unverified Deploy: Must fail for unverified user

Cron & Ops
24 Cron Tests: Register, trigger, history, errors
25 Logs Test: fling logs --prod
26 Feedback Test: fling feedback submit
27 Usage Test: GraphQL analytics query

Multi-Project & Lifecycle
28 Multi-Project: Second project, both work
29 Signup Flow: Signup → verify
30 Slug Change: Rename + redirect check
31 Project Takedown: Delete + verify cleanup
Fixture project. Each smoke test instance deploys a purpose-built fixture with a Hono backend (todos, health, WASM, storage, presigned URLs, Discord, email handlers), a React frontend (displays secrets, uses Tailwind), a WASM module (add.wasm), and a static asset (logo.svg). The fixture exercises every Fling runtime feature.
Stress Test
The stress test runs N smoke test instances in parallel to verify the platform handles concurrent users. Each instance gets its own user, project, and ports, but they share fake service tenants to test real multi-tenant behavior.
Prepare: Build CLI + generate N random IDs
→ create shared tenants
  • Discord: 1 tenant, N guilds
  • Resend: From deploy
→ launch N in parallel
  • Instance 0: ports 7654 / 8765
  • Instance 1: ports 7664 / 8775
  • Instance N: ports +10 each
→ await all
Summary: N/N passed, then teardown test users
Port Isolation
Each instance gets unique local ports: backend 7654 + (i × 10), frontend 8765 + (i × 10). This lets multiple dev servers run simultaneously without port conflicts.
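The port formula is simple enough to write down directly. In this sketch the base ports and the stride of 10 come from the text above. The function and constant names are hypothetical.

```typescript
const BACKEND_BASE = 7654;
const FRONTEND_BASE = 8765;
const PORT_STRIDE = 10;

// Hypothetical helper: local ports for the i-th parallel smoke test instance (0-based).
function portsForInstance(i: number): { backend: number; frontend: number } {
  if (!Number.isInteger(i) || i < 0) {
    throw new RangeError(`instance index must be a non-negative integer, got ${i}`);
  }
  return {
    backend: BACKEND_BASE + i * PORT_STRIDE,
    frontend: FRONTEND_BASE + i * PORT_STRIDE,
  };
}

console.log(portsForInstance(0)); // backend 7654, frontend 8765
console.log(portsForInstance(1)); // backend 7664, frontend 8775
```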
Test User Patterns
Main user: st-{id}@test.com, signup: signup-{id}@test.com, email verify: ev-{id}@test.com. Cleanup matches the st-*, inv-*, ev-* prefixes.
Diagnostics
When a smoke test step fails, the runner automatically dumps diagnostic information to help identify the root cause. Failures are reported with full context — what was expected, what was received, and what the system state looks like.
On Failure
The dumpDiagnostics function runs automatically when any step throws. It collects five categories of information to aid debugging.
Worker Logs
fling logs --prod --since 5m
Last 50 log entries from the deployed worker.
Database Tables
Queries sqlite_master to list all tables, confirming the schema was applied.
Cron Jobs
fling cron list --prod
Shows registered cron jobs and their schedules.
Cron History
fling cron history for the test cron job. Shows invocation timestamps, success/error counts, and error messages.
Dispatcher Diagnostics
Direct /diagnostics endpoint on the cron worker (private only). Shows isDue, nextRun, lastScheduledFor.
Retry Logic
HTTP requests go through fetchWithRetry, which retries with exponential backoff. The health check polls up to 30 times with delays of 2–10s between attempts. Steps that depend on deployment propagation include built-in delays of 3–5s.
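A capped exponential backoff consistent with those numbers might look like the sketch below. It is not the project's fetchWithRetry: backoffDelayMs and pollWithBackoff are hypothetical names, and the doubling factor is an assumption chosen so that delays start at 2s and cap at 10s.

```typescript
// Assumed schedule: start at 2s, double on each attempt, cap at 10s.
function backoffDelayMs(attempt: number, baseMs = 2000, capMs = 10000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical poller: calls `check` until it returns true or attempts run out.
// Resolves with the number of attempts used. `sleep` is injectable for testing.
async function pollWithBackoff(
  check: () => Promise<boolean>,
  maxAttempts = 30,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<number> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await check()) return attempt + 1;
    // No need to wait after the final failed attempt.
    if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
  }
  throw new Error(`not ready after ${maxAttempts} attempts`);
}
```

Under these assumptions the delays run 2s, 4s, 8s, then 10s for every later attempt.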
Preserving Evidence
On failure: test directory and token are preserved for manual debugging. On success: test dirs are cleaned up. Use --no-cleanup to always preserve.
Teardown
Teardown deletes all Cloudflare resources created during deploy. It runs in the integration test's finally block, so it always executes even if tests fail. Every step is non-fatal — individual failures are logged as warnings, and teardown continues.
Start: Begin teardown (non-fatal)
  • DNS records
  • Custom hostname
→ delete workers
  • API
  • Dispatch
  • Cron
  • Email
  • Discord
  • Slack
→ delete namespace
  • D1 databases
  • KV namespace
  • R2 buckets
Done: All resources deleted
Non-fatal by design. If deleting the D1 database fails (maybe it's already gone), teardown logs a warning and continues to delete KV, R2, and everything else. One failed deletion therefore never blocks the remaining ones, which keeps leftover resources to a minimum. The teardown also refuses --prod as a safety measure.
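The non-fatal pattern can be sketched as a loop that catches each step's error, logs it, and moves on. TeardownStep and runTeardown are hypothetical names used for illustration. The real teardown code is not shown in this document.

```typescript
interface TeardownStep {
  name: string;
  run: () => Promise<void>;
}

// Runs every step in order. A failing step is logged as a warning and never
// stops the steps after it. Resolves with the names of the steps that failed.
async function runTeardown(
  steps: TeardownStep[],
  warn: (message: string) => void = console.warn,
): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      warn(`teardown: ${step.name} failed (${reason}), continuing`);
      failed.push(step.name);
    }
  }
  return failed;
}
```

Returning the failed step names lets a caller print a summary of anything that may need manual cleanup.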
Configuration
Integration tests are configured via .env variables. The PRIVATE_PLATFORM prefix isolates all resources so multiple developers can test simultaneously without conflicts.
Required
Core Variables
  • CLOUDFLARE_ACCOUNT_ID: your Cloudflare account ID
  • CLOUDFLARE_API_TOKEN: API token with Workers/D1/KV/R2 access
  • ADMIN_KEY: admin API authentication key
  • PRIVATE_PLATFORM: resource prefix (e.g., my-fling)
Optional
Fake Services & Extras
  • FAKE_DISCORD_URL: mock Discord API
  • FAKE_RESEND_URL: mock Resend API
  • SLACK_API_TOKEN: Slack notifications
  • R2_ACCESS_KEY_ID: R2 storage access
  • DEV_DOMAIN: custom dev domain
CLI Options
--verbose shows subprocess output. --skip-deploy skips deployment. --skip-teardown skips teardown. --no-cleanup preserves test directories.
Private vs Production
Private: PRIVATE_PLATFORM=my-fling prefixes everything, uses workers.dev URLs. Production: fixed names, custom domain. Tests always refuse --prod.
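The --prod refusal amounts to a guard that runs before any Cloudflare call. The sketch below assumes a plain argv array. assertPrivateTarget and its error messages are hypothetical.

```typescript
// Hypothetical guard: rejects --prod and returns the private resource prefix.
function assertPrivateTarget(argv: string[], env: Record<string, string | undefined>): string {
  if (argv.includes("--prod")) {
    throw new Error("integration tests refuse --prod; use a private deployment");
  }
  const prefix = (env.PRIVATE_PLATFORM ?? "").trim();
  if (prefix === "") {
    throw new Error("PRIVATE_PLATFORM must be set for integration tests");
  }
  return prefix;
}

const prefix = assertPrivateTarget(["integration-test", "3"], { PRIVATE_PLATFORM: "my-fling" });
console.log(`${prefix}-api`); // my-fling-api
```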
Local Test
The local-test command runs the full dev server test suite without any Cloudflare credentials. It reuses the same shared worker tests and local-only tests that the smoke test's dev-server step runs, but in a standalone flow that only needs Node.js and npm.
No cloud credentials needed. The local test scaffolds a temporary project, starts fling dev, and exercises the entire local stack: health checks, secrets, CRUD, storage, presigned URLs, WASM, static assets, frontend HTML, Vite proxy, cron (list/trigger/history/failures), and email triggers.
01 Build CLI: npm run build
02 Scaffold: init + fixtures + install
03 Dev Server: fling dev + all tests
04 Cleanup: remove temp dirs
Run It
npm run cli-infra -- local-test
Add --verbose for detailed output.
No Credentials
No CLOUDFLARE_ACCOUNT_ID, ADMIN_KEY, or PRIVATE_PLATFORM required. Just Node.js and npm.
Custom Ports
--be-port 4000 and --fe-port 5000 override the default backend (7654) and frontend (8765) ports.
Cleanup behavior. On success, both project and config temp directories are deleted. On failure, the project directory is preserved for debugging while the config directory is cleaned up. Use --no-cleanup to always preserve both directories.
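These cleanup rules reduce to a small decision table. The sketch below encodes them. CleanupPlan and cleanupPlan are hypothetical names.

```typescript
interface CleanupPlan {
  removeProjectDir: boolean;
  removeConfigDir: boolean;
}

// Hypothetical helper encoding the rules above:
//   --no-cleanup  -> keep both directories
//   success       -> remove both
//   failure       -> keep the project dir for debugging, remove the config dir
function cleanupPlan(passed: boolean, noCleanup: boolean): CleanupPlan {
  if (noCleanup) {
    return { removeProjectDir: false, removeConfigDir: false };
  }
  return { removeProjectDir: passed, removeConfigDir: true };
}

console.log(cleanupPlan(false, false)); // project dir kept, config dir removed
```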
Workflow Testing
The workflow system has a layered test suite: unit tests for the runtime integration, stress tests for concurrency, and a comprehensive engine test suite in the vendored flingflow package. All workflow tests run in-memory using MemoryEventStore for fast, isolated execution — no external dependencies required.
Three test layers. The workflow runtime unit tests (28 tests) verify the Fling-side integration. The stress tests (6 tests) verify concurrency and throughput. The flingflow engine tests (82 tests) cover the core event-sourced engine, stores, recovery, context building, and deterministic simulation. Together they ensure workflows are correct under both normal and high-load conditions.
Unit Tests
Workflow Runtime — 28 tests
Located in src/workflow/__tests__/runtime.test.ts. Uses flingflow's MemoryEventStore for fast, isolated testing.
  • Workflow registration and discovery
  • Start workflow and retrieve results
  • Scratchpad read/write persistence
  • Duplicate workflow deduplication
  • NonRetryableError handling
  • Max attempts and retry limits
  • Get and list workflow queries
  • Engine-not-initialized guard checks
  • Metadata extraction from events
Stress Tests
Concurrency & Throughput — 6 tests
Located in src/workflow/__tests__/stress.test.ts. Verifies correctness under concurrent load.
  • High-volume concurrent workflows (60+)
  • Dedup correctness under concurrency
  • Failure and retry under load
  • Scratchpad integrity across concurrent workflows
  • Mixed workflow types running simultaneously
  • Throughput measurement (~9000 workflows/sec)
flingflow Engine Tests
Core Engine — 82 tests across 7 suites
The vendored packages/flingflow/ package has its own comprehensive test suite covering the event-sourced workflow engine, event stores, recovery, and context building.
  • Engine Tests (test/engine.test.ts): Core engine lifecycle: register, start, complete, fail, retry, dedup.
  • Store Conformance (test/store-conformance.ts): Shared test suite run against both MemoryEventStore and SqliteEventStore.
  • Recovery Tests (test/recovery.test.ts): Stuck workflow detection, heartbeat timeout, recovery re-execution.
  • Context Tests (test/context.test.ts): Context building from event streams, state reconstruction.
  • Simulation Tests (test/simulation.test.ts): Deterministic simulation for reproducible workflow testing.
  • Store Tests (test/store-memory.test.ts, test/store-sqlite.test.ts): Store-specific edge cases beyond conformance.
No D1EventStore test file. The D1EventStore adapter (src/worker-runtime/d1-event-store.ts) is tested indirectly through the workflow runtime tests and integration tests. It follows the same EventStore interface validated by flingflow's store conformance suite.
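The idea behind a conformance suite is that one set of checks runs against every implementation of an interface. The sketch below shows the pattern with a deliberately tiny, hypothetical interface. TinyEventStore, ArrayStore, MapStore, and runConformance are illustrative names and do not reflect flingflow's actual EventStore API.

```typescript
// Hypothetical, heavily reduced store interface; flingflow's real EventStore is not shown here.
interface TinyEventStore {
  append(runId: string, event: string): void;
  read(runId: string): string[];
}

// First implementation: one flat log, filtered on read.
class ArrayStore implements TinyEventStore {
  private log: Array<{ runId: string; event: string }> = [];
  append(runId: string, event: string): void {
    this.log.push({ runId, event });
  }
  read(runId: string): string[] {
    return this.log.filter((e) => e.runId === runId).map((e) => e.event);
  }
}

// Second implementation: events grouped per run.
class MapStore implements TinyEventStore {
  private byRun = new Map<string, string[]>();
  append(runId: string, event: string): void {
    const events = this.byRun.get(runId) ?? [];
    events.push(event);
    this.byRun.set(runId, events);
  }
  read(runId: string): string[] {
    return [...(this.byRun.get(runId) ?? [])];
  }
}

// The shared suite: identical checks for any implementation. Returns failure messages.
function runConformance(name: string, makeStore: () => TinyEventStore): string[] {
  const failures: string[] = [];
  const check = (label: string, ok: boolean): void => {
    if (!ok) failures.push(`${name}: ${label}`);
  };

  const store = makeStore();
  check("unknown run is empty", store.read("r1").length === 0);
  store.append("r1", "started");
  store.append("r2", "started");
  store.append("r1", "completed");
  check("events keep append order", store.read("r1").join(",") === "started,completed");
  check("runs are isolated", store.read("r2").join(",") === "started");
  return failures;
}

console.log(runConformance("ArrayStore", () => new ArrayStore()));
console.log(runConformance("MapStore", () => new MapStore()));
```

Any new adapter that implements the same interface can be passed to the same function, which is the sense in which an adapter is covered by an interface-level suite.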
Running Workflow Tests
Commands
Workflow tests can be run independently or as part of the full test suite.
01 Runtime Tests
npx vitest run src/workflow/__tests__/
Runs all 28 unit tests + 6 stress tests.
02 flingflow Engine
cd packages/flingflow && npm test
Runs all 82 engine tests.
03 flingflow Stress CLI
cd packages/flingflow && npm run stress
Runs the flingflow stress test CLI with throughput benchmarks.
04 Full Suite
npm run test:run
Runs all tests, including workflow tests, as part of the pre-commit check.
Integration
Smoke Test: Workflow Execution
The same testWorkflow() function runs against both the local dev server (Step 7) and the deployed worker (Step 27), verifying end-to-end workflow execution in both environments.

The smoke test workflow has three step types: start (doubles the input), sleep (sleeps 10s per invocation, 60s total across 6 iterations), and persist (writes the result to D1). The 60s total sleep exceeds the 50s queue consumer execution budget, which forces a re-enqueue on the deployed worker and proves that the queue continuation path works on real infrastructure.
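The timing argument can be checked with a few lines of arithmetic. The sketch below uses a simplified model that is an assumption, not the platform's scheduler: steps run back to back inside one consumer invocation, and the run is re-enqueued as soon as the next step would exceed the budget. invocationsNeeded is a hypothetical name.

```typescript
// Simplified model (an assumption): a consumer invocation runs steps back to back
// and re-enqueues as soon as the next step would push it past its time budget.
function invocationsNeeded(stepSeconds: number[], budgetSeconds: number): number {
  let invocations = 1;
  let elapsed = 0;
  for (const seconds of stepSeconds) {
    if (seconds > budgetSeconds) {
      throw new RangeError("a single step exceeds the budget");
    }
    if (elapsed + seconds > budgetSeconds) {
      invocations += 1; // re-enqueue: continue in a fresh invocation
      elapsed = 0;
    }
    elapsed += seconds;
  }
  return invocations;
}

const sleeps: number[] = Array.from({ length: 6 }, () => 10); // six 10s sleeps
const total = sleeps.reduce((sum, s) => sum + s, 0);
console.log(total); // 60
console.log(invocationsNeeded(sleeps, 50)); // 2
```

Under this model the sixth sleep cannot fit in the first 50s invocation, so at least one re-enqueue occurs. With a 60s budget the same workload would finish in a single invocation and the continuation path would go untested.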

The test verifies: workflow creation, step execution, scratchpad data surviving across re-enqueue, D1/SQLite persistence by run_id, deduplication, and all four CLI commands (fling workflow list, show, show -v, start).