---
title: "WWT AI Technology Center — Cubie Deployment Playbook"
subtitle: "30-day pilot from infrastructure provisioning to ROI report"
date: "2026-05-30"
---

# WWT ATC Deployment Playbook

**Target:** WWT AI Technology Center.
**Cluster:** 8-16x NVIDIA H100 (or H200, or AMD MI300X — Cubie is architecture-agnostic).
**Duration:** 30 days.
**Outcome:** Customer-deliverable ROI report + WWT case-study draft + go/no-go decision for M3 friendly-customer engagement.

---

## Day 0 — Provisioning (T-3 days)

**WWT side:**

| Item | Spec | Owner |
|---|---|---|
| H100 cluster slot | 8x H100 (80GB) minimum; 16x preferred | ATC infrastructure |
| Inference-gateway VM | Ubuntu 22.04 LTS, 16 vCPU, 64 GB RAM, 500 GB SSD | ATC platform |
| Observability stack | Prometheus + Grafana + Loki (existing ATC) | ATC platform |
| Networking | 25/100 Gbps inbound to gateway; standard ATC routing | ATC network |
| Service account | Read access to DCGM, NVML, and gateway logs (on AMD MI300X, the ROCm SMI equivalents) | ATC platform |

**Centillion side:**

| Item | Spec | Owner |
|---|---|---|
| Cubie sidecar container | `centillion/cubie-admit-gate:v1.2.0` (pulled to ATC registry) | Centillion eng |
| Cubie control-plane container | `centillion/cubie-control:v1.2.0` | Centillion eng |
| Dual-ledger PostgreSQL + OpenBao | Standard HA pair | Centillion eng |
| Telemetry exporter | `centillion/cubie-projector:v1.2.0` (Prometheus format) | Centillion eng |
| Synthetic-traffic generator | `centillion/cubie-fuzzer:v1.2.0` (clean + adversarial + retry-storm mixes) | Centillion eng |
| Deploy script | `wwt/integrations/atc-deploy.sh` | Centillion eng |

---

## Day 1 — Deploy + observe (T+1 day)

### Deploy sequence (executes in ~30 min)

```bash
# 1. Pull images to ATC registry
docker pull centillion/cubie-admit-gate:v1.2.0
docker pull centillion/cubie-control:v1.2.0
docker pull centillion/cubie-projector:v1.2.0

# 2. Provision dual-ledger
docker compose -f wwt/integrations/atc-stack.yml up -d \
    cubie-ledger-postgres cubie-ledger-openbao

# 3. Start Cubie control plane (observe-only mode)
docker run -d --name cubie-control \
    -e CUBIE_MODE=observe \
    -e CUBIE_LEDGER_PG=postgres://... \
    -e CUBIE_LEDGER_VAULT=https://... \
    -p 8910:8910 \
    centillion/cubie-control:v1.2.0

# 4. Deploy admit-gate sidecar next to inference gateway
docker run -d --name cubie-admit-gate \
    --network host \
    -e CUBIE_UPSTREAM=http://<inference-gateway>:8080 \
    -e CUBIE_CONTROL=http://localhost:8910 \
    -e CUBIE_SHADOW_MODE=1 \
    centillion/cubie-admit-gate:v1.2.0

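# 5. Wire Prometheus scrape config
# /-/reload re-reads the config file on disk and ignores any request body, so
# first merge wwt/integrations/atc-prom-config.yml into the ATC prometheus.yml.
# Requires Prometheus to run with --web.enable-lifecycle.
curl -X POST http://<prom-host>:9090/-/reload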

# 6. Verify decisions flowing
curl http://localhost:8910/v1/admit/recent | jq '.[:5]'
```

### Verify deploy

- ✅ Cubie health endpoint: `curl http://localhost:8910/health` returns `{"status":"ok","mode":"observe"}`
- ✅ Decision pulses: ≥1 admit decision per second logged in shadow mode
- ✅ Prometheus shows `cubie_admit_decisions_total` counter incrementing
- ✅ Dual-ledger writing: `psql -c "SELECT count(*) FROM consent_log WHERE created_at > NOW() - interval '5 minutes';"` returns non-zero
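The first two checks can also be scripted. The sketch below is illustrative only: it assumes the health payload has the shape shown in the checklist, and the helper names `health_ok` and `decisions_per_second` are hypothetical, not part of any Cubie tooling.

```python
import json


def health_ok(body: str, expected_mode: str = "observe") -> bool:
    """True when the health payload reports status "ok" in the expected mode."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("status") == "ok" and payload.get("mode") == expected_mode


def decisions_per_second(count_start: float, count_end: float, elapsed_s: float) -> float:
    """Average decision rate between two samples of cubie_admit_decisions_total."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    # A counter that went backwards was reset (container restart); count from zero.
    delta = count_end - count_start if count_end >= count_start else count_end
    return delta / elapsed_s


if __name__ == "__main__":
    print(health_ok('{"status":"ok","mode":"observe"}'))
    print(decisions_per_second(1200, 1290, 60.0) >= 1.0)
```

Feed `health_ok` the body returned by the health endpoint, and feed `decisions_per_second` two counter samples taken a known interval apart; the shadow-mode check passes when the rate is at least 1.0.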

---

## Day 1-7 — Shadow-mode observation

**Goal:** Capture baseline. Cubie observes; nothing is enforced; no traffic blocked.

**Metric collection per day:**

| Metric | Source | Target capture cadence |
|---|---|---|
| `cubie_admit_decisions_total{verdict="PASS"}` | Prometheus | 15s |
| `cubie_admit_decisions_total{verdict="FAIL"}` | Prometheus | 15s |
| `cubie_admit_decisions_total{verdict="FLUID"}` | Prometheus | 15s |
| `cubie_admit_decisions_total{verdict="TAMPER"}` | Prometheus | 15s |
| `cubie_admit_latency_seconds` (histogram) | Prometheus | 15s |
| Gateway requests/sec | Existing gateway metric | 15s |
| DCGM GPU utilization | DCGM exporter | 15s |
| DCGM GPU memory used | DCGM exporter | 15s |
| DCGM GPU power draw | DCGM exporter | 15s |
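One possible way to pull the Cubie rows of this table out of Prometheus is through its HTTP instant-query API. The sketch below only builds the query URLs. The 5-minute rate window and the helper name `instant_query_url` are assumptions; the `_bucket` suffix follows the standard Prometheus naming for histogram series.

```python
from urllib.parse import urlencode

# PromQL for the shadow-mode table; the 5m window is an assumption.
VERDICT_RATE = 'sum by (verdict) (rate(cubie_admit_decisions_total[5m]))'
P99_LATENCY = (
    'histogram_quantile(0.99, '
    'sum by (le) (rate(cubie_admit_latency_seconds_bucket[5m])))'
)


def instant_query_url(prom_base: str, promql: str) -> str:
    """Build a Prometheus HTTP API instant-query URL for a PromQL expression."""
    return f"{prom_base.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # "prom-host" stands in for the ATC Prometheus host.
    print(instant_query_url("http://prom-host:9090", VERDICT_RATE))
    print(instant_query_url("http://prom-host:9090", P99_LATENCY))
```

Swapping `0.99` for `0.5` or `0.95` gives the other percentiles reported at the Day 7 checkpoint.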

**Day 7 checkpoint:**
- Decision-distribution profile (PASS, FAIL, FLUID, TAMPER) of the customer's real traffic
- Latency overhead (p50, p95, p99 of `cubie_admit_latency_seconds`)
- Confirm: ≥99.9% of decisions complete in <1 ms (target: <100 µs)
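The ≥99.9% confirmation can be read directly from the histogram's cumulative buckets. The following is a minimal sketch, assuming the histogram exposes bucket bounds at or below the thresholds of interest; the bounds and counts in the example are made up, and `fraction_within` is a hypothetical helper.

```python
def fraction_within(buckets: dict[float, float], threshold_s: float) -> float:
    """Fraction of observations at or below threshold_s.

    `buckets` maps each upper bound `le` (seconds) to its cumulative count and
    must include float("inf"), as Prometheus histograms do.
    """
    total = buckets[float("inf")]
    if total == 0:
        raise ValueError("no observations recorded")
    # Use the largest bound that does not exceed the threshold. This is a lower
    # bound on the true fraction, so the check errs on the strict side.
    eligible = [le for le in buckets if le <= threshold_s]
    if not eligible:
        return 0.0
    return buckets[max(eligible)] / total


if __name__ == "__main__":
    # Illustrative counts only, not pilot data.
    sample = {0.0001: 9950, 0.001: 9995, 0.01: 10000, float("inf"): 10000}
    print(fraction_within(sample, 0.001) >= 0.999)   # Day 7 hard criterion
    print(fraction_within(sample, 0.0001))           # share meeting the 100 µs target
```

If no bucket bound sits at or below 1 ms, the function returns 0.0 rather than interpolating, which signals that the bucket layout cannot support the check.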

---

## Day 8-14 — Active admit on 5% slice

**Goal:** Turn on enforcement for 5% of inbound traffic; measure denied-request behavior; confirm no false positives.

**Activation:**

```bash
# Update control plane to active-admit on 5% slice (header-tagged)
curl -X PATCH http://<cubie-control>:8910/v1/policy \
    -H 'Content-Type: application/json' \
    -d '{"mode":"active","slice":{"by":"header","key":"x-cubie-pilot","value":"on","pct":5}}'
```

The customer's gateway adds the `x-cubie-pilot: on` header to a deterministic 5% of requests (for example, by hashing the session ID). Cubie enforces denials ONLY on those tagged requests.
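Deterministic tagging on the gateway side could look like the sketch below. It assumes a string session ID is available per request and uses SHA-256 purely as a stable hash; the function name `in_pilot_slice` and the 0.01% bucket granularity are illustrative choices, not part of the Cubie or gateway API.

```python
import hashlib


def in_pilot_slice(session_id: str, pct: float = 5.0) -> bool:
    """Decide whether a session belongs to the pilot slice.

    The same session ID always maps to the same answer, so a session is either
    fully inside or fully outside the enforced slice.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest onto 10,000 buckets (0.01% each).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < pct * 100


def pilot_headers(session_id: str) -> dict[str, str]:
    """Headers the gateway would add for this request (empty when untagged)."""
    return {"x-cubie-pilot": "on"} if in_pilot_slice(session_id) else {}


if __name__ == "__main__":
    tagged = sum(in_pilot_slice(f"session-{i}") for i in range(20_000))
    print(f"tagged share: {tagged / 20_000:.2%}")
```

Hashing the session ID rather than sampling randomly keeps every request of a session in the same arm, which keeps the tagged-versus-untagged comparison clean.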

**Daily measurements (Day 8-14):**

| Metric | Capture |
|---|---|
| Tagged requests passed | counter, Prometheus |
| Tagged requests denied | counter, Prometheus |
| Untagged requests (control) | same metrics, for comparison |
| Inference latency, tagged vs untagged | histogram diff |
| GPU utilization, tagged vs untagged | DCGM, sliced by tag |
| GPU memory, tagged vs untagged | DCGM |
| Energy/request, tagged vs untagged | PDU + DCGM power, divided by useful-req count |
| Denial-reason histogram for denied requests | denial-cert log |

**Pass criteria (Day 14 checkpoint):**
- ✅ False-positive rate on tagged slice < 0.5% (denials that, on review, were valid requests)
- ✅ p99 latency on tagged slice within +200 µs of untagged (Cubie overhead absorbed in network noise)
- ✅ GPU utilization on tagged slice ≥ untagged (capacity returned)
- ✅ ≥1 denial class observed (proves Cubie is doing something, not just passing everything)

If pass criteria are not met → halt, investigate, escalate to Centillion engineering. Reversible to shadow mode in seconds.
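The four pass criteria can be evaluated mechanically once the week's numbers are in. This is a sketch under stated assumptions: the false-positive rate is taken over all tagged requests (the playbook does not fix the denominator), and `Day14Stats` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Day14Stats:
    tagged_total: int          # tagged requests seen on Days 8-14
    false_positives: int       # denials judged valid on manual review
    p99_tagged_us: float       # p99 latency, tagged slice (microseconds)
    p99_untagged_us: float     # p99 latency, untagged control (microseconds)
    gpu_util_tagged: float     # percent
    gpu_util_untagged: float   # percent
    denial_classes_seen: int   # distinct denial classes observed


def day14_checks(s: Day14Stats) -> dict[str, bool]:
    """Evaluate each Day 14 criterion separately so failures are attributable."""
    # With no tagged traffic there is no evidence, so the criterion fails.
    fp_ok = s.tagged_total > 0 and s.false_positives / s.tagged_total < 0.005
    return {
        "false_positive_rate_below_0.5pct": fp_ok,
        "p99_within_200us_of_untagged": s.p99_tagged_us - s.p99_untagged_us <= 200.0,
        "gpu_util_not_below_untagged": s.gpu_util_tagged >= s.gpu_util_untagged,
        "at_least_one_denial_class": s.denial_classes_seen >= 1,
    }


def day14_pass(s: Day14Stats) -> bool:
    """True only when every criterion holds; otherwise halt and investigate."""
    return all(day14_checks(s).values())
```

Keeping the per-criterion dictionary makes the escalation to Centillion engineering concrete: the failing keys name exactly which criterion was missed.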

---

## Day 15-30 — Full active-admit + ROI report

**Goal:** Full enforcement, multi-day soak, customer-facing ROI report.

**Activation:**

```bash
curl -X PATCH http://<cubie-control>:8910/v1/policy \
    -H 'Content-Type: application/json' \
    -d '{"mode":"active","slice":{"all":true}}'
```

Cubie now decides admit/deny for 100% of inbound traffic. Bypass mode remains available via `curl -X POST .../bypass` (operator-pulled, audit-logged, time-bound).

**Day 15-29 daily measurements:**

| Metric | Capture |
|---|---|
| Total requests | counter |
| Denials by class (LOCK, JAM, SHATTER, DEFLECT, DENY) | counter, sliced |
| Useful requests (passed + actually got a useful response) | counter |
| Wasted-cycle ratio (1 − useful / total) | derived |
| GPU utilization | DCGM |
| Effective GPU power (W) | PDU + DCGM |
| Energy per useful request (Wh/req) | derived |
| End-to-end latency (p50/p95/p99) | gateway |
| Cost per useful request (USD/req) | billing × usage |
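The three derived rows follow directly from the raw counters. A minimal sketch of the arithmetic, assuming average power over the window is already available from PDU + DCGM and that total cost comes from billing; the function names are illustrative.

```python
def wasted_cycle_ratio(useful_requests: int, total_requests: int) -> float:
    """1 - useful / total, as defined in the measurements table."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 1.0 - useful_requests / total_requests


def energy_per_useful_request_wh(avg_power_w: float, window_hours: float,
                                 useful_requests: int) -> float:
    """Energy over the window (W x h = Wh) divided by useful requests."""
    if useful_requests <= 0:
        raise ValueError("useful_requests must be positive")
    return avg_power_w * window_hours / useful_requests


def cost_per_useful_request_usd(total_cost_usd: float, useful_requests: int) -> float:
    """Billed cost for the window divided by useful requests."""
    if useful_requests <= 0:
        raise ValueError("useful_requests must be positive")
    return total_cost_usd / useful_requests


if __name__ == "__main__":
    # Made-up daily inputs, for illustration only.
    print(wasted_cycle_ratio(800_000, 1_000_000))
    print(energy_per_useful_request_wh(5_600.0, 24.0, 800_000))
    print(cost_per_useful_request_usd(1_920.0, 800_000))
```

Dividing by useful requests rather than total requests is what makes the wasted work visible: denied or useless requests still consume energy and cost but add nothing to the denominator.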

**Day 30 — ROI report deliverable:**

```markdown
# Cubie Pilot — ROI Report — <Customer> — 2026-MM-DD

## Headline numbers (30-day window)

| Metric | Baseline (shadow mode, Days 1-7) | With Cubie (active admit, Days 15-29) | Delta |
|---|---|---|---|
| GPU utilization (avg) | X% | Y% | +Zpp |
| Wasted-cycle ratio | X% | Y% | −Zpp |
| Energy per useful request | X Wh | Y Wh | −Z% |
| End-to-end p99 latency | X ms | Y ms | +Z ms (within noise floor) |
| Cost per useful request | $X | $Y | −Z% |
| Denial volume | n/a | X classified by class | n/a |

## Top denial classes (by volume)

[insert from denial-cert log]

## Pen-test pass status

[insert: shadow-mode + active-admit weeks both clean]

## Recommendation

[Pass → proceed to M3 friendly-customer engagement]
[Hold → identified caveats; fix in v1.2.1 patch]
```
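The Delta column mixes two conventions: percentage points (pp) for metrics that are already percentages, and relative percent for everything else. A short sketch of both, with hypothetical helper names:

```python
def delta_pp(baseline_pct: float, with_cubie_pct: float) -> float:
    """Difference in percentage points, for metrics already expressed in %."""
    return with_cubie_pct - baseline_pct


def delta_relative_pct(baseline: float, with_cubie: float) -> float:
    """Relative change in percent, for energy and cost per useful request."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative change")
    return (with_cubie - baseline) / baseline * 100.0


def format_delta(value: float, unit: str) -> str:
    """Render a signed delta with one decimal place, e.g. for the report table."""
    return f"{value:+.1f}{unit}"


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    print(format_delta(delta_pp(62.0, 71.5), "pp"))
    print(format_delta(delta_relative_pct(0.20, 0.15), "%"))
```

GPU utilization and wasted-cycle ratio use `delta_pp`; reporting them as relative percent would overstate small absolute moves.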

---

## Risks + bypass procedure

**Risks tracked throughout pilot:**

- Customer's traffic mix differs from synthetic. Handled: the Days 1-7 observation window builds the customer-specific baseline before any enforcement.
- False-positive volume too high. Handled: shadow mode reveals it before active mode, and active mode is reversible.
- Latency budget exceeded. Handled: Cubie's decision is sub-microsecond on a hot cache, and the Day 7 checkpoint verifies the measured budget (≥99.9% of decisions under 1 ms); if the budget is exceeded, the response is log, alert, and auto-pause.
- Customer ops team uncomfortable with active-admit. Handled: bypass mode is always available, operator-pulled and audit-logged.

**Bypass procedure (operator-callable, 30 sec to invoke):**

```bash
# Pause all enforcement; observe mode resumes
curl -X POST http://<cubie-control>:8910/v1/bypass \
    -H 'Content-Type: application/json' \
    -d '{"reason":"operator-paused","duration":"30m","operator":"<op-name>"}'

# Resume enforcement
curl -X POST http://<cubie-control>:8910/v1/resume
```

Both operations write to the dual ledger. Customer's audit team can replay the full timeline.
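For teams that prefer to wire the bypass into an existing runbook tool rather than call `curl` by hand, the same request can be built in Python. This sketch only constructs the request and does not send it. The payload fields mirror the `curl` example above; the JSON content type and the helper name `build_bypass_request` are assumptions.

```python
import json
from urllib.request import Request


def build_bypass_request(control_base: str, operator: str,
                         reason: str = "operator-paused",
                         duration: str = "30m") -> Request:
    """Build (but do not send) the time-bound bypass request."""
    body = json.dumps(
        {"reason": reason, "duration": duration, "operator": operator}
    ).encode("utf-8")
    return Request(
        f"{control_base.rstrip('/')}/v1/bypass",
        data=body,
        # Assumption: the control plane expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "cubie-control" stands in for the control-plane host.
    req = build_bypass_request("http://cubie-control:8910", operator="<op-name>")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the result to `urllib.request.urlopen(req, timeout=5)` would send it; an explicit timeout keeps the invocation inside the 30-second budget if the control plane is unreachable.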

---

## After pilot success (M2 → M3 transition)

| Deliverable | Owner |
|---|---|
| ATC case-study draft | WWT marketing + Centillion engineering |
| WWT internal demo | Joint |
| Friendly-customer identification (Texas A&M class) | WWT channel |
| MSA + NDA template | Legal |
| Customer-specific shadow-mode protocol | Centillion (instantiate from this playbook) |
| Pen-test engagement signed | Centillion |
