Attendly · AI Harness Onboarding
Attendly

AI Harness

Developer Onboarding · 10 min

Why this exists

Three problems that drove us to build the harness.

Inconsistent specs

Specs lived in Linear, Slack, Notion, and people's heads – scattered across too many places.

Silent regressions

Features broke when adjacent code was touched. No one was looking.

Ad-hoc review

Code review happened when someone had time. Not every change was audited.

Three-Phase Flow

Verifiable end to end. Same playbook for every change.

PHASE 1

Plan

Source → proposal → validate → grill-me → alignment

PHASE 2

Implement

Ralph Loop per task · checkpoint commits

PHASE 3

Ship

Review + auto-fix + contract + changelog + deploy

Phase 1 ยท Plan

Source → proposal → 3 parallel validators → composite score

Source of Intent

  • Linear ticket
  • Video transcript
  • Verbal prompt
  • Slack thread
pinned in proposal.md

openspec/changes/<id>/
├─ proposal.md
├─ design.md
├─ tasks.md
└─ specs/

Requirement

  • completeness
  • traceability
  • clarity
score 0 – 10

NFR

  • coverage
  • baseline
  • testability
score 0 – 10

Impact

  • coverage
  • dep. depth
  • risk ID
score 0 – 10

Composite score

(requirement + nfr + impact) / 3
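The composite is a plain average of the three validator scores. A minimal TypeScript sketch (the function name and rounding are illustrative, not the harness's actual code):

```typescript
// Illustrative sketch of the Phase 1 composite score.
// Validator names come from the doc; the function itself is hypothetical.
interface ValidatorScores {
  requirement: number; // 0 – 10
  nfr: number;         // 0 – 10
  impact: number;      // 0 – 10
}

function compositeScore(s: ValidatorScores): number {
  // Plain average of the three parallel validators, rounded to one decimal.
  const avg = (s.requirement + s.nfr + s.impact) / 3;
  return Math.round(avg * 10) / 10;
}
```

For example, `compositeScore({ requirement: 8, nfr: 7, impact: 9 })` yields 8.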

The proposal reads the raw source directly – no pre-refinement step. A separate refinement pass before authoring introduces anchoring bias that degrades the alignment check later.

Grill-me · Pre-mortem · Re-validation

Mandatory in Attendly. Exception scenarios are explicit, not implicit.

1. Grill-me

MANDATORY in Attendly
  • Actor & access
  • Happy path
  • Data model
  • Edge cases
  • Error handling
  • Multi-tenant
  • Existing behavior

2. Pre-mortem

MANDATORY – Critical/High
  • data
  • load
  • partial failure
  • rollout
  • external
  • security
  • adjacent

3. Re-validation

FINAL composite score
  • 3 validators re-run
  • Pre-grill vs post-grill
  • Delta shown to user
  • Both kept in history
  • Approval uses FINAL
  • Critical pre-mortem risks → task or acceptance criterion
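The pre-grill vs post-grill delta shown to the user can be sketched like this (hypothetical helper; field names follow the three validators described above):

```typescript
// Sketch: the score delta surfaced after re-validation.
// Both runs are kept in history; approval always uses the FINAL score.
interface RunScores { requirement: number; nfr: number; impact: number }

function scoreDelta(preGrill: RunScores, postGrill: RunScores) {
  const avg = (r: RunScores) => (r.requirement + r.nfr + r.impact) / 3;
  const pre = avg(preGrill);
  const final = avg(postGrill);
  // A positive delta means grill-me and the pre-mortem improved the spec.
  return { pre, final, delta: final - pre };
}
```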

Phase 1 ยท Alignment Check

The last gate of Phase 1. Clean-context comparison on Sonnet.

Raw source

ticket · video · prompt

Unchanged since the proposal was written.

alignment-validator · clean context · model: sonnet

  • Covered – source → spec mapping
  • Gaps – source requirements missing in the spec
  • Scope creep – spec beyond source (classified)
  • Intent mismatch – spec behavior disagrees with the source

🚨 Conditional scope modifier sweep

Dedicated pass for stray-sentence qualifiers – the #1 source of post-deploy regressions. Modifier gaps default to Critical.

tenant · segment · time window · feature flag · geography · tier · device · integration · data state · ordering

Example: "only for Sant'Anna, high school students only" โ€” buried in paragraph 3 of a long ticket. Caught before production.

Verdict: Aligned → proceed · Misaligned → grill-me / edit spec, re-run alignment.

Phase 2 ยท Implement

Ralph Loop per task · checkpoint commits · 3-attempt convergence rule.

🔴 Red

Write failing test

🟢 Green

Make it pass

🔵 Refactor

Improve, stay green

Checkpoint commits

  • task 1 ✓
  • task 2 ✓
  • task 3 ✓
  • …
feat(advisor-ranking): normalize weights before scoring
Co-Authored-By: Claude <agent> · type(scope): description
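The subject-line convention can be checked mechanically. This sketch assumes a Conventional-Commits-style format; the accepted types are a guess, not the harness's enforced list:

```typescript
// Sketch: validate the checkpoint-commit convention "type(scope): description".
// The regex and the set of accepted types are assumptions.
const COMMIT_RE = /^(feat|fix|refactor|test|chore|docs)\([a-z0-9-]+\): .+/;

function isCheckpointCommit(subject: string): boolean {
  return COMMIT_RE.test(subject);
}
```

The example subject above, `feat(advisor-ranking): normalize weights before scoring`, passes this check.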

Phase 3a + 3b ยท Review & Auto-Fix

Parallel sub-agent review. Critical/High auto-fixed. Drill on design decisions.

code-reviewer
  • architecture
  • data access
  • security
  • performance
  • testing
  • code quality
regression-analysis
  • contract violations
  • blast radius
  • exception paths
  • perf degradation

Auto-Fix – Exception Paths

  • Critical / Major (review) + Critical / High (regression) → attempt fix autonomously
  • Fix needs a design decision (hard-error vs warn, A vs B) → STOP + drill user via AskUserQuestion
  • 3-attempt convergence rule → if the fix doesn't converge, revert and report
  • Fix breaks pre-existing tests → revert (never degrade test integrity)
  • Re-score after fix · both runs kept in daily score JSON history

Phase 3c ยท Contract Check

Binary verdict: PASS or FAIL. Breaking changes require explicit mitigation.

GraphQL Schema

  • field removal
  • required arg add
  • type rename
  • nullable → non-null

Prisma / DB

  • DROP COLUMN
  • NOT NULL without default
  • column rename
  • narrowing type

TypeScript Public Types

  • removed symbol
  • field removal
  • stricter generic
  • narrower return

REST API when applicable

  • route removal
  • required param add
  • response schema change
  • auth change
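As an illustration of the binary verdict, this sketch flags two of the GraphQL cases above by comparing SDL type strings. The function is hypothetical; the real contract check inspects full schemas, not single fields:

```typescript
// Illustrative sketch of a per-field contract verdict for GraphQL SDL types.
// Only covers field removal and nullable → non-null; the doc lists more cases.
function fieldVerdict(oldType: string, newType: string): "PASS" | "FAIL" {
  const removed = newType === "";                 // field dropped from the schema
  const becameNonNull =
    !oldType.endsWith("!") && newType.endsWith("!"); // nullable → non-null
  return removed || becameNonNull ? "FAIL" : "PASS";
}
```

For instance, changing a field from `String` to `String!` would FAIL, while leaving it unchanged would PASS.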

Phase 3d + 3e · Changelog & Deploy

Deliver the narrative. Verify ready to ship.

3d · Changelog

Auto-detects ONE delivery target:

· Linear ticket

stakeholder-friendly comment

· Open PR

technical description update

· CHANGELOG.md

Keep-a-Changelog format

3e · Deploy Checklist

Conditional – only on Critical/High or Contract FAIL

  • Migrations tested
  • Rollback plan
  • Feature flags
  • Secrets in vault
  • Runbook updated
  • Observability
  • Dependencies
  • Stakeholders

OpenSpec Structure

One source of truth for architecture, specs, and change history.

openspec/
├─ project.md – project overview, stack, conventions
├─ nfr.md – non-functional requirements baseline
├─ adr/ – architecture decision records (MADR)
├─ specs/ – stable capability descriptions
├─ changes/ – active + archived proposals
└─ scores/ – per-change evaluation history (JSON)
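One plausible shape for an entry under scores/ – the field names are assumptions inferred from the validators and phases described earlier, not the actual schema:

```typescript
// Hypothetical shape of a per-change score entry kept in openspec/scores/.
interface ScoreEntry {
  changeId: string;           // matches openspec/changes/<id>/
  phase: "pre-grill" | "post-grill" | "post-fix";
  requirement: number;        // 0 – 10
  nfr: number;                // 0 – 10
  impact: number;             // 0 – 10
  composite: number;          // (requirement + nfr + impact) / 3
  recordedAt: string;         // ISO timestamp
}

// Example entry (values illustrative).
const example: ScoreEntry = {
  changeId: "advisor-ranking",
  phase: "pre-grill",
  requirement: 7,
  nfr: 6,
  impact: 8,
  composite: 7,
  recordedAt: "2025-01-01T00:00:00Z",
};
```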

Six Sub-Agents

Each in an isolated context. Four in Phase 1, two in Phase 3.

requirement-validator
Phase 1
completeness · traceability · criteria_clarity
nfr-validator
Phase 1
coverage · baseline_alignment · testability
impact-analyzer
Phase 1
coverage · dependency_depth · risk_identification
alignment-validator
Phase 1 · Sonnet
covered · gaps · scope creep (classified) · intent mismatch · modifier sweep
code-reviewer
Phase 3
architecture · data · security · performance · testing · quality
regression-analysis
Phase 3
contract · blast radius · exception · performance

PR Workflow

Every change ships via pull request. No direct pushes.

Create Branch

type/ATT-XXXX-description

Draft PR

when Phase 1 approved · still in progress

Ready for Review

when Phase 2 completes · tests green

Phase 3 updates PR

changelog · scores · risk matrix in body

Squash Merge

single commit · --delete-branch cleanup

Commands Cheat Sheet

Tags in parentheses show whether a command auto-runs inside the 3-phase flow.

Setup – once per project
/update-skills – install / sync skills, rules, agents (manual)
/setup-permissions – pre-authorize non-destructive commands (manual)
Diagnostic – anytime
/check-skills-version – compare project vs latest tooling (manual)
/telemetry report – aggregated view of tool / skill usage (manual)
Workflow – inside the 3-phase flow
/openspec-proposal – raw source → full proposal (auto, Phase 1)
/workflow-impact-analysis – blast radius analysis (auto, Phase 1)
/workflow-grill-me – interview + pre-mortem (auto, Phase 1)
/workflow-alignment-check – closing gate (spec vs source) (auto, Phase 1)
/workflow-contract-check – breaking-change validation (auto, Phase 3)
/workflow-changelog – final narrative delivery (auto, Phase 3)
/workflow-refine-ticket – structure a vague ticket into a pre-spec (manual, standalone)

A Typical Task

Mostly autonomous. A handful of human touchpoints.

Paste ticket link → openspec-proposal → 3 validators → grill-me answers → pre-mortem story → re-validation → alignment check → approve Phase 1 → Phase 2 → Phase 3 → approve merge

👤 = human touchpoint · unmarked steps run autonomously

The 6 touchpoints

  • Start – paste the ticket link.
  • Grill-me – answer branch-by-branch questions.
  • Pre-mortem – tell the "two weeks live, something broke" story.
  • Phase 1 approval – confirm the final composite score + alignment verdict.
  • During auto-fix (only if prompted) – resolve a design-decision drill from the AI.
  • Merge – approve the squash merge.

Everything else – proposal drafting, validation, re-validation, implementation, review, regression analysis, auto-fix, contract check, changelog posting – runs on its own.

Getting Started – First Install

Three steps, ~2 minutes. Once per machine + once per project.

1

Clone the tooling repo

git clone <attendly-ai-tooling-url>

Source of truth for skills, agents, and rules. Do this once per machine.

2

Install global commands

cd ai-tooling && ./sync-global.sh

Writes update-skills, setup-permissions, telemetry, check-skills-version into ~/.claude/ so every Claude Code session finds them.

3

Install skills in each project

cd <attendly-project> && claude → /update-skills

Detects the stack, sets up .claude/ symlinks to .ai-tooling/, initializes OpenSpec, wires up CLAUDE.md.

Staying Updated

/update-skills auto-git-pulls the tooling repo and version-gates.

/update-skills

git pull + compute hash

hash matches? → abort – up to date
hash differs? → sync + show delta → new version installed
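The version gate reduces to a hash comparison. A hypothetical sketch (function and field names are illustrative, not the actual tooling code):

```typescript
// Sketch of the /update-skills version gate: sync only when the upstream
// hash differs from the installed one.
function shouldSync(
  installedHash: string,
  latestHash: string,
): { sync: boolean; reason: string } {
  if (installedHash === latestHash) {
    return { sync: false, reason: "abort: up to date" };
  }
  return { sync: true, reason: "sync + show delta" };
}
```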

Diagnostic, read-only

/check-skills-version

See where you stand without syncing. Reports installed version + latest available + delta.

Zero wasted context – the version gate only syncs when there's something new.

Attendly
Welcome aboard.