Task Metadata Schema

Every Toto task has an optional metadata JSON field (10KB cap). Five fields within this object power Toto's semantic reconciliation -- the system that automatically matches git commits to tasks. Without metadata, matching falls back to title-only keyword overlap, which rarely produces confident results.

This document explains each field, shows good and bad examples, and gives rules for writing metadata that works.


Why Metadata Exists

A bare task like "Fix auth bug" tells the reconciliation engine almost nothing: dozens of commits could plausibly match that title. Now suppose the task's metadata says it belongs to the auth-middleware component, will touch app/api/auth_web.py and app/services/auth_service.py, involves the keywords session, cookie, and httponly, and has a scope of backend. That task can be matched with high confidence when a commit touches those exact files and uses those exact terms.

Metadata turns vague tasks into specific predictions about what code will change. The reconciliation engine compares those predictions against actual commits to find matches.


The 5 Fields

Field      Type           Description
component  string         Logical subsystem name. Lowercase, hyphenated.
files      string[]       File paths expected to change (1-15 paths), relative to repo root.
keywords   string[]       4-8 specific technical terms beyond the task title.
scope      string (enum)  One of: backend, frontend, schema, infra, test, docs, design, ux, ui, sync, mcp, desktop.
intent     string         One sentence: the observable outcome when this task is done. Must be falsifiable.
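The field table can be turned into a quick shape check before a task is saved. The following is a minimal sketch, not Toto's actual validator: the function name check_metadata_shape is hypothetical, and it assumes all five fields are present, which the schema itself does not require.

```python
import re

# Allowed scope values, copied from the field table.
VALID_SCOPES = {
    "backend", "frontend", "schema", "infra", "test", "docs",
    "design", "ux", "ui", "sync", "mcp", "desktop",
}

# Lowercase words joined by single hyphens, e.g. "auth-middleware".
COMPONENT_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def check_metadata_shape(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks fine.

    Hypothetical helper: it checks types and counts only, not quality.
    """
    problems = []

    component = meta.get("component")
    if not isinstance(component, str) or not COMPONENT_PATTERN.fullmatch(component):
        problems.append("component should be a lowercase, hyphenated string")

    files = meta.get("files")
    if not (isinstance(files, list) and 1 <= len(files) <= 15
            and all(isinstance(f, str) for f in files)):
        problems.append("files should be a list of 1-15 path strings")

    keywords = meta.get("keywords")
    if not (isinstance(keywords, list) and 4 <= len(keywords) <= 8
            and all(isinstance(k, str) for k in keywords)):
        problems.append("keywords should be a list of 4-8 strings")

    scope = meta.get("scope")
    if not isinstance(scope, str) or scope not in VALID_SCOPES:
        problems.append("scope should be one of the documented enum values")

    intent = meta.get("intent")
    if not isinstance(intent, str) or not intent.strip():
        problems.append("intent should be a non-empty sentence")

    return problems
```

A shape check like this only catches structural mistakes. Metadata can pass it and still be useless for matching, so it complements the quality rules rather than replacing them.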

component

The logical subsystem this task belongs to. Think of it as the directory-level grouping of your codebase. Use lowercase, hyphenated names like auth-middleware, animation-engine, todo-service, sync-client.

The component name is matched against file paths and commit messages. A task with component auth-middleware gets a signal boost when the commit touches app/api/auth_web.py or mentions "auth" in the message.
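The exact matching logic is not spelled out in this document. As a rough illustration only, the sketch below assumes the engine splits the component name on hyphens and looks for any resulting token in the changed paths or the commit message. The function name component_signal is hypothetical.

```python
def component_signal(component: str, changed_paths: list[str], message: str) -> bool:
    """Return True if any hyphen-separated token of the component name
    appears in a changed path or in the commit message.

    Assumption: plain case-insensitive substring matching. The real
    engine may tokenize or weight matches differently.
    """
    tokens = [t for t in component.lower().split("-") if t]
    haystack = " ".join(changed_paths + [message]).lower()
    return any(token in haystack for token in tokens)


# "auth" from auth-middleware appears in the changed path.
print(component_signal("auth-middleware", ["app/api/auth_web.py"], "tweak cookie flags"))
```

Under this assumption, very short tokens such as ui would match almost any commit, which is one reason a descriptive component name is more useful than a terse one.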

files

File paths you expect this task to change, relative to the repo root. List actual files, not directories. 1-15 paths.

This is the strongest signal for reconciliation. When a commit's git diff --name-only overlaps with the predicted files, it's a strong indicator of a match. Guess the likely path if the file doesn't exist yet -- the path structure is the signal, not the file's existence.
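To make the overlap idea concrete, here is a minimal sketch that compares predicted files with the text printed by git diff --name-only. The function name file_overlap and the choice of "fraction of predicted files that changed" as the measure are assumptions for illustration, not Toto's documented scoring.

```python
def file_overlap(predicted: list[str], name_only_output: str) -> float:
    """Fraction of predicted files that appear in a commit's changed files.

    name_only_output is the raw text printed by `git diff --name-only`,
    one repo-relative path per line.
    """
    changed = {line.strip() for line in name_only_output.splitlines() if line.strip()}
    if not predicted:
        return 0.0
    hits = sum(1 for path in predicted if path in changed)
    return hits / len(predicted)


diff_output = "app/api/auth_web.py\napp/services/auth_service.py\nREADME.md\n"
predicted = [
    "app/api/auth_web.py",
    "app/services/auth_service.py",
    "app/models/user.py",
]
print(file_overlap(predicted, diff_output))  # two of the three predicted files changed
```

Because the comparison is an exact string match on paths, a directory such as app/ would never overlap with anything, which is why the field asks for actual file paths.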

keywords

4-8 technical terms that will appear in code or commit messages related to this task. Lowercase. Think library names, API names, protocol terms, data structures -- not generic verbs.

Good keywords: oauth, authlib, session, openid, redirect
Bad keywords: fix, update, code, change, improve

The test: would a commit on an unrelated task also contain this keyword? If yes, it's too generic.

scope

The layer of the stack this task operates in. This acts as a sanity check -- a backend task that only touches .js files in a commit gets a reduced confidence score.

Valid values: backend, frontend, schema, infra, test, docs, design, ux, ui, sync, mcp, desktop.
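The sanity check described above could look roughly like the following sketch. The suffix table is an assumption made up for illustration (this document does not say which file types each scope expects), and scope_is_plausible is a hypothetical name.

```python
from pathlib import PurePosixPath

# Assumed mapping from scope to typical file suffixes. Only a few scopes
# are listed; scopes without an entry are never penalized in this sketch.
EXPECTED_SUFFIXES = {
    "backend": {".py"},
    "frontend": {".js", ".css", ".html"},
    "docs": {".md"},
}


def scope_is_plausible(scope: str, changed_paths: list[str]) -> bool:
    """Return False when a commit touches none of the file types the
    scope would normally involve, True otherwise."""
    expected = EXPECTED_SUFFIXES.get(scope)
    if expected is None:
        return True  # no expectation recorded for this scope
    return any(PurePosixPath(p).suffix in expected for p in changed_paths)


# A backend task matched against a commit that only touches .js files.
print(scope_is_plausible("backend", ["app/ui/static/animations/registry.js"]))
```

A result of False would lower confidence in the match rather than rule it out, consistent with scope acting as a sanity check and not as a hard filter.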

intent

One sentence describing the observable outcome when this task is done. Must be falsifiable -- someone should be able to look at the codebase and say "yes, this is true" or "no, this is not true."

Good: "Users can log in via Google OAuth and receive a persistent session"
Bad: "Make auth work better"


Examples

Good: Backend Feature

{
  "component": "auth-middleware",
  "files": [
    "app/api/auth_web.py",
    "app/services/auth_service.py",
    "app/models/user.py"
  ],
  "keywords": ["oauth", "google", "authlib", "session", "redirect", "openid"],
  "scope": "backend",
  "intent": "Users can log in via Google OAuth and receive a persistent session"
}

Why this works: specific component name that maps to real paths, exact file predictions, technical keywords that only appear in auth-related code, clear scope, and a falsifiable intent.

Good: Frontend Animation

{
  "component": "animation-engine",
  "files": [
    "app/ui/static/animations/registry.js",
    "app/ui/static/animations/toto-fx.js"
  ],
  "keywords": ["spring", "interpolation", "raf", "opacity", "easing"],
  "scope": "frontend",
  "intent": "Card transitions use spring physics instead of CSS easing"
}

Why this works: file paths are specific .js files in the animations directory, keywords are animation-specific terms unlikely to appear in unrelated commits, scope matches the files.

Good: Infrastructure / Schema

{
  "component": "database-migration",
  "files": [
    "app/models/todo.py",
    "alembic/versions/003_add_metadata.py"
  ],
  "keywords": ["alembic", "migration", "jsonb", "metadata", "column", "constraint"],
  "scope": "schema",
  "intent": "The metadata JSONB column exists on list_items with a 10KB check constraint"
}

Why this works: migration file path follows the Alembic naming convention, keywords include Alembic-specific terms, scope is schema which matches the file types.

Bad: Do Not Do This

{
  "component": "stuff",
  "files": ["app/"],
  "keywords": ["fix", "update", "code", "change"],
  "scope": "backend",
  "intent": "Make it work better"
}

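Problems: the component stuff names no real subsystem, files lists a directory instead of file paths, every keyword is generic enough to appear in unrelated commits, and the intent cannot be verified against the codebase.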


Rules

1. Keywords Must Be Specific

Test each keyword: would a commit on a completely unrelated task also contain this word? If yes, replace it.

Worst offenders: code, function, class, module, system, service, data, feature, logic, config, fix, update, add, change.
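The worst-offenders list lends itself to a simple lint. This is a small sketch with a hypothetical function name, generic_keywords. It only catches the words listed above, so the "unrelated commit" test still needs human judgment.

```python
# The worst offenders named in this rule.
GENERIC_KEYWORDS = {
    "code", "function", "class", "module", "system", "service", "data",
    "feature", "logic", "config", "fix", "update", "add", "change",
}


def generic_keywords(keywords: list[str]) -> list[str]:
    """Return the keywords that are on the generic list, in input order."""
    return [k for k in keywords if k.lower() in GENERIC_KEYWORDS]


print(generic_keywords(["fix", "update", "code", "change"]))
print(generic_keywords(["oauth", "authlib", "session", "openid"]))
```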

2. Files Must Be Actual File Paths

Not directories. Not glob patterns. Predict the specific files that will change. If the file doesn't exist yet, guess the path based on the project's naming conventions.

Wrong: app/api/
Right: app/api/auth_web.py

Wrong: *.test.py
Right: tests/test_auth_flow.py
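The two wrong cases above can be caught mechanically. The sketch below uses a hypothetical helper, path_problem, and assumes that a trailing slash marks a directory and that the characters *, ? and [ mark a glob. A directory written without a trailing slash would slip through.

```python
from typing import Optional

GLOB_CHARS = set("*?[")


def path_problem(path: str) -> Optional[str]:
    """Return a short description of what is wrong with a predicted path,
    or None if it looks like a plain repo-relative file path."""
    if path.startswith("/"):
        return "should be relative to the repo root"
    if path.endswith("/"):
        return "looks like a directory"
    if any(ch in GLOB_CHARS for ch in path):
        return "looks like a glob pattern"
    return None


for candidate in ["app/api/", "*.test.py", "tests/test_auth_flow.py"]:
    print(candidate, "->", path_problem(candidate))
```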

3. Intent Must Be Falsifiable

Write it as a claim that can be verified by inspecting the codebase. A good intent reads like a test assertion in plain English.

Wrong: "Improve the auth system"
Right: "Users can log in via Google OAuth and receive a persistent session"

Wrong: "Clean up the code"
Right: "All auth routes use the AuthIdentity dependency instead of raw device tokens"


Auto-Generation

You don't have to write metadata manually.


Reserved Keys

Keys prefixed with toto_ in the metadata object are reserved for system use. Do not use the toto_ prefix for custom metadata fields.
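A client that adds custom fields can guard against the reserved prefix with a check like this sketch. The function name reserved_keys and the example key toto_score are hypothetical, and the check covers top-level keys only.

```python
RESERVED_PREFIX = "toto_"


def reserved_keys(meta: dict) -> list[str]:
    """Return the top-level keys that use the reserved prefix, sorted."""
    return sorted(k for k in meta if k.startswith(RESERVED_PREFIX))


# "toto_score" is a made-up key used only to show a collision.
print(reserved_keys({"component": "sync-client", "toto_score": 1, "reviewer": "sam"}))
```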


Size Limit

The metadata JSON field has a 10KB cap enforced at the API level. In practice, well-structured metadata is typically 200-500 bytes.
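To see how far a given object is from the cap, it can be measured before sending. This sketch assumes that "10KB" means 10,240 bytes and that size is counted on the compact UTF-8 serialization. Neither detail is confirmed here, and metadata_size is a hypothetical name.

```python
import json

MAX_METADATA_BYTES = 10 * 1024  # assumption: 10KB = 10,240 bytes


def metadata_size(meta: dict) -> int:
    """Size in bytes of the compact UTF-8 JSON encoding of meta."""
    return len(json.dumps(meta, separators=(",", ":")).encode("utf-8"))


example = {
    "component": "auth-middleware",
    "files": [
        "app/api/auth_web.py",
        "app/services/auth_service.py",
        "app/models/user.py",
    ],
    "keywords": ["oauth", "google", "authlib", "session", "redirect", "openid"],
    "scope": "backend",
    "intent": "Users can log in via Google OAuth and receive a persistent session",
}
print(metadata_size(example), "bytes; within cap:", metadata_size(example) <= MAX_METADATA_BYTES)
```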


Further Reading