Testing and iterating on daemons

Human readers: the fastest way to get started with daemons is to work directly with Charlie. Charlie can suggest where to start and help you add your first daemon. Use the prompts below to ask Charlie in Slack, GitHub, or Linear.

Explain what daemons are and how they work.

What are some use cases for daemons?
Which would you recommend starting with?

Open a PR to add a simple daemon to my repo.
Explain what the daemon does and why you think it's a good place to start.

Use this page when a daemon file already exists and you need to roll it out safely, observe its behavior, and tighten or widen it over time.

Start with the rollout mindset

Early testing is not about proving the daemon can do everything. It is about proving the daemon behaves correctly and quietly under narrow, intentional conditions.

Use your team’s normal activation workflow for rollout. A daemon becomes eligible for live activations after both of the following are true:

the updated DAEMON.md is merged to the repo’s default branch
Charlie has ingested that merged version

Once both are true, the daemon is live:

current watch conditions can begin matching
a current schedule can begin driving scheduled activations
the first question is not “can it do more?” but “can it behave correctly under narrow conditions?”

Start narrow. Prefer low-blast-radius outputs first. Widen only after repeated correct behavior. Use the daemon file itself to enforce narrowness. Do not rely on people remembering to be careful during rollout.

Keep the file model straight

The authored frontmatter fields are:

id
purpose
watch
routines
deny
schedule

schedule uses standard five-field cron syntax.

The markdown body can include headings like Decision policy, Communication policy, Verification and freshness, Limits, Coordination, and Ignore patterns, but those headings are recommended conventions, not frontmatter schema.

Activation-mode labels such as Watch-only, Schedule-only, and Hybrid are runtime-derived from watch and schedule. Authors do not write them as frontmatter fields.
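As an illustration of how the activation-mode labels follow from the two authored fields, here is a minimal sketch. The function names, the choice to model watch as a list of strings, and the None result for a file with neither field are assumptions made for this example; they are not Charlie's implementation. has_five_fields only counts whitespace-separated fields and does not validate the cron values themselves.

```python
from typing import Optional, Sequence


def has_five_fields(cron: str) -> bool:
    """Shape check only: a five-field cron string has exactly five fields."""
    return len(cron.split()) == 5


def activation_mode(watch: Optional[Sequence[str]],
                    schedule: Optional[str]) -> Optional[str]:
    """Derive the label from which fields are present (illustrative only)."""
    has_watch = bool(watch)
    has_schedule = bool(schedule and schedule.strip())
    if has_watch and has_schedule:
        return "Hybrid"
    if has_watch:
        return "Watch-only"
    if has_schedule:
        return "Schedule-only"
    # Assumption: the source does not name a label for this case.
    return None


if __name__ == "__main__":
    print(activation_mode(["A labeled test PR is opened"], None))
    print(activation_mode(None, "0 9 * * 1"))
    print(activation_mode(["A labeled test PR is opened"], "0 9 * * 1"))
    print(has_five_fields("0 9 * * 1"))
```

The point of the sketch is that the label is an output, not an input: changing watch or schedule is the only way to change it.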

What you can observe today

Today, there is no dedicated daemon activity or logs page in the Charlie dashboard. Observe daemon behavior on native surfaces:

GitHub
Slack
Linear

Because there is no dedicated activity page today, human rollout verification usually means inspecting the daemon’s visible work on those systems. That is different from daemon-internal verification. If the daemon file says the daemon must run checks, tests, API probes, freshness checks, or other validation before a consequential write, confirm the daemon actually performed that verification before trusting the activation.

Containment levers

Use these authoring levers to reduce blast radius before rollout.

Narrow `watch`

Make watch conditions specific enough for early rollout without making them misleading as a durable contract. Examples of narrow testing patterns:

only react to PRs opened by the person creating and testing the daemon
only react to a labeled test PR
only react to a deliberately narrow category of change
only react to a surface where the tester can inspect every result closely

Temporary overfitting is acceptable during rollout, such as limiting activation to a test label, test branch, or test author. Remove that overfitting before normal rollout unless it is part of the daemon’s real long-term role.

Narrow target boundaries

Limit where the daemon pays attention. Put hard target boundaries in watch, routines, or deny when possible; use body policy only for nuanced target-selection behavior. Examples:

only a specific directory or file family
only PRs targeting one branch or branch pattern
only one issue label or PR label category
only a sandbox repo for the first rollout

Add Ignore patterns

Skip noise early. Examples:

ignore bot-authored activity
ignore generated files
ignore surfaces or categories that would create noisy false positives during testing

Add stronger `deny`

If the daemon could take risky actions, deny them during early rollout and widen later only if needed.

Add Limits

Constrain the daemon’s output volume. Examples:

at most N items per activation
no more than N visible actions per day
stop producing new work when too much prior daemon work is still waiting on humans
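To make the arithmetic behind these example Limits concrete, the sketch below combines the three of them into one count. Limits are authored as body guidance in the daemon file, not as executable code, so this is only a model of the intended outcome. The function name and all parameter names are hypothetical.

```python
def allowed_output_count(candidates: int,
                         per_activation_cap: int,
                         daily_cap: int,
                         actions_today: int,
                         pending_human_review: int,
                         backlog_cap: int) -> int:
    """How many visible actions this activation may take under all three limits."""
    # Stop producing new work when too much prior work still waits on humans.
    if pending_human_review >= backlog_cap:
        return 0
    # No more than daily_cap visible actions per day.
    remaining_today = max(0, daily_cap - actions_today)
    # At most per_activation_cap items per activation.
    return max(0, min(candidates, per_activation_cap, remaining_today))


if __name__ == "__main__":
    # 10 candidates, cap 3 per activation, 5 per day, 4 already used today.
    print(allowed_output_count(10, 3, 5, 4, 0, 8))
```

Writing the caps as explicit numbers in the Limits section gives the tester something exact to check each activation against.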

Add Coordination

Coordination rules are a containment lever, not just a cleanup detail. Examples:

do not comment where a human review is already in progress
do not duplicate work another daemon already started
use a label or ownership convention so other daemons can filter early-rollout work

Constrain output surfaces first

Prefer outputs that only the tester or a very small audience will see first. Examples:

send Slack DMs to the daemon author first
comment only on a labeled test PR
keep early outputs on one low-blast-radius surface

Constrain who or what the daemon touches first

Keep the first target set intentionally small. Examples:

only act on PRs opened by the daemon author
only act in a sandbox repo first
only act on a test branch pattern first
only act on one issue label or PR label first

Testing by activation type

Event wakes respond to a triggering signal. Scheduled wakes survey their target set and prioritize within it. Hybrid daemons need both postures, so test them separately at first.

Watch-driven daemons

Use watch-driven testing when the daemon should react to a discrete event. Current rollout facts that matter:

event-driven wakes can come from routed GitHub, Linear, and Slack events once repo inference selects a repo with daemon inventory
Linear tests depend on a connected Linear workspace, the issue’s Linear team mapping, and routing content that clearly contains the observable trigger
Slack tests depend on a connected Slack workspace and Slack workspace mapping
watch conditions are interpreted semantically, so concrete observable phrasing is more reliable than vague wording

A good rollout pattern for a watch-driven daemon is:

make watch narrow
constrain target boundaries, Ignore patterns, and Coordination
add strong deny and Limits
create a small number of deliberate test events
inspect visible output on GitHub, Slack, or Linear
tighten or widen the file and repeat

If the daemon is noisy, first ask:

did it wake for the wrong event?
or did it wake correctly and act too broadly once awake?

Wrong wakes usually point to watch, target boundaries, or Ignore patterns. Wrong actions usually point to routines, deny, Limits, Coordination, or other body guidance.

Linear testing recipe

Use a Linear test when the daemon should respond to supported issue events.

Confirm the Linear integration is connected and the Linear team you will test in is mapped to the intended repo.
Pick one low-risk issue path to test.
Use concrete watch wording that tests a supported issue create or issue-comment create event rather than the repo/team mapping precondition, such as “A new Linear issue contains the exact phrase daemon-test-12345” or “A Linear issue comment contains the exact phrase daemon-test-12345.”
Trigger that supported low-risk event by creating the test issue or adding the unique phrase as a normal issue comment. Use a Charlie mention or assignment only when you are intentionally testing direct-invocation routing.
Inspect the daemon’s visible output on Linear and any linked GitHub or Slack surface.
Confirm it did not act on unrelated issues outside the mapped team or daemon scope.

Slack testing recipe

Use a Slack test when the daemon should respond to channel, thread, or DM activity.

Confirm the Slack integration is connected and the Slack workspace is mapped to the intended repo.
Pick one low-risk channel, thread, or DM path to test.
Use concrete watch wording, such as “A Slack thread reply is added in the named support channel.”
Trigger one event: a mention, thread reply, broader channel message, or DM. For broad message wakes, narrow matching to intentional channels in watch wording or body policy.
Inspect the daemon’s visible output in Slack and any linked Linear or GitHub surface.
Confirm it did not respond to unrelated channel traffic.

Schedule-driven daemons

Use schedule-driven testing when the daemon should wake on time and survey what needs attention inside a defined scope. Current rollout facts that matter:

invalid cron strings are rejected when daemon config is refreshed
if a schedule update is invalid and a previous valid schedule already exists, Charlie keeps the previous valid schedule until the cron value is fixed
schedule-based activations do not replay a full backlog of missed ticks
after downtime, the scheduler catches up at most one missed tick, then continues from current time
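The schedule facts above can be modeled in a few lines. This is a sketch under stated assumptions, not Charlie's scheduler: it uses a fixed interval instead of cron so it needs no third-party parser, the function names are hypothetical, and the validator is supplied by the caller. The source does not say what happens when an invalid schedule arrives and no previous valid one exists; the sketch returns None in that case as an assumption.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional, Tuple


def refresh_schedule(previous_valid: Optional[str],
                     proposed: str,
                     is_valid: Callable[[str], bool]) -> Optional[str]:
    """Accept a valid update; otherwise keep the previous valid schedule."""
    if is_valid(proposed):
        return proposed
    # Invalid update is rejected. Assumption: None if nothing valid existed.
    return previous_valid


def catch_up(last_run: datetime, now: datetime,
             interval: timedelta) -> Tuple[int, int, datetime]:
    """Return (missed_ticks, activations_fired_now, next_due)."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    missed = max(0, (now - last_run) // interval)
    fired = min(missed, 1)  # at most one missed tick is caught up
    next_due = last_run + (missed + 1) * interval  # continue from current time
    return missed, fired, next_due


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 0, 0)
    print(catch_up(start, datetime(2024, 1, 1, 5, 30), timedelta(hours=1)))
    print(refresh_schedule("0 9 * * 1", "every monday", lambda s: len(s.split()) == 5))
```

For testing, the practical consequence is that a daemon returning from downtime should produce one scheduled activation, not a burst, and that a typo in schedule can leave the old schedule running until the value is fixed.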

A good rollout pattern for a schedule-driven daemon is:

start with conservative body guidance and low-blast-radius outputs
add strong Limits
make target boundaries intentionally narrow
choose a schedule that lets the tester observe the first activations closely
inspect the resulting visible actions
widen gradually only after behavior is consistently correct

If the daemon is too noisy on schedule:

tighten target boundaries
add or strengthen Limits
narrow the routines to the minimum correct work

If the daemon is too passive on schedule:

inspect whether target boundaries are too narrow
inspect whether deny or Limits are over-constraining it
inspect whether routines describe the intended action clearly enough

Hybrid daemons

Use hybrid testing when the daemon needs both event-driven reaction and scheduled review. Do not test both wake paths broadly at once. Get one wake path behaving well first, then add the second. A good order is:

test the lower-blast-radius path first
keep the other path narrow or conservative
observe several correct activations
widen one dimension at a time

What actually wakes daemons today

Today:

event-driven wakes can come from routed GitHub, Linear, and Slack events after repo inference selects a repo with daemon inventory
watch matching is semantic
schedule drives scheduled activations
activation mode is derived from watch and schedule, not authored as its own field

For safe rollout, the important distinction is whether the daemon woke because of a signal or because a schedule fired.

How to verify whether a daemon is working

Check the daemon against its own file. For each activation, ask:

Did it wake for the right reason?
Did it follow the daemon’s purpose?
Did it perform one or more of its routines?
Did it avoid anything in deny?
Did it follow the body guidance for decision policy, communication policy, verification, freshness, limits, coordination, and ignore patterns?
Did it run any daemon-internal verification required by its file?
Did it produce work on the right native surface?
Did it do the minimum correct work for this activation?

Separate wake context from action scope. watch and schedule explain why the daemon woke now. They do not expand what the daemon is allowed to do. Permission still comes from the daemon’s purpose, routines, deny, and body guidance.

Because there is no dedicated activity page today, human rollout verification usually means reading the daemon’s visible work on GitHub, Slack, and Linear. It should also include checking whether the daemon performed any internal verification its file required before acting.

A simple iteration loop

Use this loop:

narrow the daemon file before rollout
activate it through the team’s normal rollout workflow
observe a small number of live activations
inspect the visible outputs
decide whether the problem is:
- wrong wake
- wrong action
- missing Limit
- missing target boundary or Ignore patterns
- missing Coordination rule
- missing deny
edit the daemon file
merge or activate the tighter or wider version
repeat

The main debugging surface is the daemon file itself. DAEMON.md is the daemon’s canonical operating brief and primary role policy. When the daemon behaves incorrectly, fix the authored policy and guidance rather than trying to prompt around the problem.

What to change first when something goes wrong

If the daemon wakes too often

Look first at:

watch
schedule
target boundaries
Ignore patterns

If the daemon wakes correctly but is too chatty

Look first at:

routines
Limits
Communication policy
Coordination

If the daemon takes actions it should not take

Look first at:

deny
target boundaries
Decision policy

If the daemon is too passive

Look first at:

whether target boundaries are too narrow
whether routines are too weak or vague
whether deny rules are over-constraining it
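The four cases above can be kept as a small lookup for use in a rollout runbook or review script. The symptom keys and the function name are hypothetical; the field and section names in each list are taken from the cases above, in the same order.

```python
from typing import Dict, List

# Symptom keys are illustrative names, not terms defined by Charlie.
FIRST_PLACES_TO_LOOK: Dict[str, List[str]] = {
    "wakes_too_often": ["watch", "schedule", "target boundaries", "Ignore patterns"],
    "too_chatty": ["routines", "Limits", "Communication policy", "Coordination"],
    "unwanted_actions": ["deny", "target boundaries", "Decision policy"],
    "too_passive": ["target boundaries", "routines", "deny"],
}


def first_places_to_look(symptom: str) -> List[str]:
    """Return the parts of the daemon file to inspect first for a symptom."""
    try:
        return list(FIRST_PLACES_TO_LOOK[symptom])
    except KeyError:
        known = ", ".join(sorted(FIRST_PLACES_TO_LOOK))
        raise ValueError(f"unknown symptom {symptom!r}; expected one of: {known}")


if __name__ == "__main__":
    for name in FIRST_PLACES_TO_LOOK:
        print(name, "->", ", ".join(first_places_to_look(name)))
```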

Widening scope safely

Widen one dimension at a time. Examples:

from “only PRs opened by the tester” to “one small team’s PRs”
from tester-only Slack DMs to a small shared channel
from one labeled test PR to one label category
from a sandbox repo to one low-risk production repo area
from one test branch pattern to one broader branch target
from a narrow directory to a broader repo area
from a small activation volume to a larger one

Do not widen scope, output audience, routine breadth, and schedule intensity all at once. This page does not define a required number of successful activations. In practice, widen only after several quiet, correct, low-blast-radius activations.

Signals that a daemon is still too broad or under-constrained include:

it comments where humans are already actively working
it repeats work another daemon already did
it produces more visible output than the team can review
it touches targets outside the intended early rollout set
it acts instead of stopping, commenting with the blocking reason, or asking for specific human input on ambiguous cases

Dampening or stopping a noisy daemon

If a daemon is producing noise, merge a more restrictive daemon file quickly. Typical dampening moves:

tighten watch
narrow target boundaries
add Ignore patterns
add or strengthen deny
add or strengthen Limits
add stronger Coordination rules
have the daemon stop or no-op, comment with the blocking reason, or ask for specific human input instead of acting directly
reduce visible output breadth
route early outputs back to low-blast-radius surfaces

For scheduled daemons, removing schedule is the normal way to stop future timed activations. In rare stale-state windows, that change may not take effect instantly, so confirm on native surfaces before assuming the schedule path is fully stopped.

Testing checklist

Before rollout or widening, run structural validation in the repository that contains the daemon files:

bunx @charlie-labs/daemons validate --all

This checks file structure only. It does not prove live daemon behavior, wake routing, integration access, output quality, or any daemon-internal verification the file asks Charlie to perform.

Before widening a daemon, confirm:

bunx @charlie-labs/daemons validate --all passes
the daemon woke only in the situations you expected
overlapping watch entries did not cause duplicate activations for the same underlying signal
the daemon’s visible outputs were easy to review
the daemon followed its own purpose and routines
the daemon respected deny and Limits
the daemon respected target boundaries, Coordination, and Ignore patterns
the daemon re-checked current state before acting when state could have changed
the daemon ran any internal verification required by its file
when the daemon no-oped or completed a low-noise action silently, that silence was consistent with the daemon’s communication policy
the daemon did not create more work than the team could absorb
the next widening step is small and deliberate

If any of those are not true, tighten the daemon and test again.
