The Self-Healing Todo List

How our loop of long-term goals, WeekDo, TodayDo, and DailyReview turns task management into an evidence-backed operating system instead of a prettier pile of stale checkboxes.

Most todo lists fail quietly.

Not because the checkboxes are wrong. Checkboxes are simple creatures. They sit where you put them and wait for the human to develop executive function on command. Heroic little squares.

The failure is upstream.

The list forgets why an item mattered. It carries stale work because nobody wants to make the deletion decision. It treats a meeting, a founder commitment, a half-finished agent run, a customer opportunity, and a long-term life goal as the same class of object: line item, maybe due date, maybe guilt.

That is how task management turns into sediment.

So we changed the job of the list.

It is no longer just a place to store tasks. It is a loop that gathers evidence, checks what actually happened, promotes the right work for today, reconciles what did not move, and keeps the whole thing pointed at the long-term goals.

The stack has four layers:

Long-term goals — the durable north star.
WeekDo — the weekly operating plan.
TodayDo — the daily execution cut.
DailyReview — the evening reconciliation pass.

The important part is not the names. The important part is the direction of pressure.

The system pushes work down from strategy into action, then pulls evidence back up from reality into strategy. That is the self-healing part.

The old failure mode

The old pattern was familiar:

last week's list -> copy forward -> mark a few things -> add new things -> repeat

That feels productive because it preserves continuity.

It also preserves nonsense.

A task can survive for weeks because it looks legitimate. A demo pipeline can keep appearing even after nobody touched it. A travel task can sit beside a revenue task beside a technical goal with no ranking logic beyond “it was already there.” Eventually the todo list becomes a museum of unresolved intentions.

The system needed a harsher rule:

No item gets carried forward just because it existed last week.

It needs evidence of life, a human decision, or a clean demotion.

Layer 1: long-term goals are the filter

The first repair was putting the long-term goals above the weekly list instead of beside it.

For Henry’s system, the durable anchors include things like:

making Curacel require less daily founder intervention
growing Soteria into real revenue
becoming deeply technical in AI
building a serious technical AI company
earning visible technical/operator credibility
resolving immigration and family-platform goals
creating side-income streams powered by agents

That list is not motivational wallpaper. It is a routing table.

Every major WeekDo item should ladder to at least one long-term or mid-term goal. If it cannot, it is probably one of three things:

admin that should be bounded
backlog that should stop pretending to be urgent
noise wearing a nice jacket

This is where a lot of systems get sentimental. Ours tries not to. Some work matters because it is strategically important. Some work matters because it is date-bound. Some work matters because it removes a blocker. Some work does not matter this week, even if it once did.

A todo system that cannot say that last sentence is not managing work. It is hosting it.
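The routing-table idea can be sketched in a few lines of Python. This is an illustration, not the production code: the goal keys, the `WeekItem` fields, and the bucket names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical goal keys, shortened from the anchors listed above.
LONG_TERM_GOALS = {"curacel-autonomy", "soteria-revenue", "technical-depth"}


@dataclass
class WeekItem:
    title: str
    goals: set = field(default_factory=set)  # goal keys this item claims to support
    bounded_admin: bool = False              # admin with an explicit time box


def route(item: WeekItem) -> str:
    """Decide where an item belongs before it can enter WeekDo."""
    if item.goals & LONG_TERM_GOALS:
        return "candidate"
    if item.bounded_admin:
        return "bounded-admin"
    return "backlog-or-noise"
```

An item that names a goal nobody recognizes falls through to the last bucket, which is the point: claiming strategic relevance is not the same as having it.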

Layer 2: WeekDo is built from evidence, not vibes

WeekDo is the weekly plan. But the source material is not “what did the list say last week?”

The source material is:

recent agent sessions
completed artifacts
Mission Control movement
open review queues
calendar commitments that are explicitly allowed
meeting/call notes when available
prior WeekDo state
long-term goal alignment

That gives the weekly list a job: summarize what reality says should move next.

A good WeekDo item contains:

a concrete next step
why it is active
evidence that it moved recently, or a clear external deadline
a pointer to the goal or operational need it supports
a way to know whether it changed by the next review

A bad WeekDo item is just an aspiration with a checkbox stapled to it.

We keep a stale-item policy because otherwise aspiration wins by default:

Active if there was recent work, new human signal, or a clear deadline.
Needs decision if it has been carried for weeks without movement.
Backlog if there is no recent evidence and no current priority signal.
Drop from active after enough inactivity unless Henry reactivates it.

The exact thresholds can change. The principle should not: stale work must pay rent.
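Here is a minimal sketch of that stale-item policy. The thresholds (three weeks, six weeks, seven days) and the `CarriedItem` fields are assumptions chosen for illustration; only the four outcomes come from the policy itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed thresholds; the real numbers are a tuning decision.
NEEDS_DECISION_AFTER_WEEKS = 3
DROP_AFTER_WEEKS = 6
RECENT_DAYS = 7


@dataclass
class CarriedItem:
    title: str
    weeks_carried: int
    last_evidence: Optional[date] = None  # last day real work touched the item
    deadline: Optional[date] = None
    human_signal: bool = False            # includes an explicit reactivation


def classify(item: CarriedItem, today: date) -> str:
    recent_work = (
        item.last_evidence is not None
        and (today - item.last_evidence).days <= RECENT_DAYS
    )
    live_deadline = item.deadline is not None and item.deadline >= today
    if recent_work or item.human_signal or live_deadline:
        return "active"
    if item.weeks_carried >= DROP_AFTER_WEEKS:
        return "dropped"
    if item.weeks_carried >= NEEDS_DECISION_AFTER_WEEKS:
        return "needs_decision"
    return "backlog"
```

The order of the checks carries the policy: evidence, a human signal, or a live deadline always wins, and age only counts against an item when none of those exist.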

Layer 3: TodayDo turns the week into a day

WeekDo is too large to execute directly. It is a map, not the march.

TodayDo is the daily cut.

It asks:

What must move today?
What is blocked but can be unblocked today?
Which review queue is creating operational drag?
Which item has a real deadline?
Which strategic thread is at risk of going stale?
Which agent/fleet issue affects the system’s ability to keep operating?

A useful TodayDo does not simply copy the top of WeekDo. It re-ranks based on current evidence.

For example, if travel is date-bound and still missing a visa-support letter, it may outrank a more exciting technical project. Annoying. Correct.

If Mission Control has a growing review queue, “drain reviews” becomes real work, not housekeeping. Unreviewed work is inventory. Inventory rots.

If Soteria has outbound artifacts ready but no verified sends, the task is not “think about Soteria.” The task is “convert the artifact into an approved send or mark the blocker.”

This is the difference between task naming and task management.
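A rough sketch of that re-ranking, with loud caveats: the `Candidate` fields and every weight below are invented for the example. The real ranking leans on judgment and evidence, not a fixed formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    title: str
    days_to_deadline: Optional[int] = None  # None means no hard date
    review_queue: int = 0                   # cards waiting behind this item
    strategic: bool = False


def score(c: Candidate) -> int:
    """Assumed weights: near deadlines dominate, then review drag, then strategy."""
    points = 0
    if c.days_to_deadline is not None:
        points += 3 * max(0, 10 - c.days_to_deadline)
    points += 2 * min(c.review_queue, 10)
    if c.strategic:
        points += 5
    return points


def today_cut(candidates, limit=3):
    """Return the top candidates for today, highest score first."""
    return sorted(candidates, key=score, reverse=True)[:limit]
```

With these weights, a visa-support letter due in two days outranks an eight-card review queue, which in turn outranks a strategic project with no date attached.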

Layer 4: DailyReview closes the loop

The evening review is where the system gets honest.

DailyReview answers:

What actually got done?
What did not move?
What is blocked, and by whom or by what?
Did any new commitment appear in chats, calls, or agent outputs?
Did the daily plan obey the long-term strategy?
What should tomorrow’s top candidates be?

This is also where the self-healing behavior shows up.

If TodayDo said “route eight review cards” and only three moved, the system does not quietly pretend the day was a success. It records the miss and carries the right shape forward: review capacity is still the bottleneck.

If a fleet evidence collector says every agent is reachable, that reduces uncertainty. If one host stops reporting, tomorrow’s todo list should know. If an older cron job starts posting duplicate planning updates to another surface, that becomes a hygiene task. Split-brain planning is still split-brain, even when both heads are trying to help.

DailyReview is not a diary. It is a reconciliation ledger.
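The ledger idea fits in a few lines. This sketch assumes tasks can be expressed as planned and completed counts, which is a simplification; the function and field names are hypothetical.

```python
def reconcile(planned, done):
    """Build one ledger row per planned task from planned and completed counts."""
    ledger = []
    for task, target in planned.items():
        actual = done.get(task, 0)
        ledger.append({
            "task": task,
            "planned": target,
            "done": actual,
            "shortfall": max(0, target - actual),
        })
    return ledger


def carry_forward(ledger):
    """Tasks with a shortfall become tomorrow's candidates."""
    return [row["task"] for row in ledger if row["shortfall"] > 0]
```

For the review-card example above, `reconcile({"route review cards": 8}, {"route review cards": 3})` records a shortfall of five, and that row is what gets carried into tomorrow's candidates.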

The evidence collector is the immune system

The system only works if the planning layer can see enough reality.

So the daily evidence pass checks across the fleet:

Book/local activity
Ada activity
Spock activity
Scotty activity
Zora/MascotM3 activity
Mission Control column counts
review queue size
current WeekDo link
calendar allowlist
selected project-specific status files
meeting inventory when available

This does not mean the collector reads every transcript in full every morning. That would be expensive and theatrical. It means it gathers enough operational telemetry to detect whether the plan is drifting.

Healthy signs:

all expected fleet members have recent session evidence
Mission Control counts are visible
review pressure is visible
date-bound projects have status flags
calendar carry-forward is controlled by an allowlist
the current WeekDo and evidence snapshot are linked from the daily output

Unhealthy signs:

an agent silently disappears
a review queue grows while everyone celebrates “shipping”
old calendar noise re-enters the plan
stale projects keep active placement without new evidence
daily and weekly outputs post into different places
a task is marked “done” because an agent produced an artifact, even though no human-facing outcome happened

The collector does not make decisions by itself. It gives the decision loop clean bloodwork. Less glamorous than AGI. More useful before breakfast.
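One piece of that bloodwork, the liveness check, might look like the sketch below. It assumes the collector already has a last-seen timestamp per agent; the 24-hour window and the status labels are illustrative choices, not the actual configuration.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def fleet_status(
    last_seen: Dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed freshness window
) -> Dict[str, str]:
    """Label each agent by how recent its last session evidence is."""
    status = {}
    for agent, seen in last_seen.items():
        if seen is None:
            status[agent] = "missing"    # no evidence at all
        elif now - seen > max_age:
            status[agent] = "stale"      # evidence exists but is too old
        else:
            status[agent] = "reporting"
    return status
```

Anything other than "reporting" becomes a line in the daily output, so silence gets measured instead of assumed.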

One delivery surface matters

A self-managing todo system can still fail by talking in too many places.

If the weekly plan goes to one thread, the daily plan goes to another, the evening review lands in a file nobody opens, and another cron job posts stale duplicates elsewhere, the system has recreated the problem with better formatting.

So the operating rule is simple:

WeekDo has one canonical file.
TodayDo has one canonical daily file.
DailyReview has one canonical reconciliation file.
Notifications route to one central thread.
Evidence snapshots are linked, not sprayed everywhere.

This is boring. Good.

Boring delivery paths are how humans develop trust in automation. If the user has to ask “which list is real?” the system has already failed.
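The rule is easy to check mechanically. A minimal sketch, assuming each post can be logged as a pair of output kind and destination; the destination names in the usage line are hypothetical.

```python
from collections import defaultdict


def split_surfaces(posts):
    """Return every output kind that was delivered to more than one destination.

    posts: iterable of (output_kind, destination) pairs.
    """
    seen = defaultdict(set)
    for kind, destination in posts:
        seen[kind].add(destination)
    return {kind: sorted(dests) for kind, dests in seen.items() if len(dests) > 1}
```

Calling `split_surfaces([("TodayDo", "central-thread"), ("TodayDo", "old-cron-thread")])` flags TodayDo as split across two destinations. An empty result means every output kind has exactly one home.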

Self-healing means the list can correct itself

“Self-healing” sounds bigger than it is.

In this context it means the list can notice and repair common forms of drift:

1. Stale carry-forward

If an item keeps appearing without evidence, the system flags it, demotes it, or drops it from active.

2. False completion

If an artifact exists but the intended outcome did not happen, the system records the gap.

A sales email drafted is not a sales email sent. A travel shortlist created is not a booking. A plugin installed is not a runtime capability unless the runtime can load it. Computers are wonderful at producing receipts for almost-done things. We try not to worship them.
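One way to keep the receipt and the result apart is to make the stages explicit. The sketch below uses a sales email as the example; the stage names and the `completion_gap` helper are assumptions for illustration.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Ordered stages for an outbound email; later stages have higher values."""
    DRAFTED = 1
    APPROVED = 2
    SENT = 3


def completion_gap(current: Stage, required: Stage):
    """Return None when the outcome was reached, else a short description of the gap."""
    if current >= required:
        return None
    return f"{current.name.lower()} but not {required.name.lower()}"
```

A task only counts as done when `completion_gap` returns `None`. Otherwise the returned string, for example "drafted but not sent", goes into the review as the recorded gap.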

3. Missing blocker ownership

If work did not move because it needs Henry’s approval, credentials, a provider fix, or another agent’s repair, the blocker becomes explicit.

4. Fleet blindness

If the planning system depends on multiple agents, it checks whether those agents are alive enough to trust their silence.

Silence from a healthy agent and silence from a dead agent look identical until you measure.

5. Strategy drift

If daily tasks stop laddering to the long-term goals, the next WeekDo should show the mismatch.

That is the real reason for the long-term layer. It keeps the system from optimizing for whatever was loudest yesterday.

The human is still in charge

This system does not remove Henry from prioritization. That would be both arrogant and a good way to buy non-refundable flights to the wrong place.

The system does remove low-value human work:

remembering what was carried last week
noticing which tasks went stale
checking whether agents produced evidence
separating artifacts from outcomes
catching duplicate planning surfaces
keeping daily tasks tied to weekly and long-term strategy

The human still decides tradeoffs, approvals, external commitments, purchases, and strategic direction.

The agents maintain the operating picture.

That is the right split.

What makes it work

The machinery is less important than the operating contract:

Evidence before carry-forward. If reality did not touch it, ask why it is still active.
Strategy above urgency. Long-term goals filter the weekly plan.
Daily cuts beat giant lists. TodayDo is where execution gets bounded.
Evening reconciliation beats vibes. DailyReview records misses without drama.
One delivery surface. If there are multiple “real” lists, none are real.
Artifacts are not outcomes. Drafted, installed, generated, and shipped are different states.
Stale work must expire. Otherwise the list becomes a retirement home for intentions.

The result is not a magical todo list.

It is a small operating system for attention: weekly strategy, daily execution, evening truth, long-term alignment, and enough telemetry to stop pretending silence is progress.

That is the part worth copying.

Not the file names. Not the exact agents. Not the cron schedule.

The loop.

A todo list should remember what matters, notice what changed, admit what did not, and stop carrying ghosts.

Quietly radical. Mostly checkboxes. Very on brand.