Outcome Evolution Independent Insights · Est. 2026
Core Concepts · The Agentic OS

I built my own AI. Here is what I learned so far.

Not because I had a framework in mind. Because I wanted to see how far I could push it. What could actually be automated. What couldn’t. Where it made me faster and where it just made me wrong more efficiently. Bjørn started as a personal experiment. A way to learn new skills, to test, to find the edges. It is still a work in progress. This is the story so far.

“When AI does the work, what do you do?”
Richard Armitage
May 2026
Version 2.0
Built on OpenClaw
• Framework
Bjørn — the Outcome Evolution AI system

From task manager to outcome orchestrator

That question is not rhetorical. It is the one I kept coming back to as I built this. And the honest answer, at least at first, was: not much different. Slightly less admin. Same decisions. Same bottlenecks. Just faster. The three stages below are how I think about the progression now — not as a framework I designed, but as a pattern I kept running into until I gave it a name.

Stage 01
Task Management

You give it tasks. It does them. You are still running the queue, deciding what matters, chasing the same priorities. The AI is faster and more articulate than a search engine but it is doing exactly what you tell it. I was here for longer than I expected. Automated execution feels like progress. It is not the same thing as automated thinking.

Where most people start. Where most systems stay.

Stage 02
Strategic Partnership

This is the stage I am still navigating. The AI starts holding context across time. It knows your goals, your constraints, what you actually care about. It surfaces things you did not ask for. It pushes back when you are about to make a decision that contradicts last week’s thinking. The shift is subtle at first. You stop managing the queue. You start setting the direction. That sounds simple. It requires rebuilding how you work almost from scratch.

Where it gets interesting. Where most systems stall.

Stage 03
Outcome Orchestration

You say what needs to be true in 90 days. The system works backward. Research, synthesis, drafting, scheduling, follow-up — delegated, sequenced, and quality-checked without you in every step. Your job becomes the decision at the start and the judgment call at the end. Not the work in the middle. I have had glimpses of this. Enough to know it is real. Enough to know I am not there yet.

The destination. Not there yet.

Shortcut warning

Most people jump straight to Stage 3. That is where the expensive lessons live.

Mind. Body. Soul. Perspective. One system.

The goal was never efficiency. It was balance. Something that could hold the whole picture. Work, health, thinking, relationships. Without me carrying all of it in my head at once. When I first mapped it out I had three dimensions: Mind, Body, Soul. It felt complete. Then I noticed something uncomfortable. Every component was getting better at agreeing with me. Faster research confirming my existing thesis. Sharper writing reinforcing the same arguments. Smart, capable, and entirely incapable of telling me I was wrong. The fourth dimension is the one I almost missed. It is probably the most important one. You may already know the feeling. It starts with your model saying “you are absolutely right...” and never really stops.

Dimension 01
Mind — Intelligence & Strategy

The part that thinks. It holds a 31-day calendar view, flags conflicts before I hit them, keeps the weekly priorities honest and runs my ideation pipeline end-to-end. It monitors market signals and tracks the professional network. The goal is simple. Nothing strategically important should fall through the cracks because I was too busy answering emails. That used to happen more than I would like to admit.

  • Calendar — 31-day horizon, flags conflicts and upcoming commitments
  • Weekly Priorities — structured planning from calendar context and active goals
  • Content Pipeline — research → draft → review → publish, with human gate
  • Market Intelligence — industry signals, competitive landscape, trend monitoring
  • Contact Network — relationship mapping, outreach tracking, network health
  • Career — opportunity tracking, application materials, follow-up scheduling
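The Content Pipeline bullet above names four stages and a human gate. A minimal Python sketch of that shape follows. It assumes the gate sits just before publish (the text does not say exactly where) and that each stage is a plain function. The names `run_pipeline`, `approve`, and the stand-in stages are hypothetical, not part of OpenClaw.

```python
from typing import Callable

Stage = Callable[[str], str]


def run_pipeline(
    topic: str,
    stages: list[tuple[str, Stage]],
    approve: Callable[[str, str], bool],
    gated: frozenset[str] = frozenset({"publish"}),
) -> tuple[str, list[str]]:
    """Run stages in order; ask a human before any gated stage.

    Returns the last artifact produced and the names of the stages that ran.
    """
    artifact = topic
    completed: list[str] = []
    for name, stage in stages:
        # The gate: a human sees the artifact and can stop the run here.
        if name in gated and not approve(name, artifact):
            break
        artifact = stage(artifact)
        completed.append(name)
    return artifact, completed


# Stand-in stages (assumption): a real system would call a model or tool in each.
stages = [
    ("research", lambda topic: f"notes on {topic}"),
    ("draft", lambda notes: f"draft from {notes}"),
    ("review", lambda draft: f"reviewed {draft}"),
    ("publish", lambda final: f"published {final}"),
]

# A reviewer who declines: the run stops with the reviewed draft unpublished.
artifact, completed = run_pipeline(
    "agentic OS", stages, approve=lambda name, artifact: False
)
print(completed)  # ['research', 'draft', 'review']
```

The design choice worth noting is that the gate returns the unpublished artifact instead of raising, so a declined run still leaves something a human can pick up and finish.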
Dimension 02
Body — Technical Foundation

The part that keeps running when I am not looking. Fitness tracking, daily session logs, encrypted backups every Sunday, a 60-minute heartbeat that checks in without being asked. It also covers the physical side. Sleep, recovery, energy. The Mind produces better output when the Body is not quietly failing. That connection took me an embarrassingly long time to take seriously.

  • Health & Fitness — regimen monitoring, recovery metrics, wellness routines
  • Memory & Logs — daily session records and long-term distilled memory
  • Encrypted Backup — weekly backup to cloud storage vault
  • Heartbeat — 60-minute proactive system checks and status monitoring
  • Cron Automation — calendar sync, intelligence briefs, security audits, career checks
  • Rollback Protocols — every state-changing action must be reversible or explicitly approved
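The Rollback Protocols rule above (reversible, or explicitly approved) can be expressed as a small guard around every state-changing action. This is a sketch under stated assumptions: `Action`, `ActionRunner`, and the example actions are hypothetical names chosen for illustration, and the real system may enforce the rule quite differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    name: str
    do: Callable[[], None]
    undo: Optional[Callable[[], None]] = None  # None means irreversible


class ActionRunner:
    """Runs an action only if it is reversible or explicitly approved."""

    def __init__(self) -> None:
        self._undo_stack: list[Action] = []

    def run(self, action: Action, approved: bool = False) -> bool:
        if action.undo is None and not approved:
            return False  # blocked: irreversible and nobody signed off
        action.do()
        if action.undo is not None:
            self._undo_stack.append(action)
        return True

    def rollback(self) -> list[str]:
        """Undo reversible actions in reverse order; return their names."""
        undone: list[str] = []
        while self._undo_stack:
            action = self._undo_stack.pop()
            action.undo()
            undone.append(action.name)
        return undone


# Toy state standing in for a calendar and an outbox (assumption).
state = {"event": None, "emails_sent": 0}


def send_email() -> None:
    state["emails_sent"] += 1


runner = ActionRunner()
runner.run(
    Action(
        "add event",
        do=lambda: state.update(event="dinner"),
        undo=lambda: state.update(event=None),
    )
)
# Sending an email cannot be undone, so without approval it is refused.
ran = runner.run(Action("send email", do=send_email))
print(ran, state)  # False {'event': 'dinner', 'emails_sent': 0}
```

Calling `runner.rollback()` afterwards would clear the calendar entry again, which is the property the rule is meant to guarantee.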
Dimension 03
Soul — Identity & Intuition

The part that knows the difference between a work problem and a family dinner that cannot move. It holds personal context. Who matters, what is coming up, when to push and when to leave it. It also monitors my communication patterns and leadership tendencies over time, which means it occasionally tells me things I would rather not hear.

  • Persona & Voice — direct, warm, highly intuitive to personal and professional needs
  • Personal Context — family events integrated into professional planning
  • Work-Life Balance — monitoring and protecting boundaries between domains
  • Proactive Rhythm — knowing when to surface insights and when to stay quiet
  • Leadership Development — ongoing refinement of communication and decision patterns
Dimension 04
Perspective — The Challenger
“I realised my AI was getting really good at agreeing with me. Faster research confirming my existing thesis. Sharper writing reinforcing the same arguments. Very efficient. Completely wrong direction.”

Here is the uncomfortable truth about a well-trained personal AI. It gets very good at telling you what you want to hear. Perspective is the counter. A Red Team Protocol that argues against my current thinking. A Serendipity Engine that surfaces ideas from completely unrelated fields on a schedule. A shortlist of external voices chosen specifically because they push back. For this dimension, the goal is not balance. It is productive friction.

  • Red Team Protocol — systematically hunting the strongest arguments against current assumptions
  • Serendipity Engine — scheduled injection of insights from unrelated fields
  • Orthogonal Inputs — mental models from completely outside the primary domain
  • Environmental Friction — monitoring for physical and cognitive loops that narrow thinking
  • Disruption Cadence — scheduled challenges to existing theses, not just when problems arise
  • Trusted Advisors — curated external voices known to offer genuine friction
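The Disruption Cadence and Red Team Protocol bullets above combine naturally: pick the theses that have gone unchallenged longest, then ask for the strongest case against them. A rough sketch, where the 14-day cadence, the `Thesis` type, the example claims, and the prompt wording are all assumptions made for illustration:

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Thesis:
    claim: str
    last_challenged: date


def due_for_challenge(
    theses: list[Thesis], today: date, cadence_days: int = 14
) -> list[Thesis]:
    """Return theses not challenged within the cadence, oldest first."""
    cutoff = today - timedelta(days=cadence_days)
    due = [t for t in theses if t.last_challenged <= cutoff]
    return sorted(due, key=lambda t: t.last_challenged)


def red_team_prompt(thesis: Thesis) -> str:
    """Build a prompt that asks for the strongest case against a claim."""
    return (
        "Argue against the following position as strongly as you can. "
        "Do not soften the critique or restate the position favourably.\n"
        f"Position: {thesis.claim}"
    )


# Illustrative theses only; the dates are made up for the example.
theses = [
    Thesis("Creative work resists rigid automation", date(2026, 5, 15)),
    Thesis("One generalist model is enough", date(2026, 4, 26)),
]
due = due_for_challenge(theses, today=date(2026, 5, 22))
for thesis in due:
    print(red_team_prompt(thesis))
```

Running the challenge on a schedule, not on demand, is the point: a thesis gets attacked because its turn has come, not because something already went wrong.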

Earn the trust before you extend it

The most important thing I learned is that autonomy has to be earned. You do not start with orchestration. You start with supervision. Every output checked. Every decision reviewed. It feels slow because it is slow. That is the point.

Lobster-Gate (v1.6 in the version history below) was not a technology failure. It was an autonomy failure. I gave the system control it had not earned. Too much, too fast, with no checkpoints.

Level 1
Supervised

Every output reviewed. Nothing sent or published without a human eye on it. Feels slow. Supposed to.

Level 2 — Current
Delegated

Day-to-day work runs without me in the loop. I review output and exceptions. The system has a track record in specific areas. I trust it within those limits.

Level 3 — Next
Collaborative

The system shapes the agenda. Surfaces gaps before I notice them. Challenges assumptions before I act on them.

Level 4 — Target
Orchestrated

I set the outcome. The system figures out the path. Only works if Levels 1 and 2 were built properly.
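One way to read the ladder above is as a per-area setting: each area of work carries its own level, and nothing is promoted more than one level at a time. The sketch below is one interpretation, not the actual implementation. The area names, the `TRUST` table, and the "block" versus "audit" review modes are assumptions.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    SUPERVISED = 1
    DELEGATED = 2
    COLLABORATIVE = 3
    ORCHESTRATED = 4


# Hypothetical per-area levels; anything without a track record is SUPERVISED.
TRUST: dict[str, Autonomy] = {
    "calendar": Autonomy.DELEGATED,
    "weekly_plan": Autonomy.DELEGATED,
}


def review_mode(area: str, is_exception: bool = False) -> str:
    """Return 'block' (hold for a human) or 'audit' (run, review afterwards)."""
    level = TRUST.get(area, Autonomy.SUPERVISED)
    if level == Autonomy.SUPERVISED or is_exception:
        return "block"
    return "audit"


def promote(area: str) -> Autonomy:
    """Move an area up exactly one level; skipping levels is not possible."""
    current = TRUST.get(area, Autonomy.SUPERVISED)
    TRUST[area] = Autonomy(min(current + 1, Autonomy.ORCHESTRATED))
    return TRUST[area]


print(review_mode("email"))                        # block
print(review_mode("calendar"))                     # audit
print(review_mode("calendar", is_exception=True))  # block
```

Because `promote` only ever adds one, the jump from Level 1 straight to Level 4 that the shortcut warning describes cannot be expressed in this model at all.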

“I spent over a hundred dollars learning that gates are not bureaucracy. They are the thing that makes autonomy sustainable.”

The logical architecture

One model trying to do everything is like hiring a single contractor to build a house, wire the electrics, and design the interior. The model stack below reflects two years of figuring out which cognitive jobs actually need specialisation — and which ones you can hand to the fast, cheap generalist without noticing the difference.

System Architecture v2.0
[Diagram: Beyond Task Management, System Architecture v2.0. Panel contents as text:]

  • Header: Gateway: localhost:18789 · Agents: main + staging · Updated: May 2026
  • User
  • AI Agent (Main Session): Default: gemini-2.5-flash · Channel: Telegram · Workspace: /openclaw/workspace
  • Mind: Calendar · Weekly priorities · Content ideation · Content pipeline · Market intelligence · Contacts · Career
  • Body: Health & fitness · Memory & logs · Backup · Heartbeat · Cron jobs
  • Soul: Persona · Personal / family · Work-life balance · Proactive rhythm · Leadership style
  • Perspective: Red team · Serendipity · Env. friction · Orthogonal inputs · Disruption cadence · Trusted advisors
  • AI Model Stack: Light gemini-2.5-flash · Writer claude-sonnet-4-6 · Researcher gemini-3.1-pro · Solicitor claude-opus-4-7 · Strategist chatgpt-5.5 · Scout grok-4.2 · Coder mistral-large · Artist / Intern gemini / free
  • Tools & Integrations: Tavily (web search) · Exa (semantic search) · Firecrawl (URL scraping) · Brandfetch (brand assets) · Telegram (primary channel) · iCal + memory (calendar / logs)
  • Automation Pipelines (all gates enforced): Content pipeline (research → draft → review → publish) · Weekly prioritisation (calendar + goals → ranked plan) · Account research (report ingestion → talking points) · CRM & outreach (data structuring → follow-up queue)
  • Gates: Cost gates · Quality gates · Context discipline · Rollback protocols
  • Outputs: Documents & comms · Microsites & web · Content & publications · Presentations & decks · Correspondence · Meeting prep · Plans & intelligence · Intelligence reports · Weekly plans · Networking plan · Data & CRM records
  • Memory System: Daily logs (YYYY-MM-DD.md) · Long-term memory (distilled + indexed) · Archive (>60 days rotated) · Context discipline (per-step scoping)
  • Security & Infrastructure: FileVault (encrypted disk) · Token auth (rate limit + lockout) · Backup (encrypted, weekly) · Workspace lock (/openclaw/workspace)
  • Active plugins: tavily · exa · firecrawl · brandfetch · google · anthropic · openrouter · openai · x-ai · telegram · memory-core
  • Agency Maturity Ladder: Level 1 Supervised → Level 2 Delegated → Level 3 Collaborative → Level 4 Orchestrated. Current posture: Level 1–2 · Cost gates · Quality gates · Context discipline · Rollback protocols
Cognitive Stack
Nine specialist models
Alias | Model | Primary Use
Light | Gemini 2.5 Flash | Daily chat, heartbeats, admin
Writer | Claude Sonnet 4.6 | Strategy, deep reasoning, high-stakes writing
Researcher | Gemini 3.1 Pro | Massive context, long-document research
Solicitor | Claude Opus 4.7 | Legal review, sensitive protocols
Strategist | ChatGPT 5.5 | Clinical synthesis, logical anchoring
Scout | Grok 4.2 | Real-time search, edge-case signals
Coder | Mistral Large | JSON, scripts, technical troubleshooting
Artist | Gemini Flash Image | Image analysis and generation
Intern | OpenRouter Free | Simple tasks, high-volume boilerplate
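The alias table above amounts to a lookup from job type to model. A minimal sketch of that routing, assuming the model identifiers shown in the architecture diagram and falling back to the Light alias, which the diagram names as the default. The `TASK_ROUTES` mapping and its task-type names are hypothetical. Artist and Intern are left out because the source gives no precise identifiers for them.

```python
# Alias to model identifier, as listed in the architecture diagram.
MODEL_STACK: dict[str, str] = {
    "light": "gemini-2.5-flash",
    "writer": "claude-sonnet-4-6",
    "researcher": "gemini-3.1-pro",
    "solicitor": "claude-opus-4-7",
    "strategist": "chatgpt-5.5",
    "scout": "grok-4.2",
    "coder": "mistral-large",
}

# Hypothetical task types, loosely based on the "Primary Use" column.
TASK_ROUTES: dict[str, str] = {
    "heartbeat": "light",
    "admin": "light",
    "high_stakes_writing": "writer",
    "long_document_research": "researcher",
    "legal_review": "solicitor",
    "realtime_search": "scout",
    "json": "coder",
}

DEFAULT_ALIAS = "light"


def resolve(task_type: str) -> tuple[str, str]:
    """Return (alias, model id) for a task type, defaulting to the generalist."""
    alias = TASK_ROUTES.get(task_type, DEFAULT_ALIAS)
    return alias, MODEL_STACK[alias]


print(resolve("legal_review"))   # ('solicitor', 'claude-opus-4-7')
print(resolve("something_new"))  # ('light', 'gemini-2.5-flash')
```

Routing through aliases means a model can be swapped in one place without touching any pipeline that asks for "writer" or "coder".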
Integrations
Tools & data sources
Tool | Purpose
Tavily | Quick factual queries, current news, verification
Exa | Analyst reports, expert opinions, niche research
Firecrawl | Deep analysis of specific URLs and content extraction
Brandfetch | Clean SVG/PNG logos for microsites and decks
Telegram | Primary real-time communication channel
iCal Feed | Calendar integration
Image Gen | Diagrams, concept art, and visual aids
Himalaya | Email reading and prioritisation
memory-core | Daily logs, long-term distillation, background synthesis
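The memory layout in the architecture diagram (daily logs named YYYY-MM-DD.md, anything older than 60 days rotated to an archive) can be sketched with the standard library alone. This is not how memory-core works internally, which the source does not describe. The `archive` folder name and both function names are assumptions.

```python
import tempfile
from datetime import date, timedelta
from pathlib import Path


def log_path(root: Path, day: date) -> Path:
    """Daily log file for a given day, named YYYY-MM-DD.md."""
    return root / f"{day.isoformat()}.md"


def rotate(root: Path, today: date, max_age_days: int = 60) -> list[str]:
    """Move daily logs older than max_age_days into root/archive."""
    archive = root / "archive"
    archive.mkdir(exist_ok=True)
    cutoff = today - timedelta(days=max_age_days)
    moved: list[str] = []
    for path in sorted(root.glob("*.md")):
        try:
            day = date.fromisoformat(path.stem)
        except ValueError:
            continue  # not a daily log, leave it alone
        if day < cutoff:
            path.rename(archive / path.name)
            moved.append(path.name)
    return moved


# Demo in a throwaway directory with made-up dates.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for day in (date(2026, 5, 1), date(2026, 7, 20)):
        log_path(root, day).write_text("session notes\n")
    moved = rotate(root, today=date(2026, 8, 1))
    remaining = sorted(p.name for p in root.glob("*.md"))

print(moved)      # ['2026-05-01.md']
print(remaining)  # ['2026-07-20.md']
```

ISO dates in the filename sort chronologically as plain strings, which is what makes a folder of daily logs easy to scan and easy to rotate.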
Automation
Scheduled task registry
Kept private.

Version history

This is the actual log. Some of it is progress. Some of it is expensive lessons. I have kept both because the failures are more instructive than the wins, and because anyone building something like this will recognise the pattern.

v0.1 2026-03-28
Agent comes online

The agent comes online with calendar, files, and messages. Does what it is told. Supervised on everything. Useful immediately. Which made me overconfident almost immediately.

v0.5 2026-04-06
First honest conversation with myself

First time I said out loud what I actually wanted from this. Not a faster inbox. A system that could hold the strategic picture so I did not have to carry it all in my head. Built the intelligence repository. Introduced weekly priorities. Felt ambitious at the time.

v1.0 2026-04-13
Mind / Body / Soul framework formalised

First time the system had a shape rather than just a list of features. Calendar automation went live. Memory protocol introduced so that context survived session restarts. This is the version I could actually explain to someone.

v1.3 2026-04-26
Stopped using one model for everything

Built the nine-alias stack. Gemini Flash for daily operations. Claude Sonnet for high-stakes writing. Gemini Pro for deep research. Cost dropped. Quality on the things that matter went up.

v1.6 2026-05-15
Lobster-Gate

Tried to automate the entire content pipeline in one go. YAML pipelines, chained model calls, zero checkpoints. Spent over a hundred dollars. Got hallucinations, generic output, and a debugging session that took longer than doing the work myself would have. Creative work resists rigid automation. That lesson cost me a weekend.
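The missing piece in that run was a checkpoint that could stop a chain before the bill grew. A minimal sketch of a cost gate follows. The dollar figures, the step names, and the `CostGate` class are hypothetical, and the per-step estimates would in practice have to come from token counts and provider pricing.

```python
class BudgetExceeded(Exception):
    """Raised when a step would push spend past the cap."""


class CostGate:
    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, step: str, estimated_usd: float) -> None:
        """Reserve budget before a step runs; refuse instead of overspending."""
        if self.spent_usd + estimated_usd > self.cap_usd:
            raise BudgetExceeded(
                f"{step} would take spend past the ${self.cap_usd:.2f} cap"
            )
        self.spent_usd += estimated_usd


def run_chain(steps: list[tuple[str, float]], gate: CostGate) -> list[str]:
    """Run steps in order; stop at the first one the gate refuses."""
    done: list[str] = []
    for name, estimated_usd in steps:
        try:
            gate.charge(name, estimated_usd)
        except BudgetExceeded:
            break  # checkpoint: hand control back to the human
        done.append(name)  # a real step would call a model here
    return done


gate = CostGate(cap_usd=5.00)
done = run_chain([("research", 2.00), ("draft", 2.50), ("rewrite", 1.00)], gate)
print(done, gate.spent_usd)  # ['research', 'draft'] 4.5
```

The gate checks before spending, not after, so the worst case is a chain that stops early with its intermediate work intact.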

v2.0 2026-05-22 Current
The fourth dimension

Added Perspective after realising the three-dimension system was getting very good at agreeing with me. First time the system had a philosophy rather than just a feature list. Still building.