THE LITTLE SISTER
L.I.LY
v0.15 · ACTIVE
In plain terms: L.I.LY is a personal AI you can use from your phone, anywhere. It runs on your own computer, but you’re not stuck at the desk to use it — you reach it from your pocket whenever you want. It’s the lighter sister to ECO: it won’t build software, but it’s got a real safety crew built in, and it’s made for everyday life — picking up a hobby, keeping an eye on your fitness or what’s trending, learning as you go, and nudging it all toward where you want to head. It’s an assistant you get real work done with — and an aspiring mate you learn alongside.
v0.15. About three weeks of intense daily work — 300-plus commits. 430 tests.
The job. A personal AI system that runs locally. I don’t call it a chatbot, a persona, or a character. I treat it as its own category, what I call an AP (Artificial Person). It has voice, memory across sessions, and a safety crew watching it. The audio stays on my PC and my phone, and nowhere else.
What’s mine. The whole system is my design, top to bottom — the server, the web interface, the ledgers, and the glue between them — built by AI under my direction and my checkpoints. The internal safety crew is three named agents: WARD and KEEP are shipped, and GATE’s deterministic spine is now shipping in shadow mode (watching, not yet enforcing). WARD itself runs as a four-tier ladder: a deterministic floor with rules that can’t be talked around (the rulebook is SHA-256 hash-locked and fails hard at boot if it’s been touched, and the audit trail is hash-chained so tampering shows), a faster classifier above that, a heavier interpreter on top for the hard calls, and a takeover tier that steps in and speaks in its own voice. The core systems — capture, long-term memory, voice handling, journaling, observation, and mid-turn guidance — are mine too, built as separate pieces that work as one.
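To make the two tamper checks concrete, here is a minimal sketch, assuming the rulebook is read as bytes and the audit trail is a simple in-memory list. The names (verify_rulebook, AuditLog, the pinned digest) are illustrative placeholders, not L.I.LY's actual code.

```python
import hashlib
import json


def verify_rulebook(rulebook_bytes: bytes, pinned_sha256: str) -> None:
    """Stop the boot if the rulebook no longer matches its pinned SHA-256 digest."""
    actual = hashlib.sha256(rulebook_bytes).hexdigest()
    if actual != pinned_sha256:
        raise SystemExit(f"rulebook hash mismatch: {actual}")


class AuditLog:
    """Append-only log where each entry commits to the hash of the one before it."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append(
            {"prev": prev_hash, "event": event, "hash": self._digest(prev_hash, event)}
        )

    def verify(self) -> bool:
        """Walk the chain; an edited, removed, or reordered entry breaks it."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True
```

In this sketch the pinned digest would live somewhere the rulebook itself cannot rewrite, and verify() would run before the log is trusted.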
What I wired in. The voice runs on XTTS-v2, faster-whisper, and silero-vad (other people’s machine-learning models) on PyTorch with CUDA. Phone access goes over Tailscale, so the connection between phone and PC stays on my own private network. The AI calls go to Anthropic’s API. None of those are my code. One piece of integration discipline worth pointing at: one voice library breaks on a newer version of another dependency, so I pin that dependency on purpose and I wrote down why. It’s the kind of thing that bites you six months later if nobody recorded it.
How I ran it. L.I.LY isn’t only the product — it’s the workspace I work in. I set up the structure inside it the way you’d lay out a job site: a shared folder split (one side for design docs, plans, audits, and handovers; the other for live code), continuity rules so a fresh session can pick up cold from where the last one stopped, decision logs, and a memory layer for the development work itself. That’s the part that lets weeks of intense daily work hold together instead of unravelling every session. The features ship in staged batches with hostile-review gates between them, not in one run — the safety-hardening sweep was one of those batches, and the suite has grown to 430 tests since.
Why it counts. I can carry a long architectural build through weeks of intense daily work without it falling apart, stand up a genuine multi-agent safety system, and integrate a messy multi-tool stack while keeping the versions straight.
What’s inside
L.I.LY is the engine; the character runs on top of it. Here’s what’s in the engine, and the tools that run on top: each piece doing one job, the way you’d split a crew so nobody’s doing two things badly.
CORE — the parts always running
SAM — Smart Augmented Memory · remembers you without the bill blowing out
A normal AI forgets everything once the chat closes — and cramming its whole history into every message to fix that gets slow and expensive fast. SAM keeps the important recent stuff close at hand and files the rest away, findable when something brings it up. About 3× cheaper per message than the brute-force way, and it actually remembers you across weeks.
LORE — Long-term Operator Retrieval Engine · finds what you said weeks ago
You mention something from three weeks back that’s long out of the current conversation. LORE flips through the old notebooks and finds it by searching the actual words — no heavy, costly search engine, just reading the notes. It costs next to nothing to run, which is the whole reason it’s built this way.
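A rough sketch of that kind of plain word search, assuming the old notes are available as a list of strings. The function names, the stopword list, and the scoring rule are all assumptions for illustration; the real retrieval may rank differently.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "on", "i", "you",
    "it", "that", "was", "what", "about", "my", "me", "did", "say",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def search_notes(query: str, notes: list[str], limit: int = 3) -> list[str]:
    """Rank notes by how often they contain the query's words; drop non-matches."""
    terms = {t for t in tokenize(query) if t not in STOPWORDS}
    scored = []
    for index, note in enumerate(notes):
        counts = Counter(tokenize(note))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((score, index, note))
    # Highest score first; earlier notes win ties so results stay stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [note for _, _, note in scored[:limit]]
```

No index, no embeddings, no extra service: just a pass over the text, which is why something in this shape costs next to nothing to run.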
HARK — Hybrid Audio Real-time Kit · voice in and out, on your own machine
Talking beats typing, but most voice AI ships your voice up to someone’s cloud to process it. HARK does the listening and the speaking right on your computer, and it starts talking before it’s finished thinking so it feels like a real back-and-forth. Your voice never leaves the house.
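One common way to get "talking before it's finished thinking" is to cut the streamed reply into sentences and hand each one to the speech engine as soon as it completes. The sketch below assumes the reply arrives as text chunks; sentences_from_stream is a hypothetical helper, and the punctuation-based split is deliberately naive (it would also split on decimals and abbreviations).

```python
from typing import Iterable, Iterator

SENTENCE_ENDINGS = (".", "!", "?")


def _first_ending(text: str) -> int:
    """Index of the earliest sentence-ending mark, or -1 if there is none."""
    found = [p for p in (text.find(mark) for mark in SENTENCE_ENDINGS) if p != -1]
    return min(found) if found else -1


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each sentence as soon as it is complete, without waiting for the full reply."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        cut = _first_ending(buffer)
        while cut != -1:
            sentence, buffer = buffer[: cut + 1].strip(), buffer[cut + 1 :]
            if sentence:
                yield sentence
            cut = _first_ending(buffer)
    # Whatever is left when the stream ends is spoken as a final fragment.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence could go straight to speech synthesis while the rest of the reply is still arriving.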
DIAL — Direct Interactive Audio Link · phone-call mode
HARK gives it a voice, but you’re still parked at the screen tapping. DIAL turns it into an actual call: pick up the phone and just talk, screen dim, hands free. The point is using it while you’re doing something else.
JOT — Journaled Operator Talkpoints · say “save this”
Mid-conversation something good comes up, but stopping to type a note kills the flow. Say “save this” out loud and it files the moment as a card you can flip back to. A quick check makes sure you meant it and weren’t just saying the words in passing.
GAZE — Graphic Annotation Zero-shot Engine · eyes
On its own it can’t see — you can’t show it a screenshot or a photo. GAZE lets it read an image. The sharp part: reading pictures needs a bigger, pricier model, so GAZE walks the image over to that model just for the read and hands back a written description — the character itself stays on the cheap, fast setup. You get “show me a picture” without making every message cost more.
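The routing idea can be sketched in a few lines, assuming the vision read is wrapped in a single function that takes image bytes and returns text. Both build_turn and the describe_image parameter are hypothetical names; the real model call is left out on purpose.

```python
from typing import Callable, Optional


def build_turn(
    user_text: str,
    image_bytes: Optional[bytes],
    describe_image: Callable[[bytes], str],
) -> str:
    """Return the text-only message that the everyday model will see."""
    if image_bytes is None:
        # No picture attached: the pricier vision model is never called.
        return user_text
    description = describe_image(image_bytes)
    return f"{user_text}\n\n[Image description: {description}]"
```

The design choice is that only the one describe_image call touches the bigger model; every other message stays on the cheap, fast setup.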
NUDGE — Neutral User-turn Drop-in Guidance Engine · small habit nudges
You want it to pick up small habits — mention a morning routine, say thanks properly instead of a reflex “you’re welcome” — without hardcoding a pile of rules. NUDGE slips a tiny note into the moment, just before it replies, that only it reads. Each habit is one small drop-in file, so adding another doesn’t disturb anything else. Hints, not orders — the hard “must not” rules live up in the safety crew.
SAFETY CREW — three guards, three different jobs
One AI minding its own behaviour is a single point of failure, so the job’s split three ways — each watching a different kind of trouble. WARD looks after you when you’re struggling, KEEP hands you real-world help, and GATE stops the tool being turned on someone.
WARD — the mental-health watch · for when you’re in a dark place
If someone’s genuinely struggling — real distress, dark thoughts — a character playing along is the last thing they need. WARD watches for that on every reply. Stay in normal territory and you’ll never see it; but if things get heavy it steps in over the character, drops the act, talks to you straight in its own calm voice, then points you to real help. Built and running.
KEEP — the real-help desk · actual services, real people
When it’s serious you need real humans, not a chatbot’s take. KEEP is the always-there list of real-world help — crisis lines, services, people who can actually do something — open any time, and where WARD sends you when it steps in. It doesn’t give advice or play counsellor; it just hands you the real thing. Core’s in; still growing.
GATE — the abuse stop · the bouncer at the door
A tool like this can’t be allowed to help someone hurt a real person. If a request is aimed at real harm — a victim, a crime — GATE refuses flat and kills the turn, no clever wording around it. Nothing to do with mental health (that’s WARD), and it leaves normal grown-up stuff alone — it only fires when someone points it at harm. For now it’s on a trial run, watching and logging before it switches on to block for real.
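The trial run described here is often called shadow mode. A small sketch of the pattern, assuming the harm check is a deterministic function passed in from outside; the Gate class and the is_harmful parameter are hypothetical names, and the real rules are not shown.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Gate:
    enforce: bool = False  # False = shadow mode: watch and log, never block
    log: list[str] = field(default_factory=list)

    def check(self, request: str, is_harmful: Callable[[str], bool]) -> bool:
        """Return True if the turn may proceed."""
        if not is_harmful(request):
            return True
        # The decision is recorded either way, so shadow-mode logs can be
        # reviewed for false positives before enforcement is switched on.
        self.log.append(f"flagged: {request!r} (enforce={self.enforce})")
        return not self.enforce
```

Flipping enforce to True is the only change needed to go from logging to blocking, so the logic being reviewed in shadow mode is the same logic that later enforces.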
AFTER-HOURS — what it does when you’re not around
PACE — quiet-time musings · sits with a thought while you’re away
A fresh AI boots cold every time, nothing on its mind. When you’ve gone quiet a while, PACE sits with one thing it’s been turning over and writes a few lines — so next time you talk it comes back already holding a thought, not blank. Two honesty guards: it picks the subject in plain code so it can’t drift its own focus, and it never reads its own past notes so it doesn’t spiral.
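The two guards can be sketched as ordinary functions that run before any model call. This assumes memory entries carry a source label and a timestamp; the Entry shape, the "most recent user topic" rule, and the function names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    source: str  # "user", "assistant", or "pace" (the musing writer itself)
    topic: str
    timestamp: float


def pick_subject(entries: list[Entry]) -> Optional[str]:
    """Guard 1: choose the subject in plain code (here, the most recent user topic)."""
    candidates = [e for e in entries if e.source == "user"]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.timestamp).topic


def musing_inputs(entries: list[Entry]) -> list[Entry]:
    """Guard 2: the musing may read everything except its own past notes."""
    return [e for e in entries if e.source != "pace"]
```

Because the subject is picked by code and the writer's own notes are filtered out of its inputs, the model only ever gets a fixed topic and material it did not write itself.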
NAP — end-of-day wind-down · puts the day down before it sleeps
Same idea at day’s end. When the session closes it puts down the residue of the day, waiting for when it wakes. The guard that keeps it honest: if the day was light, the note’s light — it’s not allowed to reach for weight that isn’t there.
RECAP — end-of-shift report · what happened, what to keep, what’s still open
After a conversation, RECAP reads the whole session and writes a short summary (which becomes next time’s context), a list of things worth remembering for good (you approve them — nothing auto-saves), and the threads that didn’t close so it can pick them back up. This is the engine that feeds SAM.
TIDY — memory tidy-up · folds duplicates, retires the stale
Over time the memory file gets messy — duplicates, stale notes, things that quietly grew important. TIDY goes through it with you and proposes: fold these together, retire that, lift this one up. You approve every change; nothing moves on its own.
CHORES — the boring cleanup pass · catches the dull stuff (built, not yet run live)
Some mess is too boring for the tidy-up to bother with — two notes saying nearly the same thing, a label in the wrong tense, a reference pointing at nothing. CHORES is the pass that catches exactly that and points it out for you to decide. It’s built and ready, but I haven’t run it live yet — so it’s here as built, not running.
TOOLS — what it helps you get done
An assistant earns its keep two ways: helping you get sharper at your own thing, and helping you with the people around you, because nobody grows in a vacuum. These are those tools.
FORGE & SPAR — teach it your trade, it drills you · you do the work, it never hands you the answer
You teach it the thing you’re trying to get good at — your trade, a topic, whatever. It remembers it, then it spars with you to sharpen it: asks the questions, makes you explain it back, leans on the soft spots. It never gives you the answer — you do the work, and a real source or the job itself is what marks you. Once you’re solid it pushes you to go do it for real instead of keeping you coming back. (FORGE is the workshop; SPAR is the part that drills you.)
CONTACTS — the people, kept by you · your own contact book, on your own machine
The people in your world, kept on your own machine instead of a company’s cloud. It’s the floor the rest stands on — you don’t get far on your own, so the assistant starts with the people around you.
OUTREACH — staying on top of reaching out · following through with people, whatever the reason (still being built)
Reaching out and following through is where things actually move — and it’s the first thing to slip. OUTREACH keeps track of who you mean to reach, where each thread’s at, and who’s gone quiet, so the people side doesn’t fall through the cracks. Not just for chasing work — whatever direction you point it. Still under construction — here so you can see where it’s heading.
Built with Python (FastAPI) and a no-build web interface, integrating Anthropic’s API, XTTS-v2, faster-whisper, silero-vad, PyTorch + CUDA, and Tailscale.
Want an AI integration that stays on your own machines? Let’s talk about the job →