Claude Code Exec Assistant + JFDI System Office Hours

I answer community questions live about the JFDI system, Claude Code workflows, building AI assistants, and more.

Alex Hillman
Written by Alex Hillman
Collaboratively edited with JFDIBot
JFDI

I ran a live office hours session and answered community questions about the JFDI system and my Claude Code executive assistant workflows.

I get into how the system is architected, how I handle safety and trust, what my daily workflow actually looks like, and how non-developers can build something similar. Unscripted and unfiltered.

Full Transcript

All right. My microphone is hot, hot, hot. That should be better.

All right, we’re going to try something new today. I haven’t done a live stream, I think, in like five years. Probably since I published my book, the Tiny MBA, where we did - it was during the pandemic - we used live streams to do like book tours and stuff. So I’ve got the rig, I fired OBS back up, and we’re going to try some stuff today.

If you are hanging out and you want to jump in the chat, over on YouTube is where I’m at. I’m going to try and keep an eye on chat. As always, be a good citizen, be kind to your fellow chat mates and of course to me, otherwise I will kick you out. No tolerance for anything else. Basically: ask good questions, get good answers. That’s the promise here.

If there is anybody in the chat, I would love just a note that things actually look good and sound good. I’m not 100% sure. And actually, while we’re doing this, I’m just going to do a quick test from my phone and make sure this is working like it’s supposed to.

Looks good. Sounds good. All right, cool. I’m going to shoot a quick text message and then we’re going to get started. For those of you who are here, looks good, sounds good, thanks Victor Summer. Bash, awesome, tuning in from South Carolina. Super interested in what you’re building, diagnosed with ADHD, trying to build solutions with that in mind. Awesome, Tankmaster.

I got my ADHD diagnosis a couple years ago. The systems that I’ve built here are very much for my brain, which is an ADHD brain for sure, and who knows what other kind of neurodivergence. The thing that really crushes me on the day-to-day is any kind of open loop where I do a thing, but in order to do the next thing, I’m waiting on somebody else or something else to happen. Those open loops, I have a very hard time letting go of them because I worry that I’m going to never come back to them. I’ve had a really hard time finding ways to make sure that that doesn’t happen. It’s a little bit the object permanence part of ADHD, but it’s even worse than that, where the stress of the open loop kind of makes it impossible for me to think about anything else. What the system has become for me is a place for me to put things and trust that that won’t happen. Something that I’ve never quite had before.

So, I’m going to shoot a quick text and then we’re going to get started. I’m going to explain what’s on my screen because it’s going to get a little bit meta here. I actually used my tool to prep for this stream and I want to show you how I did it, because it’s pretty cool. I got to text a buddy who I was supposed to call in a little bit.

You might be asking yourself why don’t you have your assistant text for you. That is one of my hard rules. I don’t use this tool to send messages on my behalf. We’ll talk more about that when we’re talking about ethics and values and bounds and stuff like that.

All right, let’s get this started for real. I see we’ve got a handful of folks here. Let’s check in with the chat again before I get started.

[Chat]: Very similar to what I’m dealing with, the ability to build bespoke solutions is one of the craziest, most exciting unlocks.

Yeah, if you’ve been following along with my tweets the last couple of days, that’s really kind of been my realization. For some context here, I do technically have a background as a software developer. I haven’t been paid to write code in like 20 years. My job when I first got started was as a web developer, so my background is more the front-end stuff, HTML, CSS, and JavaScript. And then I put it all down 20 years ago and really just used basic skills to build static pages for my businesses and occasionally some tooling and things like that - mostly no-code stuff like Zapier. What it comes down to is I have a programmer brain, I can think in systems, but writing code was never a good use of my time. Especially as a business owner.

Claude Code has kind of collapsed that for me, where before I would have to spend a lot of time to figure out a prototype. I could do it, but it was just not a good use of my time. Or I’d have to work with a developer, which I actually really like and will still hire developers for sure. But often the feedback loop is really slow, because I’d have an idea of what I want to do and I’d have to do a bunch of work to write down what that is, give it to somebody, hope they understood it correctly, hope I asked for it correctly, and then get things back, and the iteration cycles were just so slow for me to even know if the thing was worth building.

The thing that’s got me so excited here is the fact that I can really quickly prototype something with my own hands, sort of being extended by this robot. It’s pretty nuts. I also want to say the ecosystem of open source is in such an interesting place right now because it is in many ways so essential to what’s going on, but it’s also, I don’t want to say crushed by what’s going on, but I’m seeing this stuff with Tailwind and whatnot. A lot of things are changing and I’m trying to pay a lot of attention. I want to talk more about that, sort of the ethics side of the conversation. Stuff I want to share.

So what I’ve got up on my screen right now is the JFDI system, at least part of it. I wanted to start here by showing you how I prepped for this. I used this to look at my Twitter replies and my YouTube comments from the last six weeks as I’ve been sharing this stuff, and pull all the questions, including stuff that I’ve replied to as well as what I haven’t yet replied to. I said, “Hey, go get all that stuff and bring it back for me.”

This is a good example of the prep work that would normally stop me from doing a thing like a live stream. I’d probably have to sit down for an hour, probably more than an hour, go back through all these tweets and replies, open a bunch of tabs, read through them, pick which ones to respond to, kind of deduplicate them, and then give myself an itinerary or a road map. That could easily take an hour, an hour and a half or more before I even sit down to press record. Instead, I loaded up my JFDI system, opened up a blank chat, and said, “Hey, use Twitter.”

Right here the Twitter is actually linking to a Twitter skill. That Twitter skill is wrapped around this amazing little CLI tool. CY on Twitter, who’s doing the Claude bot stuff. His GitHub is full of these amazing little CLI tools for popular stuff like Twitter and all the Google apps and things like that. It’s already authenticated to my Twitter and it’s got the skills to work with Twitter like an unofficial API effectively. So I don’t even get hit with the API limitations. It loads up the skill, goes looking through my replies, finds stuff, and within 90 seconds brought back a summary of questions for today’s live stream. Again, that would have taken me an hour and a half. And I don’t think I would have done as good of a job. I easily would have missed stuff.

So it pulled back questions and who asked them. Let me pull up the bird GitHub repo. This is the bird repo I was talking about. This is super easy to install, very very powerful. If Pete’s not on your radar: I think a lot of people know about the Claude bot stuff he’s doing, but I don’t know how many people have gone to his overview page. All these projects are these amazing little CLI tools, and it feels like the glue to pull all this stuff together. If you haven’t already checked out his repo, it’s full of really really good stuff. I’ve benefited from this tremendously. Pete also seems like just a really nice guy who’s fun and interesting. Obviously we haven’t met, I don’t know the guy, but Pete, thank you for all these contributions. This has saved me incredible amounts of time.

So it pulls all these down, and I said, “Great, include the links with each one.” And it went ahead and did all that. And I said, “Cool, save these to a file.” I have a meetings section, we’ll get into that in a second, so basically put it there. That’ll be my road map. At minimum that gives me like a place to stash things for a little while.

Then I was like, well, I had a bunch of great replies and questions on YouTube as well, which honestly I’ve never really had any kind of feedback loop on YouTube before. That was new for me. And I had actually built another tool for my coworking community over the last few days that used the YouTube API. So I was like, I know you’ve got a key for that, go hunt down those comments and basically do the same thing. So again, this is all within maybe five minutes. In between I ran and made myself a cup of coffee or something like that.

So it goes and pulls the comments, and I was like, “Hey, is that all the comments or just the comments that I haven’t replied to?” YouTube has definitely built some decent tools for folks to manage their comment section, but especially for somebody who’s a total newbie like me, I’m still very much learning there. It pulled all the comments, and I said, “Hey, can you deduplicate them?” So now I’ve got replies on Twitter and YouTube, and I said, “Cool, find the questions that have been repetitive, deduplicate them, but make sure we give credit to each person.” So you can see here’s something I got from two people, three people, all YouTube, so on and so forth.
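To make that deduplicate-with-attribution step concrete, here’s a minimal Python sketch of the kind of grouping the assistant does. The data shapes, field names, and similarity threshold here are invented for illustration - this is not how the assistant actually works under the hood, just the idea of merging near-duplicate questions while keeping credit for every asker:

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    asker: str
    source: str  # e.g. "twitter" or "youtube"

def _tokens(text: str) -> set:
    # Lowercase word set with punctuation stripped; crude, but enough
    # to catch near-duplicate phrasings of the same question.
    return {w.strip("?.,!") for w in text.lower().split() if w}

def dedupe(questions, threshold=0.6):
    """Group near-duplicate questions, keeping credit for every asker."""
    groups = []  # each: {"text": canonical text, "askers": [...], "toks": set}
    for q in questions:
        toks = _tokens(q.text)
        for g in groups:
            # Jaccard similarity between this question and the group so far.
            overlap = len(toks & g["toks"]) / max(len(toks | g["toks"]), 1)
            if overlap >= threshold:
                g["askers"].append(f"{q.asker} ({q.source})")
                g["toks"] |= toks
                break
        else:
            groups.append({"text": q.text,
                           "askers": [f"{q.asker} ({q.source})"],
                           "toks": toks})
    return [{"question": g["text"], "askers": g["askers"]} for g in groups]
```

A real version would use the model itself (or embeddings) to judge similarity rather than word overlap, but the attribution bookkeeping is the same.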

It’s not only doing the work, but it’s making it really easy for me to start loading this stuff into my human working memory and start thinking about this while I’m working on whatever my next steps are going to be. This is such a different workflow from what I’m used to, and it’s pretty nuts.

So I said, “All right, this is in here, 35 comments, 24 unique questions. Biggest clusters are around open source and GitHub, how to build tutorials, Agent SDK versus Claude Code, hosting and architecture.” Interesting. And I said, “Can you put together like a logical progression for me to cover?” Because again, that’s another thing I would want to do if I was doing this all on my own.

I rarely expect this to give me exactly what I want, but if it can give me 85% and I can do the last 15% massaging, either manually or through back and forth with the assistant, which I’ve named Andy (which is a bit of an inside joke for Indy Hall, Andy is the name of the actual agent, the main executive assistant agent), so I ask Andy all this stuff, and it pulls back this progression. It goes through the problem, why I built this, the architecture overview, Claude Code versus SDK, the memory system deep dive, building native apps with Claude Code (which is something brand new to me, I did that in the last couple days), the Smog Twitter bookmark tool that I open sourced around December/January over the holidays, and then stuff about safety and guardrails and how to get started.

This is a good flow. It logically makes sense. I already had a couple things that I knew I wanted to cover. I mentioned that in the tweet I linked to when I came here. And I said I want to focus on these things, and it just kind of reworked it. So this is kind of how we’re going to flow for the next little bit.

Like I said, I don’t know exactly how this is going to go, but I figured if I was going to record videos, maybe it made sense to just do that on a live stream, get some live feedback from chat. Ask questions, please ask questions. And maybe I can chop up this video afterwards into smaller bits to share. That’s kind of the idea. As we’ve got more people flowing into chat, that’s really cool. Everyone who’s joining us now, say hey, let me know where you’re tuning in from. And also, what questions do you have about the Claude Code system I built and all the things around it?

Cool. I’m kind of getting used to my control center flow here - what’s open and where and all those kinds of things, especially when I change windows and certain things just seem to vanish. OBS is a funny thing. Anyway, all right, let’s get into this.

Let’s start with, actually, I’m going to copy this out so I don’t lose it. I’m going to kind of flow with it. I’m going to do one of these at a time, use the chat to have it tell me what to do next, which is kind of the whole point of the whole system. One at a time, and I’ll keep coming back to this and open up new windows as we go.

So let’s start: why bespoke software? We were talking about this as the stream got started. Even though my background is in software development, building bespoke personal software was just not a good use of time. Especially as a business owner. I have to think about every minute that I’m spending, and even if I can do it, am I the right person to be doing it? I run a very small business. In all of them it’s really just me and usually one other partner and maybe a handful of contractors, but I make a lot of intentional choices about what I’m working on, based on who else I’m working with and also whether or not I want to be able to do it fully autonomously or not. That’s part of my decision matrix.

Building bespoke software now, why do it in the first place is the question. The two answers for me: one comes down to the fact that I’ve already got workflows. We’ve lived through this time where most of the time we buy or use a piece of software because it does a few things we really love and everything else is okay at best. Sometimes I’m paying quite a bit for a piece of software and I only use 5% or 10% of it, especially business software. The idea of being able to have just the software I need, as well as software that I can extend as my needs change, is a different equation than it was six months ago. That’s wild.

The other part of it, the question here alludes to this, about whether the system accounts for capacity, time management, and setting realistic workloads, is that personal software can be different and does feel different, because no one understands your problem quite like you do. It’s not that you understand it better or worse, but you understand it because you feel it very viscerally. The possibility of building software where the feedback loops of “I do a thing in a certain way and it feels good or feels bad,” previously I just had to live with it. But now I can say, “That feels good, let’s make the way we did that the way we do other things.” That’s kind of revolutionary in a weird way. Or, “That feels bad, I don’t even know what the other options are,” or it lets me dream up what would feel good. That has really been the main motivator of what’s driven this entire custom tool chain and approach that I’ve built.

The direct answer to the question, does the system account for capacity and time management, setting realistic workloads, is something that I’m working on. What’s being referred to specifically is the project management side of the JFDI system, which I’m going to open up right now and show you what that looks like.

So if we come in here and we go to project management, here’s what’s interesting. This is a good example of bespoke software that was built with AI and uses AI, but not necessarily for interacting with it. As I was building this out, I know people are hooking this into Linear and any manner of other project management tools, Trello, Kanban boards, Notion, and whatnot. The truth is I’ve never had a tool that really stuck for me.

I think the reason is nothing has really worked the way my brain works. If it was designed for a one-person kind of workflow, it was designed for a person who works on projects. The difference is I didn’t need a personal project management tool. I needed a multiversal project management tool. What I mean by that is I’m a business owner, I work in a bunch of creative businesses, I work on different things, but those things are often connected in terms of the people I’m working on them with, or there are some shared elements. So they are both individual universes unto themselves, but there’s also stuff that’s shared across them and they need to stay aware of each other. The best personal project management tools I’ve ever used didn’t account for that.

What I decided was rather than hook into an existing project management tool and be disappointed, I would build something that actually made sense for my brain. This was one of the first things where I was like, “Oh, this is a totally different world.” Basically what I did is I applied my interview workflow. Some of you may have seen this before. Let me actually open up some code here. I’m going to open up Cursor real quick. While I do that I’m going to quick check tweets.

I’ll tell you what, I’ve made it through this journey using a lot of tools, and Cursor was really where everything started for me, which I think is true for a lot of people. And boy, has this tool fallen apart. I feel like every time I load it, something is in a different place. This is great TV, folks.

There we go. You may have noticed a little screen that just popped up. My whole system is hosted on a Hetzner VM, because it’s affordable and reliable. And then I use Cursor, and at some point I’m just going to ditch Cursor and go back to VS Code because Cursor has gotten so freaking bloated. You’ll get the general idea.

So I was in here to find my interview command. That’s this. I start every new project effectively the same way, with this command. This has evolved over time, especially since I built the project management system we’re about to get into. This is a little workflow that uses the ask-user-question tool. Initially it was just dialogue. It asks me questions about things like the tech and the architecture. A big thing for me is that a lot of my workflows aren’t building software, so it’s also asking about the people involved and the human workflows and things like that. What are the fallbacks? My workflows go beyond what I’m seeing a lot of programmers do, because they also consider business operations, other team members, their needs, the communities that I work with and serve. It’s a little more human-oriented. The truth is that just changes the output. It doesn’t change what it’s capable of. It just changes the orientation, I think.

Another thing here is I have it ask me one question at a time so I don’t get overwhelmed with like 10 questions in a row. So I used a workflow like this, and basically I spent about an hour with it and I said, “Hey, I’ve used every project management tool under the sun, every task management tool under the sun, every knowledge management tool under the sun, and I’ve had the same problem with all of them. They’re effectively a junk drawer that I then have to go in and keep organized, which is a pain, and I don’t have the time or aptitude for it. But also I have this kind of weird world where I run multiple businesses and work on lots of things that need to be aware of each other, but also if I look at all of them at once I kind of get overwhelmed.” That’s probably a familiar feeling.
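For anyone who wants to try the same pattern: Claude Code supports custom slash commands defined as markdown files in a `.claude/commands/` directory. This is an illustrative sketch of what an interview-style command could look like - it is not Alex’s actual command file, and the question count and ordering are made up:

```markdown
<!-- .claude/commands/interview.md (illustrative sketch, not the real file) -->
Interview me about a new project before writing any code or specs.

Rules:
- Ask exactly ONE question at a time, then wait for my answer.
- Cover, in rough order: the problem itself, the people and human
  workflows involved, fallbacks when things break, and only then the
  tech and architecture.
- After roughly 8-12 questions, summarize what you heard as a draft
  spec and ask me to confirm it before building anything.
```

Once saved, the file becomes available in a session as `/interview`, and the one-question-at-a-time rule is what keeps the model from dumping ten questions on you at once.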

So with about 45 minutes of back and forth, we came up with a spec for what you’re looking at here. I’ve got my 30,000-foot view, which shows the spaces that I work in. Andy, again, the name of my agent, it’s become more of a system since I built this, but that’s where that starts. Indy Hall itself, Stacking the Bricks is my education business with Amy Hoy, 10,000 Independents is an organization that does advocacy work for solopreneurs here in Philadelphia, and then personal stuff and things like that. That’s the 30,000-foot view.

Then work streams is this thing that cuts across all of them and lets me see the people and things that are interconnected between them. We’ll talk about that another time. Projects is pretty straightforward, broken down into active focus, on deck, growing, and on hold. Some things that are cool here, I found myself going back and forth between projects and detail a lot, and it was annoying. I was like, I kind of want to see what the first couple things are, so I built this little peek view that I think is pretty cool.

I’m going to keep an eye on chat. I see somebody mentioning dropped frames. I’m not sure, I don’t see anything on OBS right now, so hopefully it’s just one person. If that is an ongoing problem, we’ll look into what we can do to fix that. I’m going to keep going though.

So projects, and then this is really what I was working up to, because remember, the question here was: does this thing account for workload and stuff like that?

The answer is: kind of. The “Now” view lets me see any overdue tasks - I’m still dealing with a ton of stuff that’s overdue from the holidays - as well as the most important next task, if there isn’t anything overdue. So when I come in here, for every project and universe I’m in, there is one thing to look at unless it’s overdue, in which case there are several things to look at. I would like to make that better so I don’t get overwhelmed by all the past stuff.

You’ll notice up here at the top you’ve got these filters: quick, deep work, people work, creative, and so on and so forth. And the task items have those tags as well. So in the whole interview process, I basically gave it some criteria around the kinds of things that I typically do. One of the things that AI does for me in the system, because what you’re looking at here, this is just good old-fashioned CRUD software. It’s polished, it’s nice, and it’s custom, but it’s just create, retrieve, update, delete. There’s no AI powering this.

The one thing that AI is currently doing for me very well is when I create a new task, I run the task and its contents and the context it fits inside of against Haiku and my rule set. It does a pretty good job of assigning the correct energy type to each task. Why that’s cool is that means with no effort from me in terms of categorization, unless I notice something is particularly wrong, in which case it’s as easy as clicking and changing it, I can sit down and do some work. Depending on how I feel, if I’m just getting started, I might need a quick win just to build up some momentum. Or maybe I’m already in flow and I just finished some stuff and I’m looking to stay in flow. This lets me see what should I be working on right now that’s a quick win, or what requires deep work, or what is people work.
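Here’s a hedged sketch of that pattern: run each new task past a small, cheap model and validate whatever comes back. The energy-type names, prompt wording, and model ID below are assumptions for illustration, not the actual JFDI rule set. The `anthropic` package and its `messages.create` call are real, but the import is kept lazy so the local helpers work without an API key:

```python
# Hypothetical energy tags - the real rule set is Alex's own.
ENERGY_TYPES = ["quick", "deep work", "people work", "creative"]

def build_prompt(task_title: str, project: str) -> str:
    # Constrain the model to answer with exactly one known label.
    return (
        "Classify this task's energy type. Reply with exactly one of: "
        + ", ".join(ENERGY_TYPES) + ".\n"
        f"Project: {project}\nTask: {task_title}"
    )

def parse_label(model_reply: str) -> str:
    """Validate the model's reply; fall back to 'quick' on anything unexpected."""
    label = model_reply.strip().lower()
    return label if label in ENERGY_TYPES else "quick"

def classify(task_title: str, project: str) -> str:
    # Needs the `anthropic` package and an API key; imported lazily so the
    # helpers above stay usable without either.
    import anthropic
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model="claude-haiku-4-5",  # assumed model ID - check current docs
        max_tokens=10,
        messages=[{"role": "user",
                   "content": build_prompt(task_title, project)}],
    )
    return parse_label(msg.content[0].text)
```

The important design choice is the validation step: because the label feeds a filter UI, a garbage reply degrades to a safe default instead of breaking the view.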

I find that batching stuff makes sense, but the way people batch things never made sense to me, because within one cluster of batched tasks they may not require the same kind of energy. Personally, the way I approach work is very much as an energy surfer. I don’t love tasks being assigned to me at a specific time. I’ve spent a lot of time engineering my world so that deadlines are kind of squishy most of the time. The way I work is I try to match my work with my energy. This tool has allowed me to match my work with my energy very very easily.

Now the next thing I’m working on is a project management agent that is kind of running around inside of the system. Ideally what I’m trying to get to is something where on some interval it’s not only looking at the project management system, but it also looks at my inbox, looks at what I’ve completed recently, looks at my git history. I might go crazy and have it start taking screenshots of my desktop to see what I’m working on, so it can get a sense of what I’m doing throughout the day that isn’t explicitly marked down in other software. And then the pattern we came up with is surfacing or sinking. There are things I need to be doing, but if there’s no dedicated deadline attached, how do I decide what to do? I kind of want to give this thing a rule set and see if it can surface things I should be working on next, as well as sink things that may not need to be the next priority, and do that in real time based on what’s happening in my inbox, on my screen, in my calendar, and all those kinds of things. That’s the next experiment on top of the energy piece.
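As a toy sketch of what that rule set could look like: score each task from a few signals and split the list into what rises and what falls. Every signal name and weight here is invented for illustration - the real agent would derive these signals from the inbox, git history, and calendar rather than from hand-filled dicts:

```python
import datetime as dt

def score(task, today):
    """Score one task dict from hypothetical signals (all names invented)."""
    s = 0.0
    if task.get("due") and task["due"] <= today:
        s += 10                                   # overdue: always surfaces
    s += 2 * task.get("mentions_in_inbox", 0)     # people are waiting on it
    s += 1 * task.get("recent_commits", 0)        # momentum: already in flight
    s -= 0.5 * task.get("days_untouched", 0)      # stale work sinks slowly
    return s

def surface_and_sink(tasks, today, keep=3):
    """Return (surfaced, sunk): the top `keep` tasks by score, and the rest."""
    ranked = sorted(tasks, key=lambda t: score(t, today), reverse=True)
    return ranked[:keep], ranked[keep:]
```

The interesting part isn’t the arithmetic - it’s that the weights become a legible, editable rule set you can argue with, instead of a black-box “smart” prioritizer.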

I wanted to share the energy piece because I think it’s a pretty cool answer to the question of whether or not this tracks time. It’s not tracking time, but truthfully I don’t track my time. I don’t bill my time. So that’s not particularly useful for me. I like being aware of my time, but I’m not optimizing for time as much as I am for quality of life. And this is designed very much for that.

So let’s go to the next question. While that happens, I’m going to check my Twitter replies. I’ve got a couple of folks asking about it. I’m going to boost this as well. “If you’re interested in my JFDI AI exec system, I’m live streaming right now.” Boy, I think I misspelled that real bad. That ought to get some engagement.

Before I dive back into the next question, let me catch up on chat a little bit.

[Chat - Jarb]: How much does it cost to keep this app going each month?

Great question. I’m currently on the $200 a month Max plan. You do not need a $200 Max plan to use the tool every day. I’m pretty sure you could run it on the $100 a month plan if you were just using the system every day and not building the system every day the way I have been.

So not only am I running the $200 a month Max plan, I actually exhausted it in the last week. Back in November and December, when I was in my heaviest build periods, I was ripping through, I was using and building the system so fast that I was hitting my weekly usage limits on a regular basis. In December, I think I spent about $2,000 in overage above the $200 plan, which is worth every penny, but it was a pretty wild bill to sit with for a minute. Definitely worth it though.

To avoid that scale of overage, I recently set up a second Claude Max account. I will say this: the difference between the $100 and $200 Max tiers is not just double - I think it’s supposed to be 5x versus 20x of what the Pro plan gives you. I don’t love how opaque it is, but the $200 a month plan is just an insane value. One of the things I did set up here, I’ll do this in another screen: in my admin I built this token usage view that lets me get kind of a snapshot. Maybe this is a better way to answer your question, because I was getting that bigger bill and I was like, “What am I even spending this on?”

This is a little UI built on top of a command line tool called ccusage. This little CLI tool basically uses the JSONL files - the log files from Claude Code - to do some pretty powerful cost analysis. It can do things in real time, which is less useful to me, but it can also do these pretty accurate cost estimations. So you can see over the last week my daily usage. These aren’t actual dollar amounts I paid - this is what the tokens would cost at Anthropic’s API prices - but again, I’m on a $200 a month plan. You can see in the last seven days I got over $1,200 worth of tokens inside of my $200 a month plan. If you look at my last 30 days, yeah, if I was paying API fees, I’d be paying over $7,000 in the last 30 days for tokens. The $200 a month plan is a freaking steal. I’m happy to pay for two of them at this point.

You can kind of see some interesting patterns. Christmas Eve, we were traveling for the holidays, and Christmas Eve ended up being like one of my biggest build days ever. I was building out some just massive stuff. I’ve got my usage by model. As you can see, this is heavy on the Opus. The fact that it’s even $200 on Haiku is pretty funny to me, because I use Haiku all day every day for tons of under-the-hood analysis, synthesis, and things like that. It also keeps track of things that could consume tokens - I do a lot of progressive file-include stuff - so it helps me see what’s being referenced most often and look for things to optimize, as well as my top tools and most expensive sessions. Nothing particularly surprising in here, and I don’t remember what’s in all of these, especially if I didn’t give it a friendly name. I’m not going to click into them because I don’t want to possibly leak anything.
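The core of what a tool like ccusage does is simple enough to sketch: walk the Claude Code JSONL session logs and total token usage per model. The log field names and the per-million-token prices below are assumptions for illustration - check the actual log format and current pricing before relying on either:

```python
import json
from collections import defaultdict

# Hypothetical (input, output) USD prices per million tokens.
PRICE_PER_MTOK = {"opus": (15.0, 75.0), "haiku": (1.0, 5.0)}

def tally(jsonl_lines):
    """Sum input/output tokens per model from JSONL log lines."""
    totals = defaultdict(lambda: [0, 0])  # model -> [input_toks, output_toks]
    for line in jsonl_lines:
        entry = json.loads(line)
        usage = entry.get("message", {}).get("usage")
        if not usage:
            continue  # skip non-message entries (summaries, tool logs, etc.)
        model = entry["message"].get("model", "unknown")
        totals[model][0] += usage.get("input_tokens", 0)
        totals[model][1] += usage.get("output_tokens", 0)
    return dict(totals)

def estimated_cost(totals):
    """Estimate the equivalent API bill from the tallied tokens."""
    cost = 0.0
    for model, (toks_in, toks_out) in totals.items():
        for key, (p_in, p_out) in PRICE_PER_MTOK.items():
            if key in model:
                cost += toks_in / 1e6 * p_in + toks_out / 1e6 * p_out
    return cost
```

That “equivalent API bill” number is exactly the comparison being made above: tokens consumed inside a flat-rate plan, priced as if they’d been bought through the API.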

Generally speaking, I think if somebody is at standard day-to-day usage, and I won’t say “once I’m done building” because I’m not sure that’ll ever be the case, but standard day-to-day usage, I’m 100% confident it can happen inside of a $200 a month plan. It might even be possible inside of a $100 a month Claude Code plan. I don’t have an easy way to separate day-to-day usage from new feature building right now, if that makes sense. But that’s a great question from Jarb, tuning in from Bath, UK. That’s really cool. Anybody else who’s tuned in recently, I’d love to know where you’re joining from.

[Chat - Damidina]: Can you run a model from Hugging Face too?

Great question. My understanding is yes, but I haven’t done it yet. This is one of those experiments I want to do sometime probably in the next few weeks. I’ve seen a bunch of projects that let you proxy Claude Code through another model, so there’s no reason to believe that couldn’t work. I just haven’t done any experimentation with how performance and quality change when you switch models. Because I use this tool every day and I’m focused on making it work really well for me, and I can afford the $200 a month plan, the local model piece is less about money and more about not being reliant on the model providers - them changing stuff, which is obviously already happening and can happen at any time, as well as the fact that I don’t love sending all of my data to them all day long.

For me it’s less of a cost-cutting thing, because $200 a month in the grand scheme of the businesses I run is a rounding error. But from a privacy, security, and durability perspective, it’s a destination I want to get to.

And I think, back to this, I want to talk a little bit in this stream about the ethics and values of these tools as well. Speaking personally, I think every giant tech company is problematic in one way or another. You’ve got to choose on a spectrum. I’m personally pretty against using OpenAI. I find OpenAI and Sam Altman a pretty scary combo personally, based on all the research that I’ve done. Not that Anthropic is perfect either, they’ve made decisions I don’t like, but I don’t consider Dario a scary dude. I consider Sam Altman a scary dude with power in his hands.

That’s my own decision. I don’t project it. You decide for you. I think it’s important that people are informed. I do care what decision you make, but it’s more important to me that the decision is informed than that you just accept these tools for what they are. In the same way that we’ve got a lot of folks who are kind of blindly rejecting these tools, we’ve got people who are blindly accepting them. Both ends of that spectrum are not good, and the best version is going to be somewhere in the middle.

Let me answer a couple more chat questions before I get back to my planned questions. Tuning in from Toronto. I love Toronto, by the way. For those of you who don’t know, I’m in Philadelphia, but Toronto is a city that I love. I’ve got a lot of friends there and I’ve had a really good time there. Every time I’ve been, Toronto reminds me a little bit of Philadelphia. I’m also a former snowboarder, so I see more Canadians here. Your mountains are better than our mountains. I will give you that 100%.

Some folks from UK. Is that John Hilton I can see from the avatar? I can’t quite tell, it’s a little small. But if that’s you, John, it’s good to see you.

More Canada. [Chat]: Haven’t fine-tuned, but some Qwen models surprisingly good.

Yeah, I haven’t really spent any time playing with non-Anthropic models, but I have spent most of the last year watching the evolution of them. My firm belief is that whatever the frontier model is, like an Opus, in six months we’re going to have local models that are just as good if not better. The speed at which they’re catching up is incredible. My hope is that I can run this with local models or even a combination of local and frontier models that are hosted. I’m cool with that at some point.

Awesome, John, I’m so glad you’re here. John, if you decide you want to grab the YouTube link and share this in the 30x500 Discord or Slack or whatever it is, that would be cool with me. Totally up to you though. Just keep me on track.

Good questions. I’m going to come back to where we left off. The next questions are about the personal assistant layer. I had a question from a few folks asking where the assistant is hosted, whether it’s fully custom made, whether it’s running Opus. I just answered some of those questions already.

For those of you who are just tuning in: when I started this, I was running everything local on the same Mac Mini that I’m streaming from now. And then I would push everything up to GitHub, then I’d grab my laptop, go to work, pull everything down, push everything up, all of that back and forth. I’ve seen a ton of people do this too. I was using iTerm, which is a pretty awesome Mac terminal app that works both on desktop and on iPhone. So I was SSHing into my server and doing stuff there.

But the crazy scroll glitch in Claude Code, that was really the big thing that pushed me into building this custom tool. I was using it so much that the glitch was literally getting in the way and breaking things for me. So I was like, how do I make it so I can do this from my phone as well as multiple devices without worrying about things getting out of sync?

The answer was rather than SSHing into my Mac Mini from my phone, I would put everything on a hosted server that I could lock down. That server is running on Hetzner, but you know, I use Digital Ocean as well for other things, it could be anywhere. Just a standard vanilla Linux box. And then I install Tailscale on that Linux box. Tailscale lets me access that Linux box from any other device in my Tailnet. If you’re not familiar with Tailscale, it is one of the coolest pieces of software that has nothing to do with AI. It’s basically a tiny little client you can install on almost any computer and many devices that just have a basic Linux computer inside of them. So it can run on a Raspberry Pi, it can run on an Apple TV. The free version of it is incredibly powerful.

The way I have it set up: I install it on all my devices that I want to be able to access from all my other devices, and it creates a private local network between them. They’re all inaccessible from the outside world, but once the device is authenticated, I can SSH into it from anywhere. Not even just SSH, it behaves like it’s on a local network. The setup is so easy that the first time I set it up, I spent 30 minutes trying to figure out what I had to do next, and the answer was: it’s done. It was already working the whole time.

So Tailscale connects all my devices. From my phone, from my laptop, from my Mac Mini at home, from anywhere. And because I’ve taught my assistant to connect to a bunch of our technical systems at the Indy Hall clubhouse, it’s installed on our router too. Yesterday we had a huge turnout at Indy Hall, and some heavy downloaders and uploaders were hammering a couple of our access points. It was a weird traffic profile I’d never seen before, and I had a really hard time figuring out what the problem was. I ended up firing up my assistant and said, “Hey, go SSH into the router, go look at the logs, and tell me what’s going on.” Not only did it tell me what was going on, it told me how to fix it, which in the past has eaten up tons of time. Once I did that, I was like, “All right, cool. Now write a script that does that automatically every five minutes so it can be proactive.” So the combination of Tailscale, which lets it connect to the device, and SSH, which lets it go read all the logs I give it permission to, means it can write scripts to programmatically detect problems and possibly even fix them. That’s a real thing I’ve done in the last 24 hours. Pretty amazing.

Yes, it is almost all Opus as we discussed. [Chat]: Why headless Claude Code over Agent SDK?

That is one of the biggest questions I get. The short answer is I didn’t want to mess around with an SDK. I actually think I started this right before the SDK dropped, that might be the most honest answer. When I started using this and I was like, “How can I connect to my Claude Code?” I started looking for APIs. Yeah, there was no SDK when I started. It hadn’t even dropped yet. So I went reading through the docs and I saw that there was a headless mode of Claude. Basically, if you’re not familiar with how that works, it lets you run the Claude command on a command line with all of your flags as well as whatever your initial prompt is, and then it runs non-interactively, so this only works with things that don’t require additional feedback. But then it spits out the result as structured JSON. And so I was like, well, why don’t I just build a little thing that, when I type into a box, it sends that thing to the command line, runs it, and then catches the JSON, and then presents the formatted JSON nicely back in the browser. And that was the very first version of what you see, and honestly it’s the vast majority of how this still works.

I’ve since upgraded that from some kind of hacky JavaScript posting and SSE events. It’s now all running on WebSockets, so it’s more stable and bidirectional and things like that. But the Claude Code part of it hasn’t changed at all. And it’s not even really like a harness around Claude Code, which is in itself a harness. Harnesses all the way down. It’s really just a very simple layer of control over what goes in and how to handle the different JSON structures that come out on the other side.

But the really cool part, and you’re probably thinking, “Yeah, but that’s for like one request. What about the rest?” So when you send the first request and it gets the response, it also parses out the session ID. And so if I send the session ID back to my client with the response I’m presenting, that means that the next time I send a message, I can send the session ID along with my second message and it resumes the Claude session.
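That resume flow can be sketched as a tiny client-side session tracker. The `--resume` flag is Claude Code's own way of picking a session back up; the `Conversation` wrapper and field names are my own assumptions for illustration.

```javascript
// Sketch of chaining headless turns into one conversation by carrying
// the session id forward. Wrapper naming is hypothetical.
function buildTurnArgs(prompt, sessionId) {
  const args = ["-p", prompt, "--output-format", "json"];
  if (sessionId) args.push("--resume", sessionId); // resume the prior session
  return args;
}

class Conversation {
  constructor() {
    this.sessionId = null; // first turn starts a fresh session
  }
  argsFor(prompt) {
    return buildTurnArgs(prompt, this.sessionId);
  }
  absorb(response) {
    // every response carries the session id needed for the next turn
    this.sessionId = response.session_id;
  }
}
```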

Here’s the best part about all of this, and this is a total accident. This is not an architectural genius moment by any stretch. But I realized the power of it. The Claude Code app that most people are running, two, five, eight instances at a time, you see people with all of these HUDs up on their screen, every instance of Claude Code that is running on their computer is using processing power and RAM. And it’s using RAM even if it’s not actively doing anything. So if you’re running eight Claude Code instances at like two gigs of RAM a piece, you need 16 gigs of RAM, and that’s before you have any kind of other operational overhead.

Well, with the SDK, I believe, though I’m not 100% sure of that, so if somebody knows differently in the chat you can correct me, my understanding is that similar to interactive Claude Code, it’s a daemon. It’s running all the time and therefore consuming memory the entire time it’s running. In my version, whenever it is not actively processing a request, that Claude Code instance doesn’t exist. It’s not that the instance goes to sleep, the instance is gone, and then it gets resumed the next time I send a request. And so the amount of Claude processing I can do on a much smaller server is a real benefit, even if it’s not a benefit I was specifically optimizing for when I started building the system. That’s a pretty cool thing that I think most people don’t realize, and it’s one of the many reasons why I’m very happy not using the Agent SDK.

Now, I have looked at the Agent SDK, and I consider it part of the exit strategy to a degree. In the same way that I said I would be totally game to swap out any of Anthropic’s models for another model, a local model, a cheaper model, an open model, whatever it is, I feel kind of the same way about the Claude Code app itself. I don’t love that it’s closed source, but it doesn’t bother me either, because I’ve been able to extend it in the way that I’ve extended it. Now, if Anthropic went really off the rails and made it so I couldn’t use Claude Code the way that I am, the SDK is obviously the way that I would go.

So maybe I’ll do it in advance, I don’t know. It’s one of those things where, again, I’m not building this because I like building software. I’m building this to solve real problems for me every day. And every minute I spend hacking on the SDK to build something from scratch that Claude Code can already do is time I could spend building something that’s useful, or actually working on my business, or more specifically building tools so that I’m spending less time working and more time doing the things I actually want to do in my life. So I get why people ask “why not SDK, it’s so much more power?” And I’m like, but I don’t need it, and I’m not in this to build the tool. I’m in this for the tool to solve my problems. I have not run into roadblocks with Claude Code that the Claude Code SDK would let me solve.

As far as I can tell, the only time that will be the case is if they somehow kneecap the way I’m currently using Claude Code, taking away headless mode is the main thing I could think of, or if I wanted more detailed profiling. Maybe I start using the SDK for very specific purpose agents at some point. But every tool solves a problem. If the tool I’m using doesn’t have a problem, I’m not going to go seek a new tool just because it’s shiny. That’s shiny object syndrome, and that’s why I build more stuff than you, I guess.

[laughter]

I think that covers this round of questions. There are some sub-questions here about how I use these things. Hosting and infrastructure: we run on cheap VPS, Mac Mini, and Tailscale. We talked about that. We talked about Claude Code over SDK. I just like that Claude Code is so heavily battle-tested, better than any SDK custom agent I could build. And I get all the benefits of the new features they add to Claude Code. They just show up in my system automatically, but they don’t take away anything in my system either. So that’s the best of both worlds.

I do heavily use skills and agents and the classic slash command approach. Background tasks and scheduled jobs is really one of the big things that I think I do more of than most other people. You know, every day a lot of people wake up and manually run a command or click a button. Because I’m doing Claude Code headless, that also means I’ve been able to schedule a lot of things. Let’s take a look at how that actually works.

You could do this with effectively any cron solution, basic Linux cron, your favorite cron replacement, whatever it is. I didn’t have a preference. We ended up picking this open source project called Bree, B-R-E-E, and that is my job runner. Bree runs and handles all of my recurring jobs and tasks.

This is a reference I have for myself that the system has created for itself, all the jobs that run. These run basically all day every day. The vast majority of them, all they really do, let me get a concrete example here. I’ll do my relationship refresh script. So this is what a Bree job file looks like. It’s just JavaScript. And I haven’t written any of these. These are all templated out. There’s now a template that all new jobs follow, so the assistant can create new jobs the way I want them, and they all work consistently.
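For anyone who hasn't used Bree: a job is just a JS file, and a top-level config maps each job file to a schedule. Here's a minimal sketch of that shape under my own assumptions, the job names and schedules are made up, not the actual job list.

```javascript
// Hypothetical sketch of a Bree schedule. Each entry names a JS file in
// the jobs directory; Bree runs it in a worker thread on schedule.
const jobs = [
  { name: "relationship-refresh", cron: "0 7 * * *" }, // daily at 7am
  { name: "session-sync", interval: "5m" },            // every 5 minutes
];

// Wiring (not run here; needs the bree package and the job files on disk):
// const Bree = require("bree");
// new Bree({ jobs }).start();

module.exports = { jobs };
```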

But what this one is basically doing, besides the actual running, is a lot of validation to make sure that it a) ran, b) ran all the steps it said it was supposed to, and c) that the JSON it spits out the other side can be handled the correct way. One of the workflows that works really well is if you give the model success criteria and say at the end it should look like this, and you present that at the beginning and at the end. So at the beginning I’ll say, “The goal of this workflow is to populate a JSON structure that looks like this with this information. You’re going to follow these steps, and at the end you’re going to spit out JSON that looks like this but filled with the correct details.” Then it goes through however many steps it has to go through, and at the end it spits out that JSON.

It can still make mistakes as an LLM, and that’s why validation scripts are so important. All of my jobs have validation built in. This one is making sure that the number of tool calls was correct, making sure it’s following the to-do system, and so on.
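The validation step described above might look something like this: a plain function that checks the job's JSON output against the success criteria stated in the prompt. The field names here are illustrative assumptions, not the real schema.

```javascript
// Sketch of post-run validation for a scheduled job's JSON output.
// Returns a list of problems; an empty list means the run checks out.
function validateJobOutput(output) {
  const errors = [];
  if (!output || typeof output !== "object") return ["no JSON output"];
  if (!Array.isArray(output.steps_completed)) errors.push("missing steps_completed");
  if (typeof output.tool_calls !== "number") errors.push("missing tool_calls count");
  if (output.status !== "ok") errors.push(`status is ${output.status}, expected ok`);
  return errors;
}
```

Anything that comes back non-empty is what gets surfaced for intervention instead of silently passing through.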

But the actually important part - yeah, here it is, it’s this function right here - it’s running the Claude CLI with stream-JSON output for validation. The prompt is prefixed with “Andy internal task,” which is just so that I can easily hide these automated jobs in my session picker system. I can hide or enable noisy tasks. Noisy tasks are things that run every 5 minutes, 15 minutes, once a day, whatever. There are certain things I want to show up there and certain things I don’t. That little flag helps me hide those.

And then here it is - this should look familiar. Run slash relationship-refresh, which is just a slash command. That’s the whole prompt. That’s literally all that gets passed into Claude Code headless, because the slash command is what has all of the real mechanics of the system. That also means I can manually run the same slash command that the automatic job runs and get identical results, which makes both debugging and just manual runs of these things much easier. I don’t have to maintain it in two places.

So that’s the whole command. It’s going to send that in, run, do its thing, and spit it back out. And then depending on what job it is, sometimes that will daisy-chain to another job. Sometimes it will post an update into a Discord. Sometimes it will be silent if things went well, but surface some information if there’s an intervention needed from me. Each job is a little bit different, but those are the ways I handle my scheduled jobs at a high level. I feel like I could probably spend an hour or two just on how I’ve architected the job system, and I’m happy to answer specific questions about it before we move on.

While I’m doing that, I’ll jump into the chat for a second and answer this question from Mike.

[Chat - Mike]: “If I recall correctly, you’re using Redis. Is there any other database types or services you’re using?”

I’m not using Redis. I’m using Postgres for everything. This is self-hosted Supabase. Supabase is great, although I will also say I’m using basically none of Supabase’s features. It was just that I wanted a nice wrapper on Postgres and it was ready and easy. At some point I might drop Supabase for good old-fashioned Postgres. In fact, some of the newer things I’ve built that are standalone are just running Postgres in their own container.

So it’s just Postgres. I’ve learned a lot about Postgres over the last year from an Indy Hall member named Chris Alfano. It took a little bit of him reminding me, but I’ve come to believe what he says is true: at this point, Postgres is almost like its own operating system, in that not only is it a database, but you can use it for queuing and job management and stuff like that. It is so good and so fast, and having one fewer dependency is great. I’m not trying to build a complex system, I’m trying to build a stable system, and that has worked really well for me. If there’s a reason that’s going to bite me later, I can make a decision then. But no, everything is in Postgres.

Occasionally I will prototype things in simpler solutions data-wise. For example, right before we got back from the new year, I wanted to rearrange my wine rack. This is a cool one.

Let me be honest, I’m not 100% sure if what I just typed is going to work. We’re going to find out together. While that’s doing its thing, I’ll tell you what I did, though.

I wanted to rearrange my wine rack. It’s kind of a mess, it’s confusing. I had great wine on it, but I never knew where anything was because I just kind of put stuff in at random. I wanted it to be easy to grab something when I was looking for something specific. And so I took all the bottles off the rack. I took a picture of the front label of every one. Then, a few images at a time, I would load those into the system and say, “Go research each bottle of wine, stick it in a database, and we’ll go from there.”

And I’m so glad this worked. So while this is spitting everything out - this is freaking wild. You may have noticed that the first thing it did is look up my wine cellar database, and it’s SQLite. I could have said, “Go create a new table in the Andy Core system” - that’s the main app, which is a Laravel app with Postgres. But I was like, I don’t know if this is even going to live in there forever. So let me make a standalone thing. It’s just a folder. It’s got a SQLite database in it that it started loading stuff into. Put all the images up in my Digital Ocean storage bucket.

Over the course of honestly more time for me to take the pictures than it did to do anything else, which is crazy. I went through, took the pictures, and it would go research the bottle, bring back the information, I would verify it, I’d say yes, add it, and it would put all that information in the database. So you’re not even seeing a fraction of what’s actually in here.

For instance, if I were to go to this - let’s do the Cab Franc - and see what other information we have about that.

[Live demo moment - autosubmit activated unexpectedly]

Oh no. Live demos. Oh, that’s right, I forgot that on this computer I have it autosubmitting. I’ll turn that off.

This is one of those tooling things. I know there are great tools out there for speech. I just haven’t played with something new probably in about a month. I think I paid 20 bucks one time for this PowerPoint speech tool and it works very nicely for me. So if there’s something you really like for voice-to-text that’s Mac or iOS, I’m interested.

But you can see this is all the information that it stored, completely automatically from a single message and label. I’m going to come back to your question about RAG and semantic search in just a second, Mike, it’s a very good one.

So every bottle got stored with really good metadata that was properly researched and then verified. But then I was like, “All right, you know what all the wine is. I have a 60-bottle wine rack and 51 bottles. What’s a rack strategy that would actually put bottles that are like each other near each other, so that when I’m going to grab a bottle for a meal or a pairing or because I have a certain craving, I can know roughly where it is without even looking it over?” And in like 30 seconds it came up with a really good rack strategy with almost no free space in the rack, which is a very hard thing to do for a human. So I did that.

Then I said, “Okay, help me put them back in.” What I said was, “You know what order I gave you the bottles. Let’s do the same order again, but for each of the bottles, I want you to show me an ASCII diagram of the wine rack and tell me which slot to put it in.” So watch what happens on my screen when I go “Show me the wine rack.”

Basically what it did is draw the wine rack, put a number in the slot, and then have a little key where the number described what bottle was supposed to go in that slot. So it knows the slots. That particular view is not super helpful for visualizing where something is on the rack, but what it’s about to do is.

And one of the things I’m thinking about is taking a picture of the rack, especially as I start pulling things out or refilling. I could take a picture of the rack and say, “Hey, cross-reference this to your knowledge of the rack and tell me what I should buy to replace it.” That’s pretty cool. So you can see what it’s doing right now, spitting out a little ASCII graph of what’s in it, how it organized it. The top row is my sparkling wine. Then fortified in a couple of overflow slots. Burgundy and Chablis, Chardonnay, Riesling, Sauvignon Blanc and light whites, Pinot Noir, mixed reds, Cab and Bordeaux blends, and then full-bodied reds. It’s not complicated, it’s not fancy, but I didn’t have to figure it out on my own. And that felt like a magic trick.

So this is a really cool application of using Claude Code for something that has nothing to do with code. A simple SQLite database is very, very powerful here.

Now, the question from Mike about RAG and semantic search, not in this particular part of the tool, but I have RAG and semantic search heavily built into the memory system of the JFDI system. I tried out a bunch of memory tools back in early December or so. Some were good, some did not work at all. At the time, ClaudeMem was the best of them, but it was super optimized for programmer workflows, and I kept finding it was not working nearly as well when I was doing things outside of coding. So I was like, well, I can modify this, or I can learn from it and see what I can build for myself. I did the latter.

I can show you a little bit about it. Right up here you’ll notice a semantic search option pops up. I have a longer video about the memory system on YouTube you can go check out. The short version is this.

And yes, Mike, my assistant is a sommelier, actually. My best friend is a sommelier and I’m a partner in a wine business, which is part of why I have this collection besides just enjoying wine. I don’t think Eric needs to fear for his job based on what this is doing, but I’m very excited to show him how these tools work so he can use them in his job. Part of what stresses him out is the amount of stuff he keeps in his head, which is truly impossible long term. That’s where a lot of his struggle comes from. If I can get him to start sharing the load with a tool like this, it will help him be better at the things he is uniquely better at, versus spending all of his cycles on everything.

So, quickly about how I’m using RAG and semantic search in the system. Part one is I have a job that every five minutes grabs the Claude session files on disk and sticks them in Postgres, one row per session. It sticks the full transcript file in there, which some might say is a dumb thing to do - maybe it is, I don’t know, it hasn’t created problems for me yet - and every session gets added as well as updated. This happens not in real time, but every five minutes, so the database is up to date on a five-minute cycle. You can see the session we were just working on is already in the database.

But it doesn’t just stick the session there. It does some pre-processing. It looks at the transcript and also pulls out useful metadata: how many messages are in here, how many tool calls, how many files, what files got touched, what tools got used. The full transcript and all of that metadata get saved in the database at minimum. This is useful as a backup of what I’ve done, and it allows me to clear session files on disk much quicker - I clear anything past the last 3 days. In truth, it could probably be more like the last 24 hours at some point, just to keep Claude Code speedy on load.
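That pre-processing step can be sketched like this. Claude Code stores sessions as JSONL on disk; the exact line shapes I'm assuming here (a `type` field, tool names, a `file_path` input) are simplified guesses at the format, not a spec.

```javascript
// Sketch: given a raw JSONL session transcript, compute the summary
// metadata stored alongside it. Entry shapes are assumptions.
function summarizeSession(jsonl) {
  const meta = { messages: 0, toolCalls: 0, tools: new Set(), files: new Set() };
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    const entry = JSON.parse(line);
    if (entry.type === "message") meta.messages++;
    if (entry.type === "tool_use") {
      meta.toolCalls++;
      meta.tools.add(entry.name);
      if (entry.input && entry.input.file_path) meta.files.add(entry.input.file_path);
    }
  }
  // Sets become arrays so the result drops straight into a JSON column
  return { ...meta, tools: [...meta.tools], files: [...meta.files] };
}
```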

Once that was in there, and it’s also searchable, I was able to build what I call Memory Lane. And the way Memory Lane works, this is where the vector stuff comes in.

Going back to the Claude sessions: I do have embeddings being generated for all sessions. The trouble is sessions get real long, and embeddings aren’t very useful for very long sessions, I’ve learned. You kind of need to chunk things up. That’s the best practice.

So every five minutes, sessions get synced to the main database, and then every 15 minutes another job comes along. It looks at new sessions or things that have been added to the session since the last job. It runs a chunking algorithm and generates an embedding on the chunk. Here’s how that works: I throw the session against a mix of Sonnet and Haiku depending on a handful of factors, and the model basically does the chunking for me. The model is reading the transcript and it’s looking for specific things. It’s looking for decisions. It’s looking for insights. It’s looking for patterns, commitments, lessons learned, corrections, workflows, and gaps, and a handful of other things that get used a whole lot less often.
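The framing the chunking job hands to Sonnet or Haiku might look something like this: the transcript plus the category list, with structured chunks demanded back. The prompt wording and field names here are my own assumptions about the approach described, not the actual prompt.

```javascript
// Sketch of the category-driven chunking prompt. The category list
// matches the ones named above; everything else is illustrative.
const MEMORY_CATEGORIES = [
  "decision", "insight", "pattern", "commitment",
  "lesson_learned", "correction", "workflow", "gap",
];

function buildChunkingPrompt(transcript) {
  return [
    "Read this session transcript and extract memory-worthy chunks.",
    `Tag each chunk with one category: ${MEMORY_CATEGORIES.join(", ")}.`,
    "Return JSON: [{category, summary, original_context, confidence}].",
    "---",
    transcript,
  ].join("\n");
}
```

Each returned chunk then gets its own embedding, so the vectors describe a focused excerpt rather than a sprawling session.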

All it’s doing is looking at the long transcript and finding a little section that matches one of those categories, and it pulls it out and saves it to a new record in a new table in the database that is linked to the transcript. So we have data integrity all the way through.

Here’s an interesting example. It noticed we were talking about Dr. Frank - I notice I have a couple of bottles - and it goes, “Hey, there’s a pattern here.” That’s actually a pretty good example of it working in a non-technical setting, where it says, “I noticed this, and that might be useful information for later.”

So a memory is kind of… what is the difference between a memory and a thought? That’s an interesting question that led me to this lightbulb moment. The lightbulb moment was actually inspired by a paper that somebody who came to one of the meetups at Indy Hall shared with me. Google was doing some research in one of their labs and realized that the element of surprise actually started creating memory-like behavior at the model level, deep inside the models they’re training at Google. I was like, well, I’m not interested in training models, but that’s an interesting insight.

Think about why that is. Think about your own behavior. You are more likely to remember something if there was an element of surprise. And surprise isn’t explicitly good or bad, it’s just a deviation from expectations. That’s why disappointing memories tend to hit so hard: there’s a deviation from expectations. So what Haiku and Sonnet are basically doing is looking at a transcript and looking for deviations from expectations, among other things, to extract out potential memories. Then it runs it through a ranking algorithm. But what it’s effectively doing, again in this example, is creating a memory that says “Alex buys multiple bottles from producers he likes,” identifies the pattern, and records when the memory was formed and when it was saved. “Memory formed” is the timestamp from the original transcript. “Memory saved” is the time that this new entity in the new database table is stored.

There’s a longer description of what it thinks the memory is, the reasoning that led it to that decision, a confidence score, related entities that help it search through both the database and the text parts of the system, as well as the original context, which in some cases is a single line, but is usually between one and ten lines of transcript. So if it’s part of a back and forth between me and the model, that all gets stored as the original context. With me so far?
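Pulling those fields together, one stored memory row might be shaped like this. This is purely an illustrative reconstruction from the description above, not the actual schema.

```javascript
// Hypothetical shape of one memory record, built from an extracted chunk.
function makeMemoryRecord(chunk, sessionId, formedAt) {
  return {
    session_id: sessionId,             // links back to the full transcript
    category: chunk.category,          // e.g. "pattern" or "correction"
    summary: chunk.summary,            // short description of the memory
    reasoning: chunk.reasoning,        // why the model thinks this is a memory
    confidence: chunk.confidence,      // ranking score
    related_entities: chunk.entities || [], // aids text + db search
    original_context: chunk.context,   // the 1-10 transcript lines it came from
    memory_formed_at: formedAt,        // timestamp from the original transcript
    memory_saved_at: new Date().toISOString(), // when this row was stored
  };
}
```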

That is what is generating memories. Now there’s a great question here: how does it know what’s a good or useful memory? Out of the gate, it doesn’t. As you can see, in less than a month since I built this in mid-December, it’s generated over 4,500 memories. If you scroll through, there’s a lot of very interesting things in there. Not all of them are good or bad or useful or useless. This is kind of just prepping the database.

To tie this all back: not only is it generating memories, but we also generate embeddings, for the metadata specifically, and also for the original context. Both of those get stored in the database. Where that becomes useful is in chat.

You notice this little purple bubble, and you may notice through the rest of the demo that the purple bubble shows up when I chat sometimes. That is the other half of Memory Lane. The way that part works is I have a hook in Claude Code that, whenever I send a message and whenever the agent finishes a series of messages, and a couple of other conditions, it uses my message or its messages as effectively the query to hit the memories database, and surfaces potentially related memories above a certain threshold and within a certain context.
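The retrieval half of that hook, embed the incoming message, then ask Postgres for the nearest memories above a floor, might be sketched like this with pgvector. The `<=>` operator is pgvector's cosine distance; the table name, columns, and the 0.75 floor are all assumptions for illustration.

```javascript
// Sketch of a pgvector similarity query for Memory Lane-style retrieval.
// Schema and threshold are hypothetical.
const RELEVANCE_FLOOR = 0.75; // matches below this get discarded

function buildRetrievalQuery(limit = 5) {
  // cosine similarity = 1 - cosine distance (pgvector's <=> operator)
  return `
    SELECT id, summary, category,
           1 - (embedding <=> $1::vector) AS similarity
    FROM memories
    WHERE 1 - (embedding <=> $1::vector) >= $2
    ORDER BY embedding <=> $1::vector
    LIMIT ${limit}`;
}

// Usage (with node-postgres, not run here):
// pool.query(buildRetrievalQuery(), [messageEmbedding, RELEVANCE_FLOOR])
```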

This is probably one of the most sophisticated parts of the system I’ve built, and it’s taken quite a bit of tuning, but it’s working very, very well. Now, this view may not stay in my app forever, but in a lot of ways it gives me almost a debug view of not only what’s getting pulled into the memories, but why. You can see just in the little bit of back and forth we had with the wine here, it pulled in these two corrections. To be totally honest, they aren’t exactly relevant here, they’re not irrelevant, but a correction is a good thing for it to know. It pulls them and injects them into the context, so if I didn’t have this UI here, they would inject silently and become part of the context the model is working with. This is the thing that makes it so I don’t have to remind it of things all the time, that’s one of the main things.

It also means I can periodically do things like use a skill that lets it search the memories. So if I just say “use memories” as part of a prompt, it will go to the database, both plain text and vector. I also have a version that’s just straight searching the sessions in case something isn’t turning up in memories, which does happen sometimes, but the system is pretty good at generating memories for most things. So I can both manually invoke it and use this automated injection system, Memory Lane, all powered by vectors.

And then, how does it know what’s good or bad? There are two parts to that answer. One is there is a threshold. This is the part I’m finding the hardest to tune, because you want to set a floor: basically, if a memory is pulled and its semantic match is below a certain threshold, it’s not going to get injected, it gets thrown away. If it’s above a certain threshold, it gets injected. But the thresholds seem to need to be tuned to the context of the conversation, and I’m still working on that.

You’ll see there’s a little bug icon here. I actually had it build me a retrieval diagnostics view. It’s not cached at all so it’s slow, but it lets me look at the search query, see how it deconstructed it, what queries passed, what did not. This lets me see what it’s keeping and what it’s throwing away, and I can start getting a sense of why.

The other tool I have is thumbs up and thumbs down. That lets me say “you picked a good one,” and it will give it a positive additional weighting next time that memory shows up in that context. And the thumbs down gives it a negative weight, so it’ll pull it potentially below that threshold. Sometimes it takes like one or two extra thumbs up or thumbs down on a memory to get it into the right band.
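The thumbs up/down adjustment amounts to nudging a memory's effective score before the threshold check, which explains why one or two extra votes can move a memory across the band. The step size and formula here are illustrative guesses, not the real tuning.

```javascript
// Sketch of feedback-weighted injection. Each vote shifts the effective
// score by a fixed step; the step and default threshold are assumptions.
const FEEDBACK_STEP = 0.05;

function effectiveScore(similarity, upvotes, downvotes) {
  return similarity + FEEDBACK_STEP * (upvotes - downvotes);
}

function shouldInject(similarity, upvotes, downvotes, threshold = 0.75) {
  return effectiveScore(similarity, upvotes, downvotes) >= threshold;
}
```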

But it’s amazing how well it works without much more fiddling. I find myself maybe once a week or so going in and doing a 30-minute tuning session on a few things. It’s one of my favorite parts of what I’ve built here. Let me get into some questions before we continue.

Mike, I hope that answers your questions about RAG and semantic search pretty thoroughly. I’ve also seen a mention of Convex, using it sometimes because of the native TypeScript integration. I’ve been seeing Convex mentioned but I don’t actually really know what it is. It seems like a backend-as-a-service type thing. I like TypeScript, I actually learned a lot about these systems from Matt Pocock, who’s like the TypeScript guy on Twitter and YouTube. Matt’s actually a client of mine. We help him with his products, his launches, his courses and stuff. I’ve got a meeting with Matt on Monday, actually, and he’s been kind of my portal into both TypeScript and a lot of agentic stuff. But I’m curious what folks like about Convex if anyone wants to drop that in the chat.

[Chat - Department of Personal Efficiency]: “Are you open to having a 15-minute career chat or consultant-type call? Would love to see if you’d want to collaborate with my startup.”

My time is pretty limited for one-on-one calls, but if you want to send me an email, my email address is alex@indyhall.org. I need you to be clear about what it is you’d like to talk about or ask questions about. If it’s a good fit, I might be able to make time for a call, but it’s more common that I’ll want to do something asynchronously over email until or unless there’s some kind of actual engagement. Otherwise, happy to hear what you’re looking for. Email is the only channel I receive those kind of requests through. Alex at indyhall.org.

[Chat - Great Big Tree Hugger]: Hello from Vancouver.

That’s another city that I love. Loving all the Canadian representation in the chat today. I haven’t been in Vancouver in a long time, but I enjoyed myself thoroughly. I love the Pacific Northwest in general. I have a lot of friends out that way, so great part of the world.

[Chat]: “How are the Claude Code headless sessions distributed between the VPS and the Mac Mini for their executions?”

Great question. At this point, they aren’t. I do basically all of my work on the VPS session. The VPS session has SSH access to my Mac Mini, my MacBook Pro, and all my other devices. So if I need to use Claude on one of my other devices, I’m almost always, and this is crazy, almost always using it through my main assistant, because my assistant has all this other tooling around it.

Now, if I don’t need that other tooling, I will run a Claude Code session locally on my Mac or whatever. And sometimes that is necessary, sometimes it’s useful, but it’s generally not in a situation where I care about that session being in my permanent record. In theory, if I cared enough, I could write a little script that syncs them up to the VPS and they’d get stuffed into the database as well. But it happens maybe once or twice a week. Right now I’m really doing everything directly through my main agent. And the fact that my main agent can talk to these other computers, even if they’re completely different operating systems, unlocks a lot of power.

[Chat - Department of Personal Efficiency]: “You had Spark File in the sidebar at one point. What was that?”

I’m not going to open the Spark File because it’s very personal, but the Spark File is an idea I’ve been doing for almost 20 years. One of my favorite books is “Where Good Ideas Come From” by Steven Johnson. This book is fantastic. It is an anthropological look at the history of innovation, looking at periods of creativity and renaissance throughout recorded human history, and it identifies a common theme between them, which is that they were places for people to gather and converse and share half-baked ideas.

In the book, Steven Johnson creates this concept of what he calls a slow hunch or a half-baked idea, which is “there’s a there there and I’m approaching it, kind of slowly figuring it out.” Periods of explosive innovation occur when you have these slow hunches bump into each other in common spaces. That includes the cafes of London and Paris, the salons of the Greeks, coworking spaces like the one that I’ve been running for the last 20 years, and honestly to a degree shared spaces like Twitter and the YouTube chat, where people show up with half-baked ideas and occasionally they smoosh together into a fully baked idea. That’s where good ideas come from: disconnected half-baked ideas, and sometimes a bad idea sprinkled in there as well.

So how does this tie into a Spark File? Well, one of the ideas that Steven talks about is creating a Spark File, which is the place where you put all the little idea fragments that keep popping into your head. For those of us with ADHD and things like it, our brains are always very active. If you don’t have somewhere to put those ideas, it becomes kind of deafening. So for me, the Spark File is a place to put the idea so that it’s no longer bouncing around inside my head, but it puts it in a place where I know I’m going to see it later.

Because the second part of the Spark File is a review process. The way I started doing this is I would put everything in one long-running text file and then once a month I would re-read everything that I had written down. I’d look for patterns. I’d look for things that were not obviously connected when I wrote them down, but upon review are connected. I’d look for “I keep coming back to the same idea, I think I have to do something with that idea.” It’s a version of journaling, but a little less structured. It’s like journaling for ADHD people, I suppose.

The tricky part is I do that monthly, and then I do a once-a-year review. And as you can imagine, having done this for almost 20 years, those documents are impossibly long to work through. You think about what happens when your AI chat has too much in its context, it gets drunk and stupid. I can’t hold all that in my head to actually notice useful patterns anymore. So I don’t want to say I’ve stopped doing it, but the outcomes have gotten harder without me throwing away really old stuff. And now that I know how compacting works, I could do a compact-style effort on it, “All right, here’s the month. Let’s compact it down to the good stuff and carry that forward to the next session,” but I never thought to do that until literally just now.
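That compact-and-carry-forward idea can be sketched mechanically: extract crude "themes" from each month, and only carry forward the ones that recur. This is a toy stand-in - in practice the assistant would do real summarization, and the theme extraction here (first word of each note) is deliberately naive:

```python
from collections import Counter

def monthly_themes(entries):
    """Toy theme extraction: first word of each note, lowercased."""
    return {e.split()[0].lower() for e in entries if e.split()}

def yearly_compaction(months):
    """Carry forward only themes that recur across multiple months,
    instead of re-reading years of raw notes."""
    counts = Counter()
    for entries in months:
        counts.update(monthly_themes(entries))
    return [theme for theme, n in counts.most_common() if n >= 2]
```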

What I have done is moved my Spark File notes into the system, and I now have quick entry. Instead of going into a text file, it goes into my Spark File database. Now I’m doing all of my entry and my reflection with the help of my assistant. It goes through and helps me notice those patterns. So the workflow hasn’t changed, but the effort to get the same quality output has shifted: I get to put my energy into turning things into action instead of the heavy lift of gathering in order to reflect. Now it does the gathering and I do the reflection. That’s how a lot of this works for me - I use the tool to do the mise en place, the pre-work, the pre-organizing that makes it easier to do the work. I get it to do the prep work so that I can do the real work. That’s what has really made the system so powerful for me.

So that’s what the Spark File is about and how I use it. At some point maybe I’ll figure out a way to show a version of it that doesn’t expose the complete inner workings of my brain. I’m not fully comfortable with that yet, but we’ll see.

[Chat - Mike]: “Cool pattern analysis, experimenting with that in code development. Memory is definitely harder lift. Lots of potential noise there.”

Yeah, figuring out signal to noise inside of the memory system - it feels like there are people smarter than me with way more experience in this space. Vector search feels like a bit of a magic trick to me. I now understand roughly how embeddings work, but I don’t feel like I have a strong enough grasp on them to really wield them well. They’re kind of a lightsaber: if you don’t know what you’re doing, you can do either a lot or nothing, and I can’t really tell how well it’s working compared to how well it could be working. I’m not really sure how to cross that gap yet. So if folks have great resources on vector search and embeddings, or other ideas or suggestions, drop those in the chat - that would be super helpful for me for sure.

[Chat - Department of Personal Efficiency]: You’re very welcome.

[Chat - Great Big Tree Hugger]: “Thanks for all the work, inspired me to implement entities and memories.”

I’m so glad to hear that. That’s the reason I’ve been sharing this stuff. I’m not interested in this being exactly a SaaS. I see some folks asking about sharing this. I’m more deep in learning this stuff than I have been with any new technology in a very, very long time, and I’m having a fun time. I really like sharing this with the hopes that other people will be inspired to try it themselves.

And I think, to use my lightsaber analogy, there’s more value in building your own lightsaber than borrowing one or buying one off the shelf. The things I’ve learned and the understanding I have of the system are mechanically valuable to actually using the system. The exact same system in the hands of somebody else is going to be worse for two reasons. One is how deeply it’s personally grafted into my life and brain. I could pull all the personal stuff out and start grafting your stuff in, that’s a doable thing. What’s not replaceable, I think, is the stuff I’ve learned while building this. It could be useful for somebody without that learning, but I think it would be a fraction as useful as it is for me.

So I’m hesitant. I’m thinking about how I want to do this. To jump ahead to the question of whether I’m going to make it a public SaaS or keep it personal, I’ve been kind of clear from the front that I’m making software for me first, and the fact that other people are interested in it is exciting but it’s not really the goal. I think the version that’s most likely to happen is something where I try working with maybe 10 people - 10 business owners can apply to be part of a group. I think there’s criteria for who I can work best with, at least initially: ten business owners whose business is not brand new and is reasonably well-established, who are essentially solo - maybe one business partner, maybe a couple of other people they work with, but ideally they’re the sole decision maker.

The way I would do it is something where you pay me a flat fee, and what I’m going to do is basically take the skeleton of my system, pull out all my personal stuff, and we’re going to spend an entire day together grafting my system into your world.

And then for a couple of months after that, we’ll get together as a group of all ten people and share what’s working well and what’s not. Probably have a Discord or something like that for those folks to work together. But ultimately, I want to do it in a way where folks have a chance, I want to give people a running start, but I don’t think I can give you the whole thing and have it work for you. I just don’t think it works that way.

I’ll say up front that this is probably not going to be a cheap service. This is going to be for the business owner who’s got a business making at least six figures. I feel like a $5,000 to $10,000 investment for a person who is making $100,000 or more, I feel very confident that I can help you double your revenue or just work less using systems like this. But again, that’s not a chasm I’m crossing just yet. That is not a business that I necessarily want to run. I just have to look at it against the value of my time in the businesses that I run. And if I’m going to take time away from those businesses, I need to make sure that makes sense economically. I consider this very high leverage work. It has been for me and I think it will be for other people. So that’s who I think this is best for.

And look, if I run a few of those cohorts and we learn that this could actually be built as a SaaS or could be partially or fully open sourced, I’m open to that. But the way it operates today, I’m not. So that’s kind of the path that I see for that going forward.

[Chat - Mr. Stanton]: “Mind your tensors, don’t bother with the vector.”

I’m going to be honest, I know what those words are, but I’m not sure what you mean there, Mr. Stanton. Open to that explanation.

Do I have a Discord? Indy Hall has a Discord that is for members only. I do not have an open-to-everybody Discord right now and that is not going to happen anytime soon. The Discord that I would consider would be one that is for a specific group of people who are part of something that we’re actively working on together for your business. But email is the way that people I’m not actively working with can reach me. I will do my best. The clearer you can be with what it is that you want, I will point you to whatever I can. Just make it easy for me to help you.

[Chat - Fred]: “You were showing earlier the Bree cron scheduling stuff. Do you also have a dedicated page to visually show the execution status of the jobs?”

Yes, I do. You may have noticed while we’re working in chat, these little green dots, there are five of them. When any of those are yellow or red, that’s a clue that something in the system is unhealthy.

The easiest place to show you is probably here. This admin view is kind of like all my admin systems and whatnot. It lets me very quickly do things like monitor CPU, memory, disk, and uptime of the server. I have definitely run into situations where there’s a memory leak or one of my containers crashes or whatever. This is meant to solve all those problems.

Not only does it give me a view of the scheduled jobs, it gives me a view of so much more. So I get my sync jobs view, I think this is the direct answer to your question, Fred. I get to see every session, and I have a little reminder for myself of what it does, the frequency, the last time it ran, and currently everything is healthy. If anything is anything other than healthy, I either get a warn or a down, or something along those lines. I get the appropriate colors here. I also get Discord notifications for anything that’s critical and requires intervention.

Similarly, I’ve got one for my job queue, one for my MCP servers, one for my long-running services, like the bridge between Claude Code and my client, my maintenance watchdog, and a handful of other things. The scheduler itself is a long-running process, and then my containers as well. So this at a glance lets me see everything, and what’s cool is that those are all different parts of the system but I can see them all very quickly and very easily.
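Those status dots boil down to a worst-status-wins rollup. A tiny sketch, assuming the three states mentioned (healthy / warn / down) and an illustrative color mapping:

```python
# Assumption: each dashboard dot shows the worst status in its group.
SEVERITY = {"healthy": 0, "warn": 1, "down": 2}
COLOR = {0: "green", 1: "yellow", 2: "red"}

def dot_color(statuses):
    """Roll a group of job/service statuses up into one dot color."""
    worst = max(SEVERITY[s] for s in statuses)
    return COLOR[worst]
```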

And then down below I’ve got more detail - actually, I’m going to show you some other cool things. This sync status gives me a more detailed view of all of those individual jobs. This admin view is not cached at all because I want to see real time, but that also makes it a little bit slower. I haven’t been on this page in a while, so hopefully things don’t break.

Hey, look at that. Yo, we found a bug. We’ll just go back. Love a live demo.

MCP servers, not a whole lot to show off here other than the fact that I can see what’s running, turn them on and off, all those kinds of things. We already talked about sessions, we talked about token usage, memories. Slash commands is handy. I need to create something like this for my skills, as well as probably my agents. But what this does is I have a little job that watches the folder that all the commands are in for new changes and things like that, and lists them all in here.

Why this is particularly useful, besides the fact that it’s searchable and I can click into any one of these and view it, is that this gives me at a glance, and I have a little click-to-copy here. So if I want to share a command with folks or just open source it, this makes it very easy for me to pluck a component out and share that.

But the really cool thing you may have noticed is these counts. I have a hook that every time one of these commands runs, it increments a counter. What that lets me do is I can both sort them by things like most used and most recently used. And when I’m in chat and I do slash, I’ve got my own little autocomplete thing here that automatically sorts the most used commands and most recently used commands to the top. So again, little bits of ergonomics, Claude Code has that slash preview thing, but it doesn’t have the most recently used or most often used ones bumped to the top. And when you’ve got 71 commands, in what order do you display them in a way that’s useful? This answers that for me.
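The counter hook and the ordering are simple to sketch. This assumes an in-memory store and made-up command names; the real system presumably persists counts somewhere the hook can write to:

```python
import time

usage = {}  # command -> {"count": times used, "last": unix timestamp of last use}

def record_use(command, now=None):
    """What the hook does each time a command runs: bump count, stamp time."""
    now = time.time() if now is None else now
    entry = usage.setdefault(command, {"count": 0, "last": 0.0})
    entry["count"] += 1
    entry["last"] = now

def autocomplete_order(commands):
    """Most used first; ties broken by most recently used."""
    def key(cmd):
        e = usage.get(cmd, {"count": 0, "last": 0.0})
        return (-e["count"], -e["last"])
    return sorted(commands, key=key)
```

With 71 commands, this is what turns an arbitrary alphabetical list into one where the handful you actually reach for sit at the top.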

I can do the same sort of thing with skills. And I actually have a little thing I started doing recently, still a bit of an experiment, but if I type something and I’m on my third or fourth character, see that little shimmer on the text? That is it auto-detecting that the thing I’m typing is either a skill or a command. If I hit tab, it pops up the autocomplete. It basically lets me do autocomplete anywhere in the text without a slash or an at prefix, which is again just little bits of ergonomics to make using the system a little bit nicer.
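That shimmer-then-Tab behavior needs only a prefix check against the known command and skill names once enough characters have been typed. A minimal sketch with an illustrative threshold and made-up names:

```python
MIN_CHARS = 3  # assumption: start matching around the third or fourth character

def detect_trigger(current_word, known_names):
    """Return command/skill names the current word could be the start of."""
    if len(current_word) < MIN_CHARS:
        return []
    w = current_word.lower()
    return [name for name in known_names if name.lower().startswith(w)]
```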

I just realized my video was blocking a bunch of what we were just talking about, so sorry about that.

Real quick from the chat, no mention of open source. I’ve open sourced a bunch of things. That does not mean I’m going to open source a whole system that I’ve spent the last almost 12 weeks building and that is very personal to me. Folks that seem entitled to that work are going to get thrown out. Let’s not play that game. As there are things that are useful in here that I can cleanly open source and share, I will, and I have. That’s where the smog repo came from, that is from the system that I built for archiving from Twitter into my system for links. I open sourced a really powerful part of my session and memory recall into a project called Kuato, and a couple of other things as well.

So don’t be an entitled person thinking that you deserve access to something that I spent a ton of time on. But if there are things that are useful, the more I hear about the specific things that you want to be able to do from within the system, I’m going to keep looking for ways to open source those components.

I think the question “would love one for the JFDI” was about a Discord. Again, I’m not running open public Discords. I’m much too busy for that. But private Discords for people that I’m actively working with? Possible. Nonetheless, I see you’ve taken off for the day, so thanks for hanging out.

[Chat]: “Do sentiment scoring on chat inputs for dynamic mood adjustments for the process.”

Oh, that is an interesting idea.

The closest I have to that right now is I have an end-of-day command that I can actually pull up right here since we’re here. The end-of-day command looks at my beginning-of-day command for how the day was supposed to go and then it looks at how it went. So it’s going to look at my email, the emails that I sent, the tasks I checked off, the reminders I checked off, my git commits, a number of other things. I mentioned earlier I might build a little thing that actually takes periodic screenshots of my active working computer and synthesizes that just for the things that are offline. It includes my meeting notes, which the system also does all of my meeting prep and meeting follow-through for me. It looks at my calendar, all of those things, and it goes: here’s the day you had. Is anything missing?

And then back to your point, Great Big Tree Hugger, your username cracks me up, it asks me about anything that I missed and how I feel about today. And it has become this really valuable end-of-day reflection tool for me.

Your point about doing a sentiment analysis is interesting. The tricky part, I guess, is I’m not sure how much of my mood comes across in the chat, maybe more or less than I think. Maybe because I’m in written communication mode for so much of my day. The system has access to all of that stuff, but I don’t know. I’ll do it and I’ll report back and let you know.

And actually, if there are any tips you have on sentiment analysis prompting, either resources or anything you’d be willing to share, I’d love to see that and see what it can inspire, and see if that’s actually helpful. It might be an interesting thing if it can guess that.

One of the other related ideas: I don’t wear a sleep tracker for heart rate and stuff like that, but I do have one of the heating and cooling systems on my bed and that does have a sleep quality monitor on it. I thought about doing something where based on how it guesses my sleep quality, it informs the system and when it starts prepping my morning, it decides what kind of tasks to frontload based on how well or poorly I slept. That is technically a thing that could happen. I’m not sure if I want to do that, but the idea has crossed my mind.

I do think that sentiment scoring is a neat idea. If you have anything you’d want to share, drop me an email or tag me on Twitter, or shoot me a DM on Twitter. I’ll do my best to check there as well. I often forget, I gotta get my robot to do that for me.

[Chat]: “The mechanics are an important part of the methodology.” I really appreciate that. Thank you. “Kidding about this, but you could also take a screenshot or a camera picture in the chat and it does a visual analysis of the emotions on your face.”

Oh man. That is a really good bad idea. That is a really good bad idea.

[Chat]: “VADER sentiment.” Okay, cool. Is that a thing I can look up? Let’s see. VADER. Cool. All right, that’s super helpful. I’m going to check that out and I’ll let you know how that goes. That’s pretty cool.
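For reference, VADER is a rule-based sentiment model (the Python package is `vaderSentiment`) that produces a compound score in [-1, 1]. The sketch below is a much-simplified imitation - a five-word lexicon plus the same style of normalization - not the real library:

```python
import math

# Tiny illustrative lexicon; the real VADER lexicon has thousands of rated tokens.
LEXICON = {"great": 3.1, "good": 1.9, "fun": 2.3, "bad": -2.5, "stuck": -1.6}

def compound(text):
    total = sum(LEXICON.get(w.strip(".,!?").lower(), 0.0) for w in text.split())
    # VADER-style normalization squashes the raw sum into [-1, 1].
    return total / math.sqrt(total * total + 15)

def mood(text, pos=0.05, neg=-0.05):
    """Bucket a message using VADER's conventional compound thresholds."""
    c = compound(text)
    return "positive" if c >= pos else "negative" if c <= neg else "neutral"
```

Something this shape, run over chat inputs, is all "dynamic mood adjustment" would need as a first signal.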

All right. I have a few more things to cover probably in the next ten or fifteen minutes or so. Let’s go back, we only answered two sections here. Let’s move on to the next.

I appreciate you all hanging out. This has been fun. I think I will definitely do more of this, and then I’m going to figure out how to chop this long video up into smaller videos.

All right. This is an interesting one. Very few people have asked about what I think is actually one of the most powerful tools that I have, and it is super simple but insanely powerful. So let’s talk about what I wanted and then we’ll talk about how I did it.

I have tried every CRM tool, every relationship tool, even personal relationship management tools, and they all suck. And by that I mean they’re all designed around one of two things. All the CRMs are designed around sales and pipeline, and that’s just not how I think about relationships. Or they’re based on at best time duration, which is how long since you’ve talked to this person, which is useful and I have that too, but that was not my primary interest.

The thing that I want is something that helps me manage the fluid and ever-changing dynamics of each relationship, and to help me really understand and reflect on the relationship and go: am I showing up for the person the way I want to? The “how long since you’ve messaged this person” is a good start to that, but I also think it’s very superficial and one-dimensional and it doesn’t actually help you do the thing that you said you want to do.

I wanted a tool that helps me show up better for the people that I care about. Whether that is a family member, my partner, my friends, my coworkers, Indy Hall members, my community, whatever it is, I wanted a tool that actually helps me show up better for people.

What I realized - and this is the other problem - is that everyone says, “Well, use Clay.” Clay is awesome, and Clay is probably one of the best options out there. But Clay is built around data import: it pulled in everything, it sucked in so much information that I now have a new problem, which is signal to noise. And so I was like, how do I start with a system that actually pulls information about actual relationships, not every contact in my inbox or in my address book that I may have talked to once? That’s not very useful.

The idea that I built on is one I haven’t seen anybody else do, and it’s super simple. My email is a very powerful place, and I’ve also used the same email address for over 20 years. At the same time, not everything in my email is a relationship. But most people who I have sent an email to, the recipient of a sent email, is a very good clue that that is a relationship that I care about. You may be different, maybe it’s more like text messages or something like that. But for me, that is a very high signal. I generally don’t send messages to people that I don’t have any interest in talking to.

So what I built is a little tool that once a day looks at all of the emails that I sent. It looks at the recipients of those emails, and I’m doing some other things for email threads where it only counts the people who actually participated in that thread as an active recipient, therefore a relationship, and it uses that to generate a list of people.
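The daily pass is essentially: walk the sent folder, credit each recipient, and in threads only credit people who actually replied. A sketch over plain dicts rather than a real IMAP/Gmail payload; the field names are illustrative:

```python
from collections import Counter

def relationship_signals(sent_messages):
    """sent_messages: [{"to": [addresses], "thread_repliers": {addresses}}, ...]
    Returns how often each address received a message I actively sent."""
    counts = Counter()
    for msg in sent_messages:
        repliers = msg.get("thread_repliers")  # None for a one-off email
        for addr in msg["to"]:
            # In a thread, only count people who actually participated.
            if repliers is None or addr in repliers:
                counts[addr.lower()] += 1
    return counts
```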

Then I have a person researcher agent that goes into my inbox and looks for everything it can find about our relationship. It is very heavily weighted on my content and context, not the things that people send to me but the things that I send to them. It can look at other context, but that is a secondary thing - sort of backing details. The reason for that is back to ethics and privacy: those people didn’t sign up to have their stuff pumped into a database. So I don’t want to put everything in there that is somebody else’s message, but if there are necessary supporting details, it can pull those out.

And what is amazing is whether it’s a brand new relationship or somebody that I’ve known for 20 years, it’s able to build a dossier of our relationship that is truly incredible.

To give you a concrete example without - let’s see who would be okay with me showing them - we’ll just show off Chris’s. I’m not going to show all of it. So Chris is a friend and a longtime Indy Hall member. He’s also somebody that I have been exchanging messages with since April 2009, so 17 years, something like that. Using just our email exchanges, it was able to compose all of this. I’m going to scroll very quickly just to get an idea of how much is in here. It generated all of this.

Why is that interesting? Well, not only do I have that, and it’s searchable, which makes it useful, it is also reference material for the rest of the system. The way this shows up most commonly for me is the thing that I built this entire system for.

When I save a link or when I get a link, and I’m going to show you how this works because we now have this Vader sentiment thing, when I save this now, you’ll notice I just dropped in a GitHub link with no other context whatsoever. This is going to take a second, but once it does, it’s going to obviously go look at what’s on that page. It’s smart enough to know the difference between different kinds of links, GitHub, YouTube links, articles, news articles, so on and so forth. But it also goes and looks at the page, brings back a little summary of what it is, and looks for potentially relevant connections in my system. This one says, “No existing connections to this project found in your knowledge base or relationships.”

So think about what just happened here. It gathers up all this detail and it’s giving me options: save to knowledge base, build something with it, just browsing, whatever it is. But if it had relevance to either a specific project I was working on or a specific person based on that relationship file, it would mention them here.

Where that’s become amazing is when I’m saving links for myself, this helps remind me of other people that might be interested in that link, and then I can go send them the link and say, “Hey, I found this, you might think it’s cool.” And that ties back to the whole point of a relationship management system for me, helping me show up for the people that I care about. This helps me remember that people I know are interested in the same things that I am, and when I find cool things, they may want to see them. That has been a game changer. It’s amazing. And that shows up in a number of other places as well, but this is the easiest place for me to demo it.
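The first step in that flow - telling a GitHub link from a YouTube link from an article before enriching it - can be as simple as routing on the hostname. The mapping below is an assumption for illustration, not the system's actual rules:

```python
from urllib.parse import urlparse

def classify_link(url):
    """Route a saved link to the right enrichment path by hostname."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    if host == "github.com":
        return "repo"
    if host in ("youtube.com", "youtu.be"):
        return "video"
    return "article"  # default: fetch the page and summarize it
```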

In this case, I’m going to say “save to knowledge base.” And what it’s going to do is grab all the information, go generate a file. I’m going to let it do it, and then I’ll show you what it generates in just a second.

Often I don’t even give it that much information, I can just click one of those buttons. Those autoresponse buttons, by the way, that is a custom thing that I built. I saw that Claude Code was doing the suggested next prompt thing, and I was like, that’s actually a pretty interesting idea for the times when it’s just asking me something - though it’s not useful for a lot of things, kind of weird for a lot of them, I should say. Where it is useful is when there’s a yes, a no, or a multiple choice answer, and I’m just retyping stuff that’s already on the screen. Why not just make it so I can click that thing and not waste time, and also not waste tokens? So it’s a button, and the button sends a version of the message back. In some cases it sends the shortest version. If it’s a multiple choice thing where I don’t just want to send the yes or no back, but I actually want to send back enough context, it autogenerates that message.

That is especially useful when I’m on mobile, it just keeps me from having to type “yes, no, thumbs up, go,” whatever. It makes things that would take 50 taps take one. It’s a huge time saver and probably helping me delay getting carpal tunnel at some point, based on being on computers my whole life.

So this saved that to personal tools knowledge. You’ll notice it figured out where to put it because it knows that I put my GitHub links in tools. Usually, if this was an article, it would save it to articles and file it in a particular related article section. So if we come into the knowledge base here, you can actually see what that looks like, my tools section.

No. Where did it just put it? Y’all catch where it put it?

Oh, it was reference tools and then it got put in here because I’ve got some consolidation to do. That’s one of the things I run into from time to time, if I didn’t explicitly tell it to look for a place, it is possible for it to duplicate. But the cleanup process is easy, and whenever I do a cleanup, I ask it to create a rule that will prevent that from happening again in the future.

What was it that we just did? It was Vader, right? So let me just search for it.

Interesting, not totally sure why that’s not opening up, but I’m also not going to stress about it right now. I’m guessing it’s a caching issue and this will show up a little bit later. At any rate, let’s go back to where we left off.

I see we’ve still got a decent number of people in the chat, which is really cool. If there’s anyone who has not said hello in the chat yet, I would love that, just say hey, where you’re from, and what you’re interested in. If you have any questions you can go ahead and drop them in there. I’m going to hang out for about ten more minutes. That’ll take us to almost the top of the hour.

So the relationship management system is crazy powerful and I haven’t seen anything like it. It is something that has impacted my day positively every day since I built it. It now also does things like monitoring how long it’s been since we stayed in touch, but more importantly, it surfaces those things at timely moments rather than just nagging me that a relationship is getting cold. And it also looks for clusters of people that I can connect to each other, which I’ve never seen a tool do, at least not the way mine currently does it.

So it’ll notice that I’m doing a bunch of back and forth with a new contact and it’ll go, “Hey, does this person know these other people? You should connect them.” And I’ll go, “Yeah, that’s absolutely something I should do.” And then I go do that. That’s awesome.
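The introduction-spotting can be sketched as an overlap check between interest tags pulled from each dossier. The tags, names, and threshold here are all illustrative:

```python
def intro_candidates(person, contacts, min_shared=2):
    """contacts: {name: set of interest tags}.
    Returns (other_person, shared_tags) pairs worth an introduction."""
    mine = contacts[person]
    hits = []
    for other, tags in contacts.items():
        if other == person:
            continue
        shared = mine & tags
        if len(shared) >= min_shared:
            hits.append((other, shared))
    return sorted(hits, key=lambda h: len(h[1]), reverse=True)
```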

[Chat - Jason]: Hey Jason, from Wisconsin. Right on. Thanks for stopping by. Glad to have you.

I’m curious if folks have built stuff like this or are building stuff like this. Are you sharing it on Twitter or anywhere else? I’d love to check out what you’re doing. Like I said, I’m sharing this because I’m hoping it’ll inspire other people to do stuff like it. And I’m always looking for new ideas that might help me in mine as well.

All right, we talked a little bit about this compounding and learning systems. There are two things I want to show off. One, I mentioned the memory system and I’m not going to rehash all of that, I want to show you this really simple open source bit that I pulled out of it. This is not the same as the memory system, but it is very much inspired by it. This is like the simplest version of the memory system. Specifically the part where, let me grab some water real quick.

Specifically the part where this doesn’t do the automatic memory injection, but what it does do is enable me to ask, “Hey, where did we leave off on this?” My trigger words are in here somewhere, and most commonly the way I use this is “where did we leave off with this?”

So some number of sessions ago, actually, I’m just going to give you a demo. I’m going to say where did we leave off? The Mac app that I’m working on, which is a thing we’re going to have to save for another session, where did we leave off last with Pear Snap?

What this basically does, and I’m trying to remember exactly how I’ve set this up, I might have to manually invoke it, let’s see. Let’s just do it this way because this is roughly how Kuato works as well.

Actually, I’m going to restart because you saw what it was doing. It’s doing that thing that Claude always does where, if it doesn’t know what it’s looking for, it’ll go look for a bunch of stuff, fail, then look for clues until it finds the right thing. I wanted to skip over all that and point it in the right direction.

So Kuato works a lot like what you’re seeing here. When you have it installed, you don’t even need to say “use sessions API.” And like I said, Kuato is not a thing that is in my system, it has been extracted from my system for easier use in other people’s systems. But the way this basically works is it loads up as a skill and then it does a combination of text and vector search on whatever pieces of information I gave it, and it goes and finds the sessions where we talked about that thing. Then it uses a cheaper agent to analyze those sessions, and instead of returning the entire session into this session’s context, which would eat up tons of context tokens, it summarizes it. It’s almost like compacting from across multiple other sessions and then injecting the useful summary into the main session.
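The flow I just described (find the sessions that match, prefer recent ones, and summarize instead of dumping full transcripts into context) can be sketched in a few lines of Python. To be clear, everything here, the `Session` shape, the naive keyword scoring, and the truncating stand-in for the cheaper summarizer agent, is a made-up illustration of the idea, not Kuato’s actual code:

```python
from dataclasses import dataclass

@dataclass
class Session:
    id: str
    ended_at: float  # unix timestamp of when the session ended
    text: str        # full transcript of the session

def score(session: Session, query: str) -> int:
    """Naive keyword overlap: count query terms that appear in the session."""
    body = session.text.lower()
    return sum(1 for term in query.lower().split() if term in body)

def find_relevant(sessions: list[Session], query: str, top_n: int = 3) -> list[Session]:
    """Return the top_n matching sessions, most recent first."""
    hits = [s for s in sessions if score(s, query) > 0]
    hits.sort(key=lambda s: (score(s, query), s.ended_at), reverse=True)
    return sorted(hits[:top_n], key=lambda s: s.ended_at, reverse=True)

def summarize(session: Session) -> str:
    """Stand-in for the 'cheaper agent' step. The real system would have a
    small model compact the session; here we just truncate."""
    return session.text[:120]

def where_did_we_leave_off(sessions: list[Session], query: str) -> list[str]:
    """Inject compact summaries, not whole transcripts, into the main session."""
    return [f"[{s.id}] {summarize(s)}" for s in find_relevant(sessions, query)]
```

The point of the shape is the last step: recall is cheap, but returning a summary instead of raw transcripts is what keeps the main session’s context budget intact.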

So what did it just do? Pear Snap is the name of the screenshotting app that I’ve been building for myself. And it says, “I found three sessions where you were building that. Let me get the most recent one so you can see exactly where you left off.” And it summarizes them. It says: most recent session, here’s what you were implementing, here’s what you did in the previous sessions, and here’s the current state.

That allows me to basically put things down and pick them up inside of 15 seconds or so, for anything that I’m working on. And I use this to work on most of my work now. So this is sort of like a “where did we leave off” for anything.

Everything that I found before this was reliant on all of the files being on disk, which obviously in my case they are not. And if you’re using this tool every day, you need to be clearing your sessions off disk anyway because it’s slowing your Claude Code down. If you open Claude Code and it does that loading pause, it’s because it’s loading up a ton of those JSONL files from past sessions. If you just zip up anything from more than 48 hours ago and move them to another folder outside of your global .claude folder, I guarantee your Claude Code will run faster.
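If you want to automate that cleanup, here is a minimal Python sketch of the idea. The `~/.claude/projects` layout and the `.jsonl` extension are assumptions about where your install keeps sessions, and this version moves files into an archive folder rather than zipping them, so check the paths against your own machine before running it:

```python
import shutil
import time
from pathlib import Path

# Assumed locations -- adjust for your machine. Claude Code keeps per-project
# session logs as .jsonl files; the exact layout may differ between versions.
SESSIONS_DIR = Path.home() / ".claude" / "projects"
ARCHIVE_DIR = Path.home() / "claude-session-archive"
MAX_AGE_SECONDS = 48 * 60 * 60  # anything older than 48 hours gets moved

def archive_old_sessions(sessions_dir: Path = SESSIONS_DIR,
                         archive_dir: Path = ARCHIVE_DIR,
                         max_age: float = MAX_AGE_SECONDS) -> list[Path]:
    """Move session files older than max_age out of the live folder."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_age
    moved = []
    for f in sessions_dir.rglob("*.jsonl"):
        if f.stat().st_mtime < cutoff:
            dest = archive_dir / f.name
            shutil.move(str(f), str(dest))
            moved.append(dest)
    return moved
```

Run it from cron or a launchd job and the live folder only ever holds the last couple of days of sessions.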

I have that cleaning up automatically, but I don’t want to lose it, so it goes into a database. That answers your question.

[Chat - Jason]: Jason, for me the last three days of sessions are stored on disk, but everything else is stored in a database. What Kuato does is it gives you a sort of drop-in version of this where with no setup and just a single bun command, it will use the trigger words and search your on-disk files.

However, I highly recommend the database setup. It’s got a really simple Postgres container, so it’s fully self-contained. Now, this is not using embeddings and vectors, and I might add that. Although I’ll tell you, for the way that I’m using it here, not having embeddings has not created issues for me, because I usually know what I’m looking for. Embeddings are useful when you don’t know what you’re looking for. And “where did we leave off on a thing I can’t remember” is not the workflow I have here. I know what I’m looking for; we’ve just talked about it across many sessions, and going through all of them to figure out what the context was so I can continue is the problem.

A basic Postgres with weighted keyword extraction and stuff like that is a very, very simple setup. Basically, if you grab this repo, drop it into your Claude Code repo, and say “hey, set this up,” it’ll ask you a couple of questions, and five minutes later you’re going to have a fully running version of what I just showed you.
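To make the ranked-keyword-search idea concrete without standing up Postgres, here is the same shape using SQLite’s built-in FTS5 full-text index, which ships with Python’s `sqlite3` on most builds. Kuato’s real setup uses a Postgres container, so treat this as an illustration of keyword search with relevance ranking, not its actual schema; the session contents are invented:

```python
import sqlite3

# FTS5 gives us a tokenized full-text index plus bm25 relevance ranking,
# the same basic trick a Postgres tsvector setup provides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(id, body)")
conn.executemany(
    "INSERT INTO sessions (id, body) VALUES (?, ?)",
    [
        ("s1", "debugging the Pear Snap clipboard upload"),
        ("s2", "Pear Snap color picker returning white for every pixel"),
        ("s3", "planning next week's newsletter"),
    ],
)

def search(query: str) -> list[str]:
    """Rank matching sessions by keyword relevance (lower bm25 = better)."""
    rows = conn.execute(
        "SELECT id FROM sessions WHERE sessions MATCH ? ORDER BY bm25(sessions)",
        (query,),
    )
    return [row[0] for row in rows]
```

Because the workflow is “I know what I’m looking for,” plain keyword matching like this gets you surprisingly far before embeddings earn their complexity.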

And again, the key here is not even the recall. I mean, the recall is obviously important. The key is that when it’s done with the recall, it summarizes it for you in a way where I can pick up exactly where I left off. That is kind of the magic of why I think it’s awesome.

How are we doing on time? All right, I’m going to do one more.

So we talked about this, we talked about memories next. And yeah, I have tried the other memory agents and I found that they’re either overbuilt, underbuilt, or most commonly extraordinarily developer-centric. I get why, but I’m not a developer, I’m not building developer tools. That’s not what I’m motivated by. I want the same idea but applied to a broader scope of work, and I don’t expect other people to build that for me because they don’t understand my full scope of work. So that’s why I built my own.

All right, this is actually a good point to land because it kind of brings us all the way back to the beginning, which is doing creative work without losing the creative parts.

A couple of the questions were around the screenshot app, Pear Snap, that I was just talking about, which I think I could show you parts of. So I’ve been using the same screenshot workflow, not app but workflow, for a very long time, to the point that the muscle memory and integration into my other workflows is everywhere. And when it breaks, it breaks my flow in really frustrating ways. It’s silly to get tripped up by a screenshot workflow, but it’s true.

I used a very early, like one of the first non-native screenshot apps for Mac. It was an app called Skitch from a guy named Chris Pearson, who I got to know in the early internet days, and that’s why my app is called Pear Snap. It’s a little bit of a tribute to my friend Chris.

All it really did was let you take a screenshot and mark it up, doodle on it. It was not complicated, but it was very powerful. He ended up selling it to Evernote. Sad. Evernote killed it. More sad.

So for the last six years, maybe longer, I’ve been using this Russian knockoff that has worked, but it’s been suffering bit rot as macOS has upgraded. I know the upgrades have broken his app and all these other things, and I feel the pain as an indie developer of how hard it is to keep up with this stuff. But it got to the point where seven times out of ten, when I took a screenshot, it would crash. I just can’t do that.

And so I got this moment, this kind of fit of frustration, and I was like, “Wait a second, why don’t I see if I can get this thing to build it?” So I just described a couple of features that I needed that I use every day, that are not necessarily unique, but it’s the combination of them and the specific workflows and muscle memory I have for them that I just don’t have the time or desire to relearn. And the new option that I’ve never had before was: just build the thing around the steps that I want to take.

So here’s how my app works. For me, it’s Command-Shift-5. I still have all the Mac shortcuts, but Command-Shift-5 opens up and I get my little crosshairs. You’ll notice that it is also a color picker, which is handy. I get my hex value, and if I do Command-C, it puts that on my clipboard. That’s a bug, it’s currently reporting everything as white; that’s something I’ll fix later.

And then in addition to that, I can do my drag. Let’s put it over here. So this is a very simple, minimalist panel. Shows what I just took the screenshot of. With left and right keys I can go back and forth between that and past screenshots, which is pretty handy.

Most importantly, with no additional keystrokes, this image has already been uploaded to my DigitalOcean droplet, and the path to that image is on my clipboard. So I took a screenshot and that screenshot is now instantly a linked public image, which again I use all day, every day. I don’t have to explain why that’s useful; you just have to understand that it’s so useful to me that this is the best version of it I’ve ever had. Now that I’ve been able to add other things like the color picker and whatnot, it’s just the best.
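That upload-and-link flow is simple enough to sketch. Every name below (the base URL, the `scp` destination, the helper functions) is hypothetical rather than Pear Snap’s real code; it just shows the three steps: copy the file to the server, derive the public URL, and put that URL on the macOS clipboard:

```python
import subprocess
from pathlib import Path

# Hypothetical values -- a real setup would point these at your own server.
BASE_URL = "https://img.example.com"
REMOTE = "user@droplet:/var/www/img"

def public_url(screenshot: Path, base_url: str = BASE_URL) -> str:
    """The public link for an uploaded screenshot: base URL plus filename."""
    return f"{base_url}/{screenshot.name}"

def share(screenshot: Path) -> str:
    """Upload the screenshot, then put its public URL on the clipboard."""
    subprocess.run(["scp", str(screenshot), REMOTE], check=True)
    url = public_url(screenshot)
    # pbcopy is the macOS clipboard; on Linux you'd swap in xclip or wl-copy.
    subprocess.run(["pbcopy"], input=url.encode(), check=True)
    return url
```

The win is that all of this happens on the screenshot keystroke itself, so by the time the preview panel appears, the link is already pasteable.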

Obviously I can also grab this and drag it. And you can see that my Claude Code wrapper has image upload built in as well.

It’s not that I wanted a tool to build a tool, and it’s not that I wanted to build a tool for its own sake. I think a bunch of people assume that I built a tool when a bunch of things like it already exist on the internet, and that I’m just building for building’s sake. But that’s not what this is. I do creative work all day long. All of my work is creative work; my business work is creative work. I think of it all as creative work. And often the tools that I’m using get in the way of my creative work instead of helping me do it better.

So Pear Snap, the JFDI system, all of it, the way I think of it is it has one job, or sometimes several jobs, but all versions of the same job. My goal is to make it so that when I sit down to do my work, the creative work that I love to do, that brings value to other people and sometimes makes me money, the things that would get in the way of me doing my creative work I can hand to the robot. When I sit down to do work and I feel some resistance, I go: what’s in the way, and can I give that to the robot so I can get to the part of the work that is actually good for me to do, is necessary for me to do, is valuable for me to do? That’s really at the heart of it.

So that’s why we’re choosing those kinds of things. Now, what made me choose Claude for Mac app development? I have chosen Claude for everything, and at this point it’s the tooling that I know. It was amazing to me that I could use it for Mac development, I thought it would be way harder. But I was blown away by basically saying, “Hey, I know that I can do a Mac app in Xcode. Can you help with that or is there a better way to do it?” And it said, “Hey, if we write it in Swift, Xcode needs to be installed, but we don’t even need to open it.” And that was another light bulb moment. I was like, “Oh, I can build Mac apps the same way I build web apps, where I’m comfortable.”

And I will say when it comes to debugging stuff, I have so much more to learn. But I know I can ask it for help. It’s not that I chose Claude Code for Mac development, it’s that I’m choosing Claude Code for everything right now. And I’m waiting for it to find a thing it’s not good at, and I haven’t run into one since like Thanksgiving, which is crazy.

[Chat - Dave]: As my friend Dave here said: the hard part is tweaking SwiftUI manually when you barely know what you’re doing. I feel that pain, but I’m using it as a learning tool.

[Chat - Damina]: Yes, I am still going. I’m glad you got a good walk in. Was it Damina? I’m wrapping up in just a couple of minutes.

And what is my kind of final point? For me, yeah, the bespoke tools like this, they fit like a glove because I got to design the glove. I’m going to be honest, I didn’t write that analogy. It did. But it’s spot on. That’s exactly it: it fits like a glove because I got to design the glove.

I get the sense that some people don’t want to design, and I feel like that is a loss. That’s a bummer. Because design is not about the visuals. Design is about taking a moment to reflect on what the problem is and what the solution could be, using these tools collaboratively to figure out those solutions, and then actually being able to build them. I don’t know, it feels like a magic trick. It’s the coolest.

All right. I’m going to wrap up here. This has been two hours on stream and my throat’s getting a little sore. I’ve got a phone call to make. But thanks for hanging out, and I’m going to go through the chat one more time before we go.

[Chat]: “What you build is fantastic. Addressing the problem of too many information silos.” Yep. “Building a version. Long way to go.” I mean, look, this didn’t start the way it looks now. This started as that thing that I just showed you probably 15 minutes ago, where I can drop a link and it autofiles it. That’s what I built. That’s the first thing I built. And once I watched that work, I was like, “Oh, well, what else can we do?” And so much else grew from there. So start small and build, build, build.

The relationship manager assessment, yeah, I’m glad we’re on the same page there. “Start scraping videos like this, super insightful earlier.” Yeah, I mean, I do the same thing. When I see videos or something interesting, I’ll grab it, grab the transcript, and I will learn from it. So I hope you learned something from here, and if you build something with it, tag me on Twitter or wherever. I’d love to know what you took, what inspired you, and you know, good open source is when you take something and you run with it and you try to give back. I’m trying to do that and I would encourage you to do the same.

Will I leave the video up? Yes, I’m going to leave the video up.

[Chat]: “What’s your workflow in one sentence? And what are you building next related to this, in another sentence?” My workflow in one sentence, ooh, that’s good. My workflow in one sentence would be: find the work that stands in the way of doing my work, and give that work to the robot so that I can do more of my work, or just work less overall.

And then what am I building next related to this? That’s a great question. I have so many things. Now that I’ve built this little Mac screenshotting app and realized how easy that could be, I think I want to build something that talks to the web app version of JFDI and the assistant, so that I can do some things, not everything, but some things with the system without even opening a web browser. Right now I have to open a web browser. On my phone, this web app is also set up as a PWA, so it’s saved to my home screen, and it’s fast and powerful from my phone, and it’s mobile optimized. But sometimes I don’t even want to open that app.

Maybe I’ll do a Raycast extension instead of building something from scratch, I’m not sure. But something like a Mac app, and then maybe a phone app to your point, might be the next layer on top of it in terms of new features.

But the truth is, my goal right now is to use this thing, not to build this thing. My goal is to spend more time using it than building it. Building it is fun, but using it is the point.

So that’s kind of where my head is there. Yeah, that’s it. Thanks for the great questions. Thanks for hanging out. This has been fun. I will definitely do more of this. Hit subscribe on YouTube if you want to get notified about the next one, and follow me over on Twitter as well. Hope to see all of y’all in the next one. Have a great rest of your weekend, y’all.
