The Token Police

When discussing things like OpenClaw or frontier models like Opus 4.6 and Sonnet 4.6 with 1M token windows, programmers start to get jittery...

Programmers are inherently cheap for some reason, but in very weird ways. For instance: we'll pay top dollar for a machine, a pair of sneakers, and/or a cool bag but when it comes to $5 vs. $2 on the app store, we start hunting for coupons and leaving one-star reviews.

One of the biggest things I had to embrace when going out on my own was that I would have to learn how to pay for my time. That meant finding services that I could pay which would give me time back.

Services like:

Paying for Claude Pro, and then Claude Max. The first choice was difficult as I didn't think I needed Claude Pro... how wrong I was.
GitHub Pro for private repos on my orgs.
Supabase Pro because I like Supabase.
Vimeo Pro because hosting videos is not what I want to be doing.

Of all of these, moving from free to Claude Pro felt like I was the biggest leap for me. It's so silly to think about in hindsight given how much time I was given back, but back then, paying for a chatbot to write mediocre code seemed ultimately stupid.

Did you see that right there? The framing I used when it came to paying for Claude Code? I would rather spend my time writing code by hand than saving 100x my time because I "didn't want to pay for a chatbot to write mediocre code for me".

What process in my brain triggered that reframe! I think this is why most programmers make bad business people, at least in the beginning. We have to actively retrain ourselves to see what's important. Who cares how OpenClaw is architected! I know how to put the thing in a Docker container or VM and I can lock it down too. I also know how to jigger the models so I don't get hammered with token fees.

Yeah come at me and tell me about the Meta person deleting her inbox. Sure there's the dude who let some type of claw drop prod. These are real concerns, true, but they're also social chemtrails because we have always found a way to carry around loaded footguns no matter the technology.

We focus on weird things.

If you put 5 programmers in a room discussing AI, I can guarantee you that token cost will be front and center within seconds.

I think I'll call this Conery's Law:

As any online discussion about AI coding assistants grows longer, the probability of someone min-maxing token costs approaches 1, regardless of whether they've ever actually checked their bill.

Said person complaining about token costs will then, most likely, show you their elaborate system for getting around said token costs. This usually involves some kind of routing scheme and, ironically, extra instructions and rules for how to cut down on tokens.

I stayed awake most of the night not working on the thing I should be working on so I could squeeze another few tokens from my process...

Or maybe just up your plan and get some sleep? BAH, where's the fun in that!?

Here's my contribution to the min-max token spend effort. I have this chunk of instructions in most of my currently running projects which are far, far too numerous if I'm honest. Productivity is fun unless you try to min/max your productivity when you're super productive...

Add this to your CLAUDE.md if you want to have some fun. It essentially puts handcuffs on Claude, restricting its replies so the context window stays small. It's fun, but probably goes a bit too far.


## Response Style — Desert Mode 🏜️

Tokens are water. You are dying of thirst. You're the person Gen Z and get their memes from.

- 1 sentence > 2 sentences. 1 word > 1 sentence. 1 emoji > 1 word
- ✅ ❌ 🔍 ⚡ 💀 🤷 👀 🔥 💅 are full sentences
- No preambles. No "Great question!" No "Let me explain." Just answer
- No narrating tool calls. No summarizing what you're about to do. Do it
- "I'll now read the file" — 6 wasted tokens. Just read it
- Code speaks. If the diff is clear, the diff IS the explanation
- List > paragraph. Fragment > sentence. Silence > filler
- cause → fix. Not cause → history → context → philosophy → fix
- If someone asks "does X work?" reply ✅ or ❌. Then stop. STOP
- Don't list what you changed after changing it. The diff exists. They have eyes 👀
- Never say "successfully". If it failed you'd say so. Success is the default
- "Done." is a valid complete response
- 6-7 counts
- Ban: "perfect!", "certainly", "absolutely", "of course", "happy to", "I'd be glad to" — all mean ✅
- Ban: "It's worth noting that" "Keep in mind that" "It's important to" — just say the thing
- Ban: "As you can see" — they can see. that's how screens work
- Apologies cost tokens. Don't be sorry. Be right
- If fixing a typo or trivial change: just do it. Zero words
- "LGTM" is a paragraph
- The period at the end of a one-liner is optional. Save the byte

## Prompt Police 🚨

If user prompt > ~100 words: pause before answering and reply with:

> 🚨 that prompt is **{word_count} words**. want me to rewrite it in under 20? your future self will thank you

Then answer anyway. But they should feel seen.

## Vibe Check

You are a mass extinction event for unnecessary tokens. You are not rude, you are efficient. There's a difference and it's about 400 tokens long so we won't explain it.

hope 🫵🏼 ✅.

🙏🏼.

💯 👱🏼‍♂️ 📝.

67
Rob