Almost Timely News: 🗞️ Making AI More Efficient (2026-04-05)
Stop reinventing the wheel
The Big Plug
👉 I’ve got a new course! GEO 101 for Marketers.
👉 Just updated! The Unofficial LinkedIn Algorithm Guide, March 2026, now with new information straight from LinkedIn!
Content Authenticity Statement
100% of this week’s newsletter content was originated by me, the human. You’ll see me working with Claude Code in the video version. Learn why this kind of disclosure is a good idea and might be required for anyone doing business in any capacity with the EU in the near future.
Watch This Newsletter On YouTube 📺
Click here for the video 📺 version of this newsletter on YouTube »
Click here for an MP3 audio 🎧 only version »
What’s On My Mind: Making AI More Efficient
In this week’s issue, I want to dig deeper, and get more technical, on the Trust Insights livestream we did this week on managing AI usage limits. In each week’s livestream, we focus on the practical, business-oriented “so what?” (hence the name of the show), but there’s a bunch of stuff that gets left on the mental cutting room floor because either there isn’t time, or it’s a rathole that would detract from the main point.
This week’s newsletter is one of those ratholes.
The TLDR of the livestream is that with proper planning, governance, utilities, and model swapping, you can use AI efficiently and effectively. Go watch or read the livestream here.
Now, let’s go down that rathole.
Part 1: AI Reinvents the Wheel
One of the biggest hidden efficiency costs in AI, especially agentic AI, is thinking. We have these great models now, reasoning models that do rough drafts behind the scenes and have conversations with themselves to arrive at far better conclusions. If you remember from the early days of generative AI when we’d prompt a model to “think out loud step by step” or “show your work”, what we were doing back then was creating reasoning manually.
Now, for almost every model and tool on the market, that’s automatic. It happens behind the scenes, with that little “Thinking…” label that pops up in various interfaces. That’s fine for casual use, but when you’re using agentic systems like Claude Cowork, Claude Code, etc. - anything that has usage limits - every scrap of thinking eats away at your usage quota.
Even more problematic is that when AI starts doing deep problem solving, like writing code, it understands from our instructions what it is we want it to do, and then it has to figure out how to do it. This results in it reinventing the wheel many, many times over. For example, in coding, if you want it to process a regular expression (regex), you might have that in the instructions, and then your favorite agentic system will write regex code.
Except… there’s absolutely no need to do that. There are thousands of different regex libraries and packages that exist, and instead of burning tens of thousands of tokens writing it from scratch, it could just say “Oh, I’ll use this pre-existing solution that solves the problem perfectly” in like, 10 tokens.
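To make that concrete, here’s a minimal sketch in Python, assuming the task is pulling email-like strings out of a block of text. The function name, the pattern, and the sample text are all hypothetical illustrations. The point is that Python’s built-in re module already does the heavy lifting, so there’s no matching engine to write.

```python
import re

# Simplified, illustrative pattern (an assumption for this sketch);
# it is not a full RFC 5322 email validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text: str) -> list[str]:
    """Return every email-like substring, in order of appearance."""
    return EMAIL_PATTERN.findall(text)


sample = "Contact sales@example.com or support@example.org for help."
print(find_emails(sample))  # ['sales@example.com', 'support@example.org']
```

A few lines that lean on an existing, proven library, versus an agent burning tokens to hand-roll its own matcher.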
And this isn’t just coding. Every time you write a web page, draft a marketing campaign, do corporate strategy, anything where there’s a body of proven, existing knowledge, AI has a tendency to recreate it from scratch. This is an enormous waste.
Here’s why this matters: one way or another, every token you consume, you pay for. If you’re on a fixed-fee plan like Claude Max or Gemini Ultra, you have fixed limits for how many tokens you can consume or how many requests you can make in a given period of time. Claude, for example, meters usage over a five-hour window and a weekly window. If you use more than your allotted amount, you either have to pay more or you can’t use the service until the next interval.
If you’re using your tool via its API, you are paying for every single token that goes through. And while some models charge very small amounts of money, like 50 cents per million tokens, when you’re doing things like agentic AI, you can consume several hundred million tokens in an hour.
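Here’s the back-of-the-envelope math as a small Python sketch. The 50 cents per million tokens comes from the paragraph above; the 300 million tokens per hour is just an assumed stand-in for “several hundred million”, and the function name is hypothetical.

```python
def api_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a token count at a flat per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Assumed figures: USD 0.50 per million tokens, and 300 million tokens
# consumed in one hour (an illustrative pick, not a measurement).
hourly = api_cost_usd(300_000_000, 0.50)
print(f"${hourly:,.2f} per hour")  # $150.00 per hour
```

Under those assumptions, a model that sounds nearly free at 50 cents per million tokens costs USD 150 for one hour of agentic work.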
The less AI we use, the less we pay for it, one way or the other.
One of the things I said to Katie on the livestream is that our goal, when using agentic AI, is paradoxically to use AI as little as possible. The more we can bring existing, pre-baked stuff into AI, the better results we’ll get because we won’t be asking AI to reinvent the wheel constantly, and we won’t be paying to repeat work.
This is something I’ve been teaching for years, the concept of knowledge blocks, pre-made chunks of data that you give to AI or make available so that it doesn’t have to repeat work. Last year the hype bros decided to brand this as “context engineering” but it’s fundamentally all about managing the knowledge we give AI.
Today, context engineering is about not only the information we give AI, but also the utilities we give it. Command line interfaces, or CLIs, are text-based apps that you install locally on the machine where you’re running your agentic system. They look like they’re right out of 1983, but those text-based apps are ideal for AI tools to use because they don’t have to click a mouse on anything, they can just type. And popular command line tools have been around for decades.
Some of the ones that I use these days? Google Workspace has a command line tool that allows a tool like Claude Code to take control of any part of your Google Workspace. Gmail, your Google Calendar. If it’s in Google, it can control it, which means that I can have it pull my agenda, plan things out. It’s fantastic.
Another one, the WordPress command line tool, allows a tool like Claude Code to programmatically manage your WordPress blog. It can write posts, it can rearrange things, it can turn on and off plugins, it can even validate new plugins.
A third one that I really love: the unofficial NotebookLM command line tool, which allows a tool like Claude Code or Claude Cowork to access NotebookLM from the command line, create new notebooks, upload sources, and create those audio podcasts.
There’s no limit to the number of command line tools that are out there that you can give to a system like Claude Code or OpenWork or OpenCode or Qwencode, and instead of them having to reinvent the wheel constantly and burn that token budget, they can just pick up the tool, use it, and get great results by not using AI.
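As a rough sketch of what “picking up the tool” looks like, here’s a small Python wrapper around the WordPress command line tool (WP-CLI). It assumes WP-CLI is installed as wp and that the script runs from inside a WordPress install directory. The function names are hypothetical; the only real command used is wp plugin list with JSON output.

```python
import json
import shutil
import subprocess
from typing import Optional


def wp_plugin_list_command() -> list[str]:
    """Build the argument list for WP-CLI's plugin listing as JSON."""
    return ["wp", "plugin", "list", "--format=json"]


def list_plugins() -> Optional[list[dict]]:
    """Run WP-CLI and parse its JSON output.

    Returns None when the wp executable is not on the PATH.
    Assumes the current directory is a WordPress install.
    """
    if shutil.which("wp") is None:
        return None
    result = subprocess.run(
        wp_plugin_list_command(), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


# Show the exact command an agent would type instead of writing new code.
print(" ".join(wp_plugin_list_command()))
```

The agent spends a handful of tokens typing one command, and a decades-old, deterministic tool does the actual work.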
So how else do we use AI as little as possible while still getting fantastic results? Deep research, first principles, templates, and pointers.
Part 2: Deep Research
Humans have this concept called cognitive load; our brains can only hold so much or process so much at a time. We overload, and then we have to prioritize. You’ve experienced this kind of tunnel thinking or cognitive collapse when you just get overloaded and then you can’t do anything. Machines are no different. The more we ask them to do, the harder a time they have juggling. Unlike humans, we can measure when AI’s head is getting full, especially in tools like agentic coding tools:
⏺ Context Usage — Sonnet 4.6 (claude-sonnet-4-6)
162.7k/200k tokens (81%)
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁ ⛁
⛁ ⛁ ⛁ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛝
⛝ ⛝ ⛝ ⛝ ⛝ ⛝ ⛝ ⛝ ⛝ ⛝
When this memory window gets full, either the model starts to forget things, or in today’s agents, it has to compress details down to free up memory - and that often has serious accuracy consequences.
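The numbers in that readout are easy to check yourself. Here’s a minimal Python sketch using the figures shown above (162.7k of 200k tokens). The 80% warning threshold and the function names are my own assumptions for illustration, not anything built into a particular tool.

```python
def context_usage(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently in use."""
    return used_tokens / window_tokens


def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.8) -> bool:
    """True once usage reaches the threshold (0.8 is an assumed default)."""
    return context_usage(used_tokens, window_tokens) >= threshold


usage = context_usage(162_700, 200_000)
print(f"{usage:.0%}")                    # 81%
print(should_compact(162_700, 200_000))  # True
```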
To avoid overloading an AI’s context window, particularly in agentic AI, the more we can provide that’s prebaked, that already exists, the less we have to worry about AI burning up tokens to reinvent the wheel.
One of the fastest and easiest ways to do this is to use deep research tools to do research and planning in advance before we begin a project.
Let’s say I wanted to make a new WordPress plugin. I could just start talking to AI and vibe coding in the truest sense, where I don’t have a plan, I just have a conversation. And I will get results out of agentic tools like Claude Code, unquestionably. However, those results will use far more tokens than if I had sat down and run a research project first to identify the best practices and the major features the plugin should have.
As an example of how not to do it, I’m in the middle of rebooting the Trust Insights sales playbook. And honest confession, I have burned hundreds of millions of tokens to generate results so far that have not been great because I did not sit down and plan out the research first for the overall structure and grab the metadata needed to do it right.
In the next version I plan on doing, I’m going to follow the process strictly. Do the research first, frame out the architecture, grab all the relevant data, establish clear priorities, and only then hand it off to an agentic AI tool to actually build the sales playbook.
Part 3: First Principles
Deep research is unquestionably one of the cornerstones of any new project if you want it to go well. However, there are some other pieces that are also necessary. One of those is first principles. First principles is exactly what it sounds like: the first principles that AI agents must keep in mind as the most important priorities.
Here’s an example of what some of my first principles look like in coding:
Fix Over Create — Modify existing code; create only when radon cc ≥ C or structure mandates. No grade C or lower. Never average.
Reusable Testing — No one-off scripts; single quality utility in src/scripts/; tests in tests/
Never Defer — Clean code first priority; no “fix later”; no “out of scope”; no “unrelated”; hard work now
Idempotent Mutation — Execution multiplicity → identical state. Verify existing state pre-mutation
Test Coverage — 100% test coverage, 100% passing unit and E2E. < 100% = FAILURE
Never Reinvent the Wheel — Prefer existing proven FOSS packages/software instead of writing new custom code
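To show what one of these looks like in practice, here’s a minimal Python sketch of the Idempotent Mutation principle: check the existing state first, and only change something if it needs changing, so running it once or ten times leaves the same result. The function name, file name, and setting are hypothetical examples, not taken from my actual projects.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Append line to the file only if it is not already present.

    Returns True if the file was changed, False if it was already
    in the desired state.
    """
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():  # verify existing state before mutating
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "settings.txt"
    print(ensure_line(target, "feature_flag=on"))  # True: file was changed
    print(ensure_line(target, "feature_flag=on"))  # False: nothing to do
```

When an agent reruns a step after a failed session, a function written this way doesn’t leave duplicate entries behind.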
These principles, and this is about half of them, help the machine understand what it always should be doing. These are things that I embed in my Claude Code master CLAUDE.md file as well as the project CLAUDE.md file. Because of the way Claude Code works, it rereads this file every time it starts a new session or compresses its internal session, which means it’s rereading the rules frequently. This cuts down on me having to tell it what the rules are and then burning more tokens having it confirm what the rules are.
The last principle on that list is probably the most important one: to never reinvent the wheel. Instead of Claude rewriting and reinventing pieces of code over and over again, it knows that it’s supposed to go and look for pre-existing proven packages first.
If you’re doing something like coding, this is stuff that you should uncover in your deep research. If you’re doing stuff like market research or strategy, this is stuff that should also be done in the deep research. And ideally, you have URLs of good resources, or you can download the resources that you want to use, such as the 5P Framework by Trust Insights, for example.
Part 4: Reducing Randomness
Part 4 is all about reducing randomness by providing as much templated stuff as possible. Remember what I said in part one about thinking: the more a model thinks, the more tokens it burns. Another aspect of thinking is that all generative AI is probabilistic, meaning it is all probability-based, so it is very difficult to get consistent answers out of it, especially for large tasks that you would hand off to agentic AI.
For example, if you ask AI to make a slide deck for you, today’s tools can do that very capably. Claude Cowork and Google Gemini can generate pretty decent slides. But the old cliche that a picture is worth ten thousand words has never been more true. Even with a detailed style guide, you will get some degree of randomness in the slide designs that it outputs, which means that they will not be exactly consistent.
If, on the other hand, you provide a slide template and a reference image to go with it, generative AI will have to think much less. Instead of it trying to decide what design to use, it will simply acknowledge the design you provided and use that instead. This cuts down on token usage, but it also cuts down on dissatisfying results.
Think about the things that we do want AI to think about. We want it to think about flow and narrative structure and description or the contents of a coding file. Think about the things that we don’t want to have AI think about, like brand standards or file system structure. The more we can define for it, the less it will think about those things.
Going back to part two when we were talking about the context window, the more AI has to think about something unrelated to the core task that we’re asking it to do, the worse it’s going to perform. If it has to think a whole bunch about design principles and how to lay out a slide, there’s that much less room in the context window for it to think about the contents of the slide and making it persuasive.
And all of that language about design principles is in the chat, which means it can screw with the content of what you want on those slides, and that’s a bad situation to be in. Give it the templates, give it the pre-baked information, and then that information is much less prominent in the chat, which means it has much less influence over the rest of the work that the agent is doing.
For any given project, the more you can have predefined, the more success you’ll get out of AI, and the better results you’ll get on the first try because it’s not having to think about all these different aspects of the task. It can focus only on the parts of the task that you want it to focus on.
This becomes critically important when it is not just you working on a task or set of tasks. If you’re working in a team or a company, chances are there are plenty of tasks that have a required standard everyone is supposed to meet. There are brand guidelines: how big the logo should be, what font to use. All of that needs to be predefined. When we do that, we reduce randomness. When we reduce randomness, we reduce token usage, and we focus our token usage on the parts of the task that AI is best at.
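Here’s a small sketch of that split in Python, using the standard library’s string.Template. The brand values, field names, and slide layout are all made up for illustration. The idea is that the brand standards are fixed in code, and the only things left for AI to supply are the title and the body.

```python
from string import Template

# Hypothetical brand standards; in practice these come from your style guide.
BRAND = {"font": "Inter", "logo_width_px": 120, "accent_color": "#0055AA"}

# The layout is fixed. Only $title and $body vary from slide to slide.
SLIDE_TEMPLATE = Template(
    "TITLE: $title\n"
    "BODY: $body\n"
    "FONT: $font | LOGO WIDTH: ${logo_width_px}px | ACCENT: $accent_color"
)


def render_slide(title: str, body: str) -> str:
    """Fill the fixed template with AI-supplied content and brand values."""
    return SLIDE_TEMPLATE.substitute(title=title, body=body, **BRAND)


print(render_slide("Q2 Plan", "Three priorities for the quarter"))
```

Same inputs, same output, every time. The model never has to think about fonts or logo sizes, so none of that ends up in the context window.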
An example of how I do this is I maintain a master Claude Code directory on my laptop. I have a directory of pre-baked goods, from reference documents to
Part 5: Wrapping Up
No matter what AI system you use, agentic or not, fixed fee or pay by the token, everyone benefits from using AI more efficiently, and using it less - especially for tasks where you don’t want randomness. I’ve said for a while now that the more data you bring to the party, the better the party you have with AI.
When it comes to spending less money on AI, using it more sustainably, or getting to results faster, the more you can take away from AI and turn into deterministic tasks, the better. Those tasks are still code. They’re still scripts, they’re still programmatic. So it’s not you taking back the work that you had given to AI. It’s taking away the structured parts that AI doesn’t need to do so that it can focus on the parts that it is best at and still take things off your plate.
Here’s The Unsubscribe
It took me a while to find a convenient way to link it up, but here’s how to get to the unsubscribe.

If you don’t see anything, here’s the text link to copy and paste:
https://almosttimely.substack.com/action/disable_email
Share With a Friend or Colleague
Please share this newsletter with two other people.
Send this URL to your friends/colleagues:
https://www.christopherspenn.com/newsletter
For enrolled subscribers on Substack, there are referral rewards if you refer 100, 200, or 300 other readers. Visit the Leaderboard here.
ICYMI: In Case You Missed It
Here’s content from the last week in case things fell through the cracks:
Stop Using Yesterday’s AI Tricks: A Bruce Lee Guide to Surviving the Intelligence Revolution
Idea Velocity: Why Capturing Your Next Big Idea in Seconds Could Be Your Most Valuable Skill
Almost Timely News: 🗞️ Terraforming the AI Use Case Desert (2026-03-29)
INBOX INSIGHTS: How To Manage Overwhelming Productivity, AI Digital Clone Part 2 (2026-04-01)
In-Ear Insights: The 3 Phases of GEO And AI Search Visibility
Skill Up With Classes
These are just a few of the classes I have available over at the Trust Insights website that you can take.
Premium
Free
👉 New! From Text to Video in Seconds, a session on AI video generation!
Never Think Alone: How AI Has Changed Marketing Forever (AMA 2025)
Powering Up Your LinkedIn Profile (For Job Hunters) 2023 Edition
Building the Data-Driven, AI-Powered Customer Journey for Retail and Ecommerce, 2024 Edition
The Marketing Singularity: How Generative AI Means the End of Marketing As We Knew It
Advertisement: New GEO 101 Course
When I talk to folks like you, being recommended by AI is one of your top marketing concerns in 2026.
We’ve taken everything we’ve learned from OpenAI’s documentation, Google’s technical papers, patents, sample code, plus our years of experience in generative AI to assemble a high-impact 90-minute course on GEO 101 for Marketers.
In this course, you’ll learn:
The three distinct phases of GEO and how they work
How to optimize for each phase (they’re different!)
How to measure your GEO efforts in a meaningful and valid way
This course is meant to be used. In addition to the course itself, you’ll also receive:
Your 90 day GEO action plan
How to set up Google Analytics for measuring GEO traffic
How to join Google Search Console data with GEO intent data
How to use our free AIView tool to improve your content and site for one of the three phases of GEO
A certificate of completion from TrustInsights.ai
And best of all, this is our most affordable course yet. GEO 101 for Marketers is USD 99 and is available today.
👉 Enroll here in GEO 101 for Marketers!
Advertisement: My AI Book!
In Almost Timeless, generative AI expert Christopher Penn provides the definitive playbook. Drawing on 18 months of in-the-trenches work and insights from thousands of real-world questions, Penn distills the noise into 48 foundational principles-durable mental models that give you a more permanent, strategic understanding of this transformative technology.
In this book, you will learn to:
Master the Machine: Finally understand why AI acts like a “brilliant but forgetful intern” and turn its quirks into your greatest strength.
Deploy the Playbook: Move from theory to practice with frameworks for driving real, measurable business value with AI.
Secure Your Human Advantage: Discover why your creativity, judgment, and ethics are more valuable than ever-and how to leverage them to win.
Stop feeling overwhelmed. Start leading with confidence. By the time you finish Almost Timeless, you won’t just know what to do; you will understand why you are doing it. And in an age of constant change, that understanding is the only real competitive advantage.
👉 Order your copy of Almost Timeless: 48 Foundation Principles of Generative AI today!
How to Stay in Touch
Let’s make sure we’re connected in the places it suits you best. Here’s where you can find different content:
My blog - daily videos, blog posts, and podcast episodes
My YouTube channel - daily videos, conference talks, and all things video
My company, Trust Insights - marketing analytics help
My podcast, Marketing over Coffee - weekly episodes of what’s worth noting in marketing
My second podcast, In-Ear Insights - the Trust Insights weekly podcast focused on data and analytics
On Bluesky - random personal stuff and chaos
On LinkedIn - daily videos and news
On Instagram - personal photos and travels
My free Slack discussion forum, Analytics for Marketers - open conversations about marketing and analytics
Advertisement: Ukraine 🇺🇦 Humanitarian Fund
The war to free Ukraine continues. If you’d like to support humanitarian efforts in Ukraine, the Ukrainian government has set up a special portal, United24, to help make contributing easy. The effort to free Ukraine from Russia’s illegal invasion needs your ongoing support.
👉 Donate today to the Ukraine Humanitarian Relief Fund »
Events I’ll Be At
Here are the public events where I’m speaking and attending. Say hi if you’re at an event also:
SSI, Charlotte, April 2026
The Trust Insights Generative AI Workshop, sometime this spring!
SMPS AI Conference, Austin, November 2026
MarketingProfs B2B Forum, Boston, November 2026
There are also private events that aren’t open to the public.
If you’re an event organizer, let me help your event shine. Visit my speaking page for more details.
Can’t be at an event? Stop by my private Slack group instead, Analytics for Marketers.
Required Disclosures
Events with links have purchased sponsorships in this newsletter and as a result, I receive direct financial compensation for promoting them.
Advertisements in this newsletter have paid to be promoted, and as a result, I receive direct financial compensation for promoting them.
My company, Trust Insights, maintains business partnerships with companies including, but not limited to, IBM, Cisco Systems, Amazon, Talkwalker, MarketingProfs, MarketMuse, Agorapulse, Hubspot, Informa, Demandbase, The Marketing AI Institute, and others. While links shared from partners are not explicit endorsements, nor do they directly financially benefit Trust Insights, a commercial relationship exists for which Trust Insights may receive indirect financial benefit, and thus I may receive indirect financial benefit from them as well.
Thank You
Thanks for subscribing and reading this far. I appreciate it. As always, thank you for your support, your attention, and your kindness.
See you next week,
Christopher S. Penn




The “use AI as little as possible” paradox is the single most important insight in the current AI landscape — and almost nobody is saying it. This piece should be required reading for anyone burning through token budgets wondering why their results are inconsistent.
After a decade in energy trading and risk management, the parallel that jumped out immediately is how we approach risk modelling. The best trading desks don’t run every calculation from scratch each morning. They maintain pre-built libraries of validated pricing models, historical volatility frameworks, and standardised scenario templates. The analyst’s job isn’t to rebuild the infrastructure — it’s to apply judgment to the output. You’ve described exactly the same architecture for AI: pre-baked knowledge blocks, first principles, templates, and deterministic scaffolding that frees the probabilistic engine to focus only on the parts where it genuinely adds value.
Your context window analogy maps perfectly to something I’ve experienced building AI-assisted energy research workflows. When I loaded everything — market data, geopolitical context, refinery specifications, shipping routes — into a single agent session, the quality degraded noticeably as the window filled. The moment I restructured into modular knowledge layers with pre-indexed reference material that the agent could pull selectively, the output quality jumped dramatically while token consumption dropped by roughly 60%. Same insight you’ve described, different domain.
The CLI tooling point is where this gets really powerful and most people stop too early. In energy trading, we’ve used command line interfaces for decades — automated data feeds, position management scripts, risk calculation engines. The realisation that agentic AI can leverage these existing tools rather than rebuilding their functionality from scratch is genuinely transformative. Every pre-existing CLI tool you connect is thousands of tokens you’ll never spend again.
The first principles framework is the part I’d emphasise most for anyone building domain-specific AI workflows. In energy risk management, we call these “trading mandates” — the non-negotiable rules that every decision must comply with before anything else gets considered. Embedding them so the agent re-reads them on every session restart is exactly the right architecture. Without that, you spend half your token budget re-establishing constraints that should be permanent fixtures.
One dimension I’d add: version-controlling your knowledge blocks and first principles the same way you’d version-control code. As your domain knowledge evolves — new market conditions, new regulatory frameworks, new best practices — your pre-baked materials need to evolve with them. The teams I’ve seen get the most value from agentic AI are the ones treating their knowledge layer as a living asset with its own update cycle, not a static document written once and forgotten.
Exceptional framework. The people spending the least on AI right now are the ones who invested the most in the architecture around it. That’s a lesson the entire industry needs to absorb.
Spot on, Christopher. 2026 is definitely the year of the 'Agentic Workflow.' The focus shouldn't be on how to prompt better, but on how to architect better systems. I’ve been using Zaturn.ai to streamline our operations by treating the platform as a coordinated department rather than a series of one-off tasks.