Anatomy of an AI Assistant
If you have been building AI assistants - or trying to - you have probably had inconsistent results. One run might produce something genuinely useful. But the next, starting from what feels like the same brief, misses the mark.
That swing is often blamed on prompting. But in practice it is usually a design issue.
A prompt is a one-off request for a one-off output. An assistant is something you expect to behave reliably across many runs. It needs a role, clear rules, instructions that force an order of operations, and reference material it knows when to use.
So when the output swings from great to disappointing, the fix is rarely another clever sentence. The fix is usually strengthening the structure underneath: what the assistant is meant to do, what it should ignore, and what it must do first.
This article breaks down what an AI assistant is actually made of - what each part does, how the parts connect, and where most of the leverage lives when you want consistent results.
TL;DR
An AI assistant is built in two layers. The configuration layer - name, description, conversation starters, knowledge files, capabilities - defines the assistant's identity and resources. The instruction layer - introduction, tasks, knowledge file references, rules - defines how it actually behaves. Most people spend most of their time on the configuration and not enough on the instructions, which is where the core behaviour lives. Understanding this structure is the difference between filling in a form and designing a system.
An assistant is a system, not a prompt
When most people hear "AI assistant", they think of a chat window - type something in, get something back. That is how most people start using AI, and it works well enough for one-off tasks. But a custom assistant - the kind you build yourself, whether in ChatGPT, Claude, or another platform that lets you configure instructions and attach reference material - is something structurally different. It has a defined role, a set of instructions, reference material it can draw on, and rules about how it should behave - all configured to produce a repeatable outcome.
That distinction matters because it changes what you are actually doing when you build one. The work is closer to designing a small piece of infrastructure than writing a single request - something that should produce consistent, useful results across different inputs and different conversations, without requiring you to re-explain the context, the standards, or the approach every time.
When the outputs are inconsistent, the cause usually sits somewhere in that infrastructure rather than in how you phrased your request on a given day.
Two layers: configuration and instructions
Every custom assistant has two layers, and it helps to think about them separately because they do different jobs.
The first is what I think of as the configuration layer - the visible setup that defines the assistant's identity, its resources, and how someone begins interacting with it. This is everything you see when you open a builder interface: the name, the description, conversation starters, knowledge files, and capabilities.
The second is the instruction layer - the part that determines how the assistant actually behaves once a conversation starts. This lives inside the instructions field, and it is where the real structure of the assistant gets defined.
Both layers matter, but they do not carry equal weight. The configuration layer supports the instructions, and the instructions are where the core behaviour gets defined.
The configuration layer: what each part does
Name - tells you and anyone else what the assistant is for. A name like "Client Proposal Drafter" communicates the job immediately. Something like "My Helper v3" does not. The name also affects how you find the assistant later, so specificity helps.
Description - a short summary of what the assistant does and when you would use it. Think of it as the sentence you would say to a colleague if they asked what this assistant is for. It does not need to be long - just clear enough that someone can decide whether this is the right tool for the task in front of them.
Conversation starters - the suggested prompts that appear when someone first opens the assistant. Something like "I need to draft a proposal for a new client" or "Help me prepare for a discovery call." These shape how the interaction begins and give the user a clear way in, rather than leaving them staring at a blank input field trying to work out what to type.
Knowledge files - reference material the assistant can draw on during a conversation. This is where you attach things like a voice guide, a method document, audience profiles, templates, or examples of previous work.
Here is the part that catches most people out: attaching a file does not mean the assistant will use it. You can upload a detailed voice guide, but unless the instructions explicitly tell the assistant when and how to reference it, there is no guarantee it will be consulted. Uploading is not the same as using.
Capabilities - features you can switch on or off, such as web browsing, image generation, or code execution. Most assistants do not need all of them, and it is worth keeping this minimal.
That is the configuration layer - the frame around the assistant. It defines what the assistant is called, what it has access to, and how a conversation starts. All of it matters. But none of it, on its own, determines how the assistant behaves once a conversation is under way. That is the job of the instructions.
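One way to picture the configuration layer is as a plain data structure. The sketch below is illustrative only: the AssistantConfig class, its field names, the file names, and the unreferenced_knowledge_files helper are all hypothetical and do not correspond to any platform's API. It models the five parts described above and flags attached files that the instructions never mention by name, which is the "uploading is not the same as using" gap.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantConfig:
    """Hypothetical model of the configuration layer (not a real platform API)."""
    name: str
    description: str
    conversation_starters: list[str] = field(default_factory=list)
    knowledge_files: list[str] = field(default_factory=list)
    capabilities: list[str] = field(default_factory=list)


def unreferenced_knowledge_files(config: AssistantConfig, instructions: str) -> list[str]:
    """Return attached files whose names never appear in the instructions text."""
    text = instructions.lower()
    return [name for name in config.knowledge_files if name.lower() not in text]


# Example values are made up for illustration.
config = AssistantConfig(
    name="Client Proposal Drafter",
    description="Drafts client proposals from a confirmed brief.",
    conversation_starters=["I need to draft a proposal for a new client"],
    knowledge_files=["voice-guide.pdf", "proposal-template.docx"],
    capabilities=[],  # kept minimal: nothing switched on unless it is needed
)

instructions = "Before drafting, read voice-guide.pdf and match its tone."

missing = unreferenced_knowledge_files(config, instructions)
print(missing)  # ['proposal-template.docx']
```

A name match is a crude test, since an instruction can mention a file without saying when or how to use it, but it catches the most common case: a file that was attached and then never mentioned again.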
Why the instructions matter most
The instructions are the part of the assistant that controls what it does when someone starts using it. If the configuration layer is the setup around the assistant, the instructions are the engine inside it.
Everything in the configuration layer supports the instructions. Knowledge files give the assistant material to reference. Capabilities give it tools to work with. Conversation starters give the user a way in. But the instructions are what tie all of that into a coherent workflow.
This is why, when you sit down to build an assistant, the instructions are where most of your thinking should go.
What goes inside the instructions
There is a structure you can follow, and it breaks into four parts.
Introduction - define what the assistant is, who it is for, and what it is meant to achieve.
Tasks - the step-by-step actions the assistant takes when someone uses it. What does it ask for first? What does it do with the information it receives? In what order does it work?
Knowledge file references - connect each file to a specific moment in the workflow. The assistant needs to know not just that a file exists, but when it is relevant and what to do with it.
Rules and best practices - the guardrails that keep the assistant consistent and reliable over time. Things like: use UK English, do not fabricate statistics or testimonials, always confirm the brief before starting, keep the output under a certain length, never make claims the business cannot support.
Rules are what prevent drift - the gradual slide in tone, format, scope, or accuracy from one conversation to the next.
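The four parts can be sketched as a small structure that renders into the text you would paste into an instructions field. This is a minimal sketch under stated assumptions: the Instructions class, its field names, the section labels, and the example file names are hypothetical, and no platform requires this exact layout. The point is that each part has its own slot, and that every knowledge file is tied to a specific moment in the workflow.

```python
from dataclasses import dataclass


@dataclass
class Instructions:
    """Hypothetical four-part instruction layer."""
    introduction: str
    tasks: list[str]                           # ordered steps
    knowledge_file_references: dict[str, str]  # file name -> when and how to use it
    rules: list[str]

    def render(self) -> str:
        """Assemble the four parts, in order, into one block of text."""
        lines = ["Introduction", self.introduction, "", "Tasks"]
        lines += [f"{i}. {task}" for i, task in enumerate(self.tasks, start=1)]
        lines += ["", "Knowledge file references"]
        lines += [f"- {name}: {usage}"
                  for name, usage in self.knowledge_file_references.items()]
        lines += ["", "Rules"]
        lines += [f"- {rule}" for rule in self.rules]
        return "\n".join(lines)


# Example content is made up for illustration.
instructions = Instructions(
    introduction="You draft client proposals for a small consultancy.",
    tasks=[
        "Ask for the client brief and confirm it before starting.",
        "Draft the proposal using the template.",
        "Check the draft against the voice guide.",
    ],
    knowledge_file_references={
        "proposal-template.docx": "Use as the structure in step 2.",
        "voice-guide.pdf": "Consult in step 3 before returning the draft.",
    },
    rules=[
        "Use UK English.",
        "Do not fabricate statistics or testimonials.",
    ],
)

print(instructions.render())
```

Numbering the tasks is a deliberate choice: it makes the order of operations explicit and lets the knowledge file references point at a specific step.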
How the layers work together
When someone opens an assistant, the configuration layer comes first. The name tells them what it is for, the description confirms whether it is the right tool, and the conversation starters give them a way to begin.
From there, the instruction layer takes over. The assistant follows the workflow you have defined in the tasks - gathering information, processing it, referencing knowledge files where the instructions say to, and producing output shaped by the standards in the rules.
Each layer has a specific job. The configuration layer handles identity and access. The instruction layer handles behaviour.
Where to start
If you have already built an assistant, a practical next step is to open it up and check it against this structure. Three things are worth looking at:
- Does it have clear instructions with a defined workflow?
- Does it reference its knowledge files explicitly?
- Does it have guardrails that prevent drift on tone, format, scope, or accuracy?
Most assistants, when you look at them through this lens, have gaps in at least one of those areas.
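The three checks above can also be written down as a small audit. The sketch below assumes you have already broken an assistant's setup into simple Python values by hand. The audit_assistant function, its parameters, and the example data are hypothetical, and grouping rules under the four drift areas is one possible convention rather than a requirement.

```python
DRIFT_AREAS = ("tone", "format", "scope", "accuracy")


def audit_assistant(
    tasks: list[str],
    knowledge_files: list[str],
    file_references: dict[str, str],
    rules_by_area: dict[str, list[str]],
) -> list[str]:
    """Return a list of gaps against the three checks; an empty list means none found."""
    gaps = []
    # Check 1: is there a defined workflow at all?
    if not tasks:
        gaps.append("no defined workflow")
    # Check 2: is every attached file explicitly referenced?
    for name in knowledge_files:
        if name not in file_references:
            gaps.append(f"knowledge file never referenced: {name}")
    # Check 3: is there at least one guardrail per drift area?
    for area in DRIFT_AREAS:
        if not rules_by_area.get(area):
            gaps.append(f"no guardrail for {area}")
    return gaps


# Example data is made up for illustration.
gaps = audit_assistant(
    tasks=["Confirm the brief", "Draft the proposal"],
    knowledge_files=["voice-guide.pdf", "proposal-template.docx"],
    file_references={"voice-guide.pdf": "Consult before drafting."},
    rules_by_area={
        "tone": ["Use UK English."],
        "accuracy": ["Do not fabricate statistics or testimonials."],
    },
)

for gap in gaps:
    print(gap)
# knowledge file never referenced: proposal-template.docx
# no guardrail for format
# no guardrail for scope
```

An audit like this only tells you that something is missing, not whether what is there is any good. It is a prompt for a closer read, not a replacement for one.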
If you are building assistants for your business and want to learn how to build and structure them properly - or you have built some already and the outputs are not consistent enough - I am happy to talk it through.

