Prompt Playbook: AI Fundamentals PART 3


Hey Prompt Entrepreneur,

"It was the best of times, it was the worst of times..." It's the tale of two cities - the city of AI creators and the city of AI users.

Increasingly the world is split between the two. And knowing where we are as business owners is very important.

I was reminded of this divide at a tech event last month when a founder approached me after my talk. "I want to build my own language model from scratch. I don't want to rely on ChatGPT. I want complete control."

Admirable sentiment. But... wooo... yikes.

This highlighted a crucial misconception that many entrepreneurs have when entering the AI space. Most of us will be citizens of the user city, not the creator city - and that's not just okay, it's smart.

Let’s get started:

Summary

Training vs. Inference

  • The two lives of prediction engines: training vs. inference

  • Why most entrepreneurs should use, not create, foundation models

  • The middle ground: fine-tuning and retrieval-augmented generation

  • Managing prediction failures (hallucinations) in practical applications

  • Building an AI strategy that leverages the training/inference distinction

A Tale of Two Cities: Creators vs. Users

Every AI model you interact with exists in two distinct phases, created by two very different groups:

Training: The expensive, time-consuming process where the model learns how to play the prediction game by analysing massive datasets. This is the domain of well-funded AI labs with specialised expertise. Think OpenAI, Anthropic, Google, DeepSeek.

Inference: The relatively quick, affordable process where the model uses what it learned to make new predictions in response to your inputs. This is where entrepreneurs and businesses typically enter the picture.

This distinction creates two separate "cities" in the AI ecosystem:

  • The Creator City is populated by organisations like OpenAI, Anthropic, Google, and Meta who have the resources to train foundation models from scratch. BIG boys with deep pockets, often funded by entire governments.

  • The User City is where the vast majority of entrepreneurs and businesses live, building applications and solutions on top of these foundation models.

The good news is that living in the User City doesn't limit your ability to create extraordinary value. We can still make amazing things without having to own the land we build on. We’ll discuss why this is so.

Training: The Unseen Mountain of Work

We touched on training in the previous Part but let's dive deeper. During training, the model is essentially learning how to play the prediction game through repeated exposure to massive amounts of text.

Here's what makes training so demanding:

1. Data at Scale: Training requires hundreds of billions of words. GPT-4 likely trained on a trillion tokens or more - equivalent to millions of books plus a large slice of the public internet. Remember that Meta got caught using Anna’s Archive to scrape basically ALL the world’s books? This is why - they need data.

2. Computational Resources: Training top-tier models requires thousands of specialised GPUs running for months, costing tens or hundreds of millions of dollars. This is why NVIDIA has rocketed over the last few years - they are the primary manufacturer of these GPUs.

3. Specialised Expertise: Creating these models requires teams of ML researchers with advanced degrees and years of experience. These engineers are increasingly following the best salaries (and stock options!) to Silicon Valley and Hangzhou.

4. Infrastructure Complexity: The technical infrastructure to manage training at this scale is a massive engineering challenge in itself. This is why Elon Musk went ahead and built Colossus, and why other companies are constructing massive data centres and looking into building their own nuclear power stations.

BIG players moving BIG money. Nuclear power station and trade deficit sort of money.

Inference: Putting Prediction to Work

Once a model is trained, it enters the inference phase—when it actually generates responses to your prompts by applying what it learned during training.

Inference is dramatically less resource-intensive than training:

1. One-Way Process: The model is no longer adjusting its billions of parameters; it's simply using them to make predictions. Sure, providers may later use your conversations as training data, but that happens separately and at a totally different scale to the original training run.

2. Single Task Focus: Rather than processing massive datasets, the model only needs to handle the specific text you've provided. It's drawing on only a tiny slice of its knowledge at a time.

3. Optimised Delivery: Companies have developed highly efficient systems for serving model responses at scale. Making this process fast and (energy) cheap is what gives them an edge over other companies.

The result? While training might cost hundreds of millions, our API calls to OpenAI cost fractions of a penny and our monthly subscriptions are only around $20.

Why Most Entrepreneurs Shouldn't Build Foundation Models

Given the stark contrast between training and inference, there are compelling reasons why most entrepreneurs should focus on using existing models rather than creating their own:

1. Economic Reality: Training foundation models requires capital investments that only the largest companies can justify. The compute costs alone would bankrupt most startups.

2. Expertise Gap: Building these models requires specialised knowledge in machine learning that takes years to develop. OpenAI et al. have been building these skills for years now and have the jump on most of the market. Most businesses need AI solutions now, not after years of skill-building.

3. Opportunity Cost: The time and resources spent trying to create a foundation model could be better invested in building unique applications that solve specific customer problems.

4. Competitive Disadvantage: By the time you could create a foundation model from scratch, existing providers will have released several new generations of more powerful models. They are already rocking and rolling and in production. This is also why we are seeing fewer competitors entering the market; instead, it is solidifying around a small group of players.

5. Diminishing Returns: For most applications, current foundation models are already good enough that the marginal improvements from creating your own wouldn't justify the cost. Sure, you might “control” the underlying model, but how much does that actually matter when it comes to deploying your product or service?

In the startup world, we often talk about focusing on your unique value proposition and outsourcing everything else. Foundation models are the perfect example of something most businesses should outsource rather than build.

We do this all the time with technology. We don’t (generally!) try to reinvent programs that we use. We don’t make our own programming languages. We use what is available and adjust to our needs.

The Middle Ground: Customisation Options

The good news is that there's a middle ground between training your own models from scratch and using pure off-the-shelf solutions. These approaches give you many of the benefits of custom models without the astronomical costs:

Fine-Tuning: Teaching the Prediction Engine Your Specialty

Fine-tuning is the process of taking an off-the-shelf pre-trained model (say, GPT-4o) and adapting it to specific tasks.

Fine-tuning is like sending an experienced professional back to school for a specialised certificate. I might already have an accountancy degree but maybe I need some special training on certain tax issues. Cool - I can get that additional layer of information to sharpen my abilities.

The model already knows how to make predictions about language in general, but fine-tuning helps it make better predictions for your specific domain.

During fine-tuning, you take a pre-trained model and continue training it on a smaller dataset specific to your needs. This might be:

  • Your company's documentation

  • Examples of your brand's writing style

  • Specialised knowledge in your industry

  • Customer service interactions in your specific domain

Fine-tuning costs a tiny fraction of full training while delivering substantial improvements for specific applications. It's accessible even to small companies and individual developers through services like OpenAI's fine-tuning API or open-source models like Llama.
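As a concrete illustration, OpenAI's fine-tuning API expects training examples as JSON Lines: one chat exchange per line, each with a `messages` array. Here's a minimal sketch of preparing such a file - the file name and example content are invented for illustration, and a real dataset would need many more examples:

```python
import json

# Hypothetical examples pairing customer questions with on-brand answers.
examples = [
    {"messages": [
        {"role": "system", "content": "You are our support assistant. Be brief and friendly."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Head to Settings > Security and click 'Reset password'."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are our support assistant. Be brief and friendly."},
        {"role": "user", "content": "Do you offer refunds?"},
        {"role": "assistant", "content": "Yes - within 30 days of purchase, no questions asked."},
    ]},
]

# Write one JSON object per line (the JSONL format the fine-tuning API expects).
with open("training_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

You'd then upload this file and start a fine-tuning job through the provider's API or dashboard; the heavy lifting happens on their infrastructure, not yours.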

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation is a very fancy way of giving our AI documents. We provide it with a reference library of material connected to the tasks it will be doing. It’s a little different to fine-tuning because we’re not actually running more training - instead, we’re providing the model with supplementary material to refer to during inference.

RAG is sort of like giving an expert a specialised reference book to consult while they work. Instead of expecting the model to have memorised every fact during training, you provide it with relevant information at the time of prediction.

This approach:

  1. Takes your prompt

  2. Retrieves relevant information from your knowledge base

  3. Includes this information as context for the model

  4. Generates a response based on both the prompt and the retrieved context

RAG is particularly powerful because:

  • It keeps responses grounded in your specific data

  • It allows the model to access up-to-date information (you can update your data)

  • It dramatically reduces hallucinations for factual content

  • It lets you leverage company-specific knowledge that wouldn't be in training data

This all sounds very complex but at its base level it’s sort of like giving an AI access to our company’s Google Drive and all its documents. For most businesses, implementing RAG will deliver far more value than trying to train a custom foundation model.

Managing Prediction Failures

Even with these customisation approaches, prediction engines sometimes fail. We call these failures "hallucinations" – cases where the model generates convincing but incorrect information.

Here are practical strategies to manage these failures in business applications:

1. Fact-Checking Layers: Implement secondary systems that verify claims made by the AI against trusted databases or search results. Basically a secondary AI layer that fact-checks the outputs of the first. Or, if you’re feeling feisty, multiple layers of checks.

2. Human-in-the-Loop: For critical applications, keep humans in the review process to catch and correct hallucinations before they reach customers. Got your AI generating emails to customers? Have it place the emails in drafts first then a human checks and sends.

3. Domain Constraints: Limit the AI's responses to areas where accuracy can be more easily verified or where errors have lower stakes. This comes from focused prompting, basically restricting the AI’s outputs to what is relevant.

4. Clear Uncertainty: By default AI systems will try to answer your questions even if they have no idea what the correct answer is. They are agreeable. It’s very annoying! So, train your systems to express uncertainty when they don't have sufficient information, rather than generating speculative answers.
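To make strategy 1 concrete, here's a minimal sketch of a fact-checking layer: claims extracted from an AI draft are compared against a trusted store, and anything unverified gets flagged for human review (strategy 2). The fact store, claim names, and values are all invented for illustration:

```python
# Hypothetical trusted facts - in practice a database or search index.
trusted_facts = {
    "opening_hours": "9am-5pm GMT",
    "refund_window": "30 days",
}

def verify_claims(claims):
    """Split claims into verified and needs-human-review."""
    verified, flagged = [], []
    for key, value in claims:
        if trusted_facts.get(key) == value:
            verified.append((key, value))
        else:
            flagged.append((key, value))  # route to a human before sending
    return verified, flagged

# Claims extracted from a hypothetical AI-drafted customer email.
claims = [("refund_window", "30 days"), ("opening_hours", "24/7")]
verified, flagged = verify_claims(claims)
```

The hard part in practice is extracting claims from free text reliably - which is why many teams use a second model for that step, then only ship outputs whose flagged list is empty.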

These supplementary approaches acknowledge that prediction is inherently probabilistic (as we covered in the last Part!) and build safeguards accordingly. By knowing exactly where AI models are prone to messing up we can better work around these limitations. Which is the whole purpose of this Playbook!

Try This With Your AI Tutor

Want to explore how these concepts apply to your business? Try this prompt with the AI tutor you built last week:

I'm considering implementing AI in [specific business function or product]. Help me understand:
1. Whether fine-tuning or RAG would be more appropriate for my specific use case
2. What kind of data I would need to collect or prepare for this approach
3. What verification mechanisms would be most important given the risks in my specific context

Additional Resources

If you want to dive deeper into these concepts, here are some valuable resources:

How AIs like ChatGPT learn (CGP Grey) - an oldie but a goody! CGP Grey knew that ChatGPT was going to be a big deal before the rest of us and made this very handy video on the basic mechanisms!

How Neural Networks Learn (3Blue1Brown): A more technical but highly valuable series on the actual training mechanisms. Much heavier on the maths but excellently explained.

What's Next?

Next up we'll explore the critical role of data in AI systems. It’s…more exciting than it sounds. Promise!

We'll discuss why data quality matters more than quantity, the different types of data used in AI, and how to develop an effective data strategy for your AI initiatives. This will help you make better decisions about what data to collect, how to use it, and how to maintain competitive advantage in the AI era.

Keep Prompting,

Kyle

When you are ready

AI Entrepreneurship programmes to get you started in AI:

70+ AI Business Courses

  • Instantly unlock 70+ AI Business courses

  • Get FUTURE courses for Free

  • Kyle’s personal Prompt Library

  • AI Business Starter Pack Course

  • AI Niche Navigator Course

Get Premium

AI Workshop Kit
Deliver AI Workshops and Presentations to Businesses with my Field Tested AI Workshop Kit  Learn More

AI Authority Accelerator 
Do you want to become THE trusted AI Voice in your industry in 30-days?  Learn More

AI Automation Accelerator
Do you want to build your first AI Automation product in 30-days?  Enrol Now

Anything else? Hit reply to this email and let’s chat.

If you feel this — learning how to use AI in entrepreneurship and work — is not for you → Unsubscribe here.