Change is everywhere—stock prices fluctuate, weather patterns shift, and even your preferences evolve over time. For AI to make smart decisions and improve its performance, it needs a way to understand and measure change precisely. This is where derivatives come in, providing AI with a mathematical tool to analyze how small adjustments lead to better outcomes.
Think of derivatives as AI's way of asking: "If I change this setting just a little bit, how much will my performance improve or get worse?" This ability to measure change is crucial for AI systems that need to optimize themselves, whether they're learning to recognize images, translate languages, or recommend content.
We'll see just how important the derivative is in later sections!
What is a Derivative?
A derivative measures how much a function's output changes when you make a small change to its input. It's like having a sensitivity meter that tells you how responsive something is to adjustments.
🚗 Car Speed Example: When you're driving and press the gas pedal slightly harder, how much does your speed increase? The derivative tells you this rate of change. If pressing the pedal a little gives you a big speed boost, the derivative is large. If it barely affects your speed, the derivative is small.
Mathematically, if we have a function $f(x)$, its derivative is denoted as $f'(x)$ or $\frac{df}{dx}$.
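To make the "sensitivity meter" idea concrete, here is a minimal Python sketch that estimates a derivative by nudging the input slightly in both directions and comparing the outputs. The helper name `numerical_derivative` and the example function `square` are assumptions chosen for illustration, not part of the lesson itself.

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate f'(x) with a central difference: (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    # Hypothetical example function f(x) = x^2, whose exact derivative is 2x.
    return x ** 2


# How sensitive is the output to a small change in the input at x = 3?
slope_at_3 = numerical_derivative(square, 3.0)
print(round(slope_at_3, 4))  # close to 6.0, matching 2 * 3
```

A large result means the output reacts strongly to small input changes; a result near zero means it barely reacts.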
🍋 Lemonade Pricing Example: Let's revisit our lemonade stand, but this time consider a more realistic scenario. Simply selling more cups isn't the only factor—price matters too. If you charge too little, you don't earn much per cup. If you charge too much, fewer people buy.
Let’s define a new function that includes price instead of cups:
Remember we define a function as $f(x)$, so we can substitute other letters too. In this function, $f(p)$ is your earnings and $p$ is the price per cup in euros.
This function captures a realistic trade-off: at very low prices, you sell many cups but earn little per cup. At very high prices, you earn more per cup but sell very few. Somewhere in between is the sweet spot that maximizes your total earnings.
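The trade-off can be sketched in Python. The demand rule below (100 cups at a price of zero, 20 fewer cups for every 1.00 added to the price) is a hypothetical assumption made only for illustration; it is not necessarily the formula used in this lesson.

```python
def cups_sold(price):
    # Assumed demand: 100 cups at price 0, dropping by 20 cups per 1.00 of price.
    return max(0.0, 100 - 20 * price)


def earnings(price):
    # Earnings = price per cup * number of cups sold at that price.
    return price * cups_sold(price)


# Low prices sell many cups for little money each; high prices sell few cups.
for price in [0.50, 1.50, 2.50, 3.50, 4.50]:
    print(f"price {price:.2f} -> cups {cups_sold(price):.0f}, earnings {earnings(price):.2f}")
```

Under this assumed demand rule, earnings rise as the price climbs from very low values, then fall again once the price scares off too many customers.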
Derivatives and Graphs: The Slope of a Function
A derivative represents the slope of a function at any given point—how steep or flat the curve is at that location.
⛰ Mountain Hiking Example: Imagine you're hiking up a mountain with a trail that gets steeper and flatter in different sections:
- Steep uphill sections: Each step forward significantly increases your elevation (large positive derivative).
- Gentle slopes: Each step makes little difference in height (small positive derivative).
- Flat sections: Steps don't change your elevation much (derivative near zero).
- Downhill sections: Each step decreases your elevation (negative derivative).
At the very peak of the mountain, the ground is perfectly flat: the derivative is exactly zero. This is a crucial concept: where the derivative equals zero, the function may have reached a highest point (maximum) or a lowest point (minimum), so these flat spots are where we look for optimal values.
AI systems use this principle constantly. When training a model to recognize cats in photos, the AI adjusts its settings to minimize errors. The derivative tells the AI which direction to adjust, and when the derivative gets close to zero, the settings have reached a point where further small adjustments no longer reduce the error.
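The hiking picture can be sketched numerically. The trail profile `elevation` below is a hypothetical single hill with its peak at x = 2, and the helper names `slope` and `describe` are assumptions for this example only.

```python
def elevation(x):
    # Hypothetical trail profile: one hill whose peak sits at x = 2.
    return -(x - 2) ** 2 + 4


def slope(f, x, h=1e-5):
    # Central-difference approximation of the derivative at x.
    return (f(x + h) - f(x - h)) / (2 * h)


def describe(s, tol=1e-6):
    # Translate the sign of the slope into hiking terms.
    if abs(s) < tol:
        return "flat"
    return "uphill" if s > 0 else "downhill"


for x in [0.0, 1.5, 2.0, 3.0]:
    s = slope(elevation, x)
    print(f"x = {x}: slope {s:.2f} ({describe(s)})")
```

The slope is large and positive on the steep approach, smaller near the top, roughly zero at the peak, and negative once the trail heads down.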
Finding the Optimal Solution
The real power of derivatives comes from their ability to help find the best possible outcome. This is exactly what AI systems do when they "learn"—they use derivatives to find optimal solutions.
Let's find the perfect price for our lemonade using derivatives. Starting with our earnings function:
The derivative tells us how earnings change with price.
The mathematical specifics of how we get this derivative are beyond the scope of this explanation, but it’s helpful to know that this new function is just a tool to understand how earnings behave at different price points.
To find the optimal price, we set the derivative equal to zero, which tells us where the peak of the earnings curve is: $f'(p) = 0$
Solving this gives the price where earnings are highest. This point is called the optimum because it’s where the function reaches its maximum value. If we now solve for the variable $p$, we get: $p = 2.50$
This means the optimal price is €2.50 per cup. At this price:
- Charging less (€2.49) means we're leaving money on the table.
- Charging more (€2.51) means we'll lose customers and earn less overall.
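As a worked illustration of these steps, assume a hypothetical earnings function $f(p) = 100p - 20p^2$. Its coefficients are an assumption, chosen only because the resulting peak sits at 2.50; the lesson's own formula may differ.

$$
\begin{aligned}
f(p) &= 100p - 20p^2 \\
f'(p) &= 100 - 40p \\
f'(p) = 0 \;&\Rightarrow\; 40p = 100 \;\Rightarrow\; p = 2.50 \\
f(2.49) &= 124.998, \quad f(2.50) = 125, \quad f(2.51) = 124.998
\end{aligned}
$$

Under this assumed function, moving one cent in either direction lowers earnings slightly, which is what a flat slope at the peak looks like in numbers.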
Again, we can plot this function in a graph to visualise what happens:

The graph clearly shows the peak in earnings at €2.50—exactly where the derivative equals zero and the slope is flat.
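The same peak can be located in code by searching for the point where the slope changes from positive to negative. This is a minimal sketch: the `earnings` function is a hypothetical curve assumed for illustration (its peak is placed at 2.50), and `find_flat_point` is an illustrative helper using simple bisection on the derivative.

```python
def earnings(price):
    # Hypothetical earnings curve, assumed for illustration; its peak is at 2.50.
    return 100 * price - 20 * price ** 2


def derivative(f, x, h=1e-5):
    # Central-difference approximation of the slope at x.
    return (f(x + h) - f(x - h)) / (2 * h)


def find_flat_point(f, low, high, tol=1e-6):
    # Bisection on the derivative. Assumes the slope is positive at `low`
    # and negative at `high`, so a single peak lies somewhere in between.
    while high - low > tol:
        mid = (low + high) / 2
        if derivative(f, mid) > 0:
            low = mid   # still climbing: the peak is to the right
        else:
            high = mid  # flat or descending: the peak is to the left
    return (low + high) / 2


best_price = find_flat_point(earnings, 0.0, 5.0)
print(round(best_price, 2))  # 2.5
```

Bisection is used here only because it is easy to read; it works for this curve because the slope crosses zero exactly once between the two starting prices.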
How AI Uses Derivatives to Learn
This optimization principle is fundamental to how AI systems improve their performance. Here's how it works in practice:
🌃 Image Recognition Example: When AI learns to identify dogs in photos, it starts with random guesses. The system calculates how wrong its guesses are (this is called the "error" or "loss"). Then it uses derivatives to determine:
- Which settings should be increased to reduce errors.
- Which settings should be decreased.
- How much to adjust each setting.
The AI keeps making these derivative-guided adjustments until it finds the optimal settings that minimize errors—just like finding the optimal lemonade price.
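The adjustment loop described above can be sketched for a single setting. Everything here is a simplified assumption: the `loss` curve is a hypothetical stand-in for a model's error, `w` stands for one adjustable setting, and `learning_rate` is an assumed step size. Real systems adjust many settings at once.

```python
def loss(w):
    # Hypothetical error curve with its minimum at w = 3.
    return (w - 3) ** 2


def loss_derivative(w):
    # Exact derivative of the loss above: d/dw (w - 3)^2 = 2 * (w - 3).
    return 2 * (w - 3)


w = 0.0              # starting guess for the setting
learning_rate = 0.1  # assumed step size controlling how big each adjustment is

for step in range(100):
    grad = loss_derivative(w)
    # Positive derivative: decrease w. Negative derivative: increase w.
    # A larger derivative produces a larger adjustment.
    w -= learning_rate * grad

print(round(w, 4))  # 3.0
```

The sign of the derivative answers "increase or decrease?", and its size, scaled by the step size, answers "by how much?". As `w` approaches 3 the derivative shrinks toward zero and the adjustments become tiny.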
💬 Language Translation Example: Translation AI works similarly. It starts with poor translations and uses derivatives to gradually adjust its understanding of word relationships, grammar patterns, and context clues until it produces high-quality translations.
This process of using derivatives to find optimal solutions is so important in AI that it has a special name: gradient descent. We'll explore this powerful technique in detail in later chapters, but understanding derivatives is the essential foundation.
Final Takeaways
Derivatives provide AI with a mathematical compass for navigating toward optimal solutions. By measuring how functions change in response to small adjustments, derivatives enable AI systems to systematically improve their performance. Whether finding the best price for lemonade or optimizing complex neural networks, the principle remains the same: use derivatives to identify the direction of improvement and find the point where performance is maximized. This mathematical foundation makes it possible for AI systems to learn, adapt, and solve increasingly complex problems through continuous optimization.