From numbers to vectors to matrices—the foundation of neural networks
Let's build up from the simplest to more complex structures. Every machine learning model works with these three building blocks:
Scalar: just one value. Example: a customer's tenure in months.
Vector: multiple values in a row. Example: one customer with 2 features (months, usage).
Shape: 1 row, 2 numbers → this is actually a 1×2 matrix!
Matrix: a table of numbers. Example: 3 customers, each with 2 features.
Shape: 3 rows, 2 columns → a 3×2 matrix.
A vector is just a special case of a matrix. The vector [10, 35] is actually
a 1×2 matrix (1 row, 2 columns). When we learned about vectors in Chapter 4,
we were already learning about matrices!
When we describe a matrix, we specify its shape as: rows × columns
4×2 matrix (4 rows, 2 columns)
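To see shapes concretely, here is a minimal sketch in NumPy (the feature values below are placeholders, not from the chapter): a 4×2 matrix holding 4 customers with 2 features each.

```python
import numpy as np

# Placeholder data: 4 customers (rows), 2 features each (columns)
A = np.array([[12,  5],
              [ 3, 40],
              [24, 18],
              [ 8, 30]])

print(A.shape)  # (4, 2) -> 4 rows, 2 columns
```

NumPy reports shape in the same rows × columns order we use when describing a matrix.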
Now that we understand what matrices are and how to organize data in them, let's learn how to multiply two matrices together. This operation is fundamental to neural networks—it's how data flows through each layer and gets transformed. We'll start with a simple example and build our understanding step by step.
The dot product takes two vectors and gives you one number. Here's the recipe:
Multiply corresponding elements (first with first, second with second, etc.), then add them all up.
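The recipe above can be written directly in Python (the vectors here are just illustrative numbers):

```python
def dot(u, v):
    """Multiply corresponding elements, then add them all up."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

print(dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

Two vectors in, one number out.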
Pattern Matching with Neural Networks
Imagine your SaaS company trained a neural network on thousands of past customers. The network discovered a pattern of what successful customers look like. Now you want to check: "Does this new customer match the successful pattern?"
Here's your customer: 10 months subscribed, 35 hours of usage per month
The neural network learned that successful customers follow a pattern represented by weights [0.5, 0.2] — the pattern signature discovered from analyzing training data.
Think of it like this: The weights [0.5, 0.2] describe what a "healthy customer profile" looks like. When we compute the dot product, we're asking: "How well does this customer align with the healthy profile?"
Where do weights come from? The neural network learns these weights by analyzing thousands of examples. It finds patterns like: "Customers with longer tenure AND higher usage tend to renew." The weights capture this learned pattern mathematically.
Let's compute how well this customer matches the successful pattern:
score = 10 × 0.5 + 35 × 0.2 = 5 + 7 = 12
The score of 12 tells us how strongly this customer matches the successful pattern. Higher score = better match = more likely to renew!
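Here is the same calculation as a few lines of Python, using the customer and weights from the text:

```python
customer = [10, 35]    # months subscribed, hours of usage per month
weights = [0.5, 0.2]   # the learned pattern signature

score = customer[0] * weights[0] + customer[1] * weights[1]
print(score)  # 10*0.5 + 35*0.2 = 5.0 + 7.0 = 12.0
```

This is exactly the dot product recipe: multiply corresponding elements, then add.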
Breaking down the match:
10 months is solid tenure, contributes positively to health score
35 hours of monthly usage shows strong engagement
By analyzing thousands of historical customers, the neural network learned which score ranges correlate with success. This customer's score of 12 indicates they strongly align with the patterns of successful customers!
The key insight: The dot product measures how well vectors align. When one vector is your customer and the other is a learned pattern, the dot product tells you how well they match!
This dot product gives us ONE prediction for ONE customer. Nothing new here—just what you learned in Chapter 4!
Now let's say we have 3 customers, and we want predictions for all of them. Let's use simple data:
Our weights: b = [1, 2] → b₁ = 1, b₂ = 2
Instead of treating each customer separately, let's stack their data into one matrix:
This is a 3×2 matrix (3 rows, 2 columns). Each row is one customer.
Now here's the powerful part. Let me show you exactly how we calculate ALL 3 predictions at once. We'll go row by row, and I'll show you both the formula and the actual numbers side by side.
This is just the dot product formula from Chapter 4!
Same formula, but now we use Customer 2's data
Same formula again, now with Customer 3's data
This is matrix multiplication! We did 3 dot products (one for each row) and got 3 predictions all at once. Every row of the first matrix did a dot product with the column of the second matrix.
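The whole three-customers-at-once calculation is one line in NumPy. The weights [1, 2] are from the text; the customer values below are placeholders, since the chapter's exact numbers aren't reproduced here. Each row of X does a dot product with b, producing one prediction per customer:

```python
import numpy as np

# Placeholder data: each row is one customer [months, usage]
X = np.array([[3, 4],
              [1, 2],
              [5, 0]])

b = np.array([1, 2])  # weights from the text: b1 = 1, b2 = 2

predictions = X @ b   # one dot product per row, all at once
print(predictions)    # row by row: 3*1+4*2=11, 1*1+2*2=5, 5*1+0*2=5
```

The `@` operator is NumPy's matrix multiplication: a 3×2 matrix times a 2-element column gives 3 results.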
Now that you've seen the complete example, here's the general pattern:
The Rule: Each row does a dot product with the column. That's all matrix multiplication is!
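The rule translates almost word for word into code. This sketch implements matrix-times-column by hand, with no libraries, so you can see that it really is just one dot product per row (the example numbers are illustrative):

```python
def matvec(M, v):
    """Each row of M does a dot product with the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# A 3x2 matrix times a 2-element column gives 3 results
print(matvec([[1, 0], [0, 1], [2, 3]], [10, 20]))  # [10, 20, 80]
```

Libraries like NumPy do the same thing, just much faster.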
Let's make sure you've got it. Can you predict what the first element of the result will be?
First element: take row 1 [2, 3] and do a dot product with the column [4, 2]: 2×4 + 3×2 = 8 + 6 = 14
Second element: take row 2 [5, 1] and do a dot product with the column [4, 2]: 5×4 + 1×2 = 20 + 2 = 22
Final answer: [14, 22]
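You can check this answer with NumPy, using the exact matrix and column from the exercise:

```python
import numpy as np

M = np.array([[2, 3],
              [5, 1]])
col = np.array([4, 2])

print(M @ col)  # [14 22]
```

If your hand calculation matches, you've got it.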
This simple operation—doing multiple dot products at once—is THE fundamental operation in neural networks. Every prediction, every layer, every training step uses matrix multiplication billions of times.