Welcome to another statistics article, probably the most frustrating category for me to prepare. Why? Because it's usually light on code and practice problems… mostly just plain old text.
However, today we have a special guest, and it happens to be the favorite of all machine learning models: Correlation.
Machine learning models are essentially algorithms designed to predict future outcomes based on patterns found in past data. That much, we already know. But here’s the interesting part: they don’t “think” like we do (okay, maybe you know that too, but act like you’re surprised anyway).
Our human brains are wired to seek causation: we want to know why something happens. In contrast, machine learning models care mostly about correlation, the statistical relationships between variables. If you're new to those terms, don't worry. We've got plenty of time to break it all down.
So, let’s start by defining both correlation and causation to make everything clear and accessible for everyone.
Definition of Correlation and Causation
Correlation is basically the statistical pattern or relationship between input variables (features) and the output variable (target). Sounds too technical? Simply put, it’s the common pattern in your data. It could be linear, circular, or even some weird, non-geometric shape. But in the end, it’s still a pattern.
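If you want to see a pattern measured in code, the Pearson correlation coefficient is the classic yardstick for linear relationships. Here's a minimal sketch (the numbers are made up just for illustration, like the rest of the data in this article):

```python
import numpy as np

# Made-up data: advertising spend vs. sales, deliberately close to linear
spend = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
sales = np.array([12.0, 24.0, 33.0, 39.0, 52.0])

# np.corrcoef returns the 2x2 correlation matrix;
# the off-diagonal entry is Pearson's r between the two variables
r = np.corrcoef(spend, sales)[0, 1]
print(round(r, 3))  # ≈ 0.994: a strong positive linear pattern
```

A value near +1 or -1 means a strong linear pattern; near 0 means no linear pattern (though a circular or "weird non-geometric" pattern could still be hiding there, since Pearson's r only sees linear relationships).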
Your machine learning models are just seeking patterns to guess future outcomes. Feed them the wrong information, and they'll make wrong predictions because, duh, they can't tell it's wrong. They're just memorizing patterns, not understanding them.
Actually, that might be true for us too: if we learn wrong, we act wrong. But this isn't a philosophical article, so let's move on.
But we're human, and we seek causation: the idea that one event causes another. Kind of like when you watch a movie: you don't just notice what happens, you want to understand why it happens.
Another example: we might want to know if smoking causes lung cancer, not just whether smoking is correlated with lung cancer.
So, machine learning models are designed to find patterns, while humans have evolved to find reasons.
Simply put, correlation is about finding patterns in data, while causation is about finding the reasons behind those patterns.
Why Not Causation?
Okay, maybe you’re thinking, “Why don’t ML models give a damn about causation and just focus on patterns?” Good question.
Here’s the answer: Causal inference requires controlled experiments or deep domain knowledge, which is usually unavailable, or just totally impractical, in most machine learning scenarios.
Still sounds too nerdy? Basically, ML models are built to maximize predictive accuracy using whatever data they’re given. They’re not designed to reason about why something happens or to analyze cause-and-effect (well… there are some exceptions, like causal inference models or structural causal models, but let’s pretend they don’t exist for now).
So, it’s more efficient for these models to just find patterns and make predictions, rather than spend extra time trying to reason about causality, which they’re not even good at.
That’s your job. You find the reasons behind the data. The model’s job is to notice what tends to happen.
How Models Learn Correlation
Now that we know what correlation and causation are, and why ML models love correlation more, it’s time to understand how the heck they learn correlation.
No matter the type, from linear regression to deep neural networks, ML models optimize an objective function that measures how well their predictions match the actual outcomes.
To do that, they first identify input features statistically linked to the outputs, then assign weights or parameters to those features to best fit the observed data, and finally adjust the parameters iteratively to reduce prediction error.
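Those three steps can be sketched as a tiny gradient-descent loop. This is a minimal illustration, assuming mean squared error as the objective and a made-up one-feature dataset (roughly y = 2x + 1 with some noise):

```python
import numpy as np

# Toy data: one feature x, target y, generated to be roughly y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])

w, b = 0.0, 0.0  # step 1: parameters start with no pattern learned
lr = 0.01        # learning rate: how big each adjustment is

for _ in range(5000):      # step 3: adjust iteratively
    pred = w * x + b       # step 2: predictions from current parameters
    error = pred - y
    # Gradients of mean squared error with respect to w and b
    w -= lr * 2 * np.mean(error * x)
    b -= lr * 2 * np.mean(error)

print(round(w, 2), round(b, 2))  # ≈ 1.94 1.15, near the underlying 2 and 1
```

Notice the model never asks *why* y grows with x; it just nudges w and b until the predictions stop disagreeing with the data.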
Demonstration
Let’s say we have a dataset (I’ll create my own for now just to demonstrate) with two columns: Price and Quality.
| Price | Quality |
|---|---|
| 100 | 5 |
| 20 | 3 |
| 70 | 4 |
Yeah, of course, this data is fake, please don’t take it seriously, it’s just an example. Let’s say we want to predict quality, so we use a simple linear model:
Quality = w × Price + b,
where w is the weight (how much Price influences Quality) and b is the bias (a base value for Quality).
Initialize Weights
We start by initializing the weights to zero because, well, that's what I learned in math class: you usually start with zeros. So, w = 0 and b = 0.
Predict using initial weights
Now, let’s predict the Quality for each Price using these initial weights:
Price = 100 → Prediction = 0×100+0 = 0
Price = 20 → Prediction = 0×20+0 = 0
Price = 70 → Prediction = 0×70+0 = 0
Calculate the error
| Price | Actual Quality | Predicted Quality | Error |
|---|---|---|---|
| 100 | 5 | 0 | 5 |
| 20 | 3 | 0 | 3 |
| 70 | 4 | 0 | 4 |
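To put a single number on how bad those all-zero predictions are, we can compute the objective function. The article doesn't pin down a specific one, so this sketch assumes mean squared error, the usual choice for a linear model like ours:

```python
import numpy as np

price = np.array([100.0, 20.0, 70.0])
quality = np.array([5.0, 3.0, 4.0])

w, b = 0.0, 0.0
pred = w * price + b       # all zeros, as in the table above
error = quality - pred     # 5, 3, 4 — the Error column

mse = np.mean(error ** 2)  # (25 + 9 + 16) / 3
print(round(mse, 2))       # 16.67
```

Training is just the process of pushing this number down.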
Adjust weights to reduce error
I won't try all the numbers here, that's the computer's job! After training, the computer lands on values like w = 0.04 and b = 1.5. (These aren't the exact least-squares optimum, but they fit the three points well enough to show the idea.)
Test again with updated weights
For Price = 100: 0.04×100+1.5 = 5.5 (close to 5)
For Price = 20: 0.04×20+1.5 = 2.3 (close to 3)
For Price = 70: 0.04×70+1.5 = 4.3 (close to 4)
Now our model has learned the pattern! That’s basically how it works.
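For reference, you can also ask numpy for the exact least-squares fit on this tiny dataset. This is just a sketch, and the exact optimum comes out a bit different from the rounded values above, but both fit the three points reasonably well:

```python
import numpy as np

price = np.array([100.0, 20.0, 70.0])
quality = np.array([5.0, 3.0, 4.0])

# A degree-1 polynomial fit is ordinary least squares for w and b
w, b = np.polyfit(price, quality, 1)
print(round(w, 4), round(b, 4))  # 0.0245 2.449

for p in (100, 20, 70):
    print(p, round(w * p + b, 2))  # predictions ≈ 4.9, 2.94, 4.16
```

Either way, the model has simply found the line that best matches the pattern in the data; it has no opinion on whether price *causes* quality.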
When Does Causation Matter in ML?
Okay, we said machine learning models really love correlation (it's their wife or husband, whatever), but causation isn't trash for ML models, at least not completely.
Certain applications demand causal reasoning, like medicine, economics, or policy-making, where understanding why something happens is essential for making safe and effective decisions.
In these cases, causal inference techniques are combined with machine learning to go beyond correlation.
Honestly, that’s really the only case I can find where causation truly matters. Sorry!
Conclusion
So yeah, machine learning models are all about patterns and correlations, they don’t really care about why things happen.
But sometimes, like in medicine or economics, we need to dig deeper and understand causation to make smarter choices. That’s when we mix causal reasoning with ML.
Other than that, correlation is their ride or die. And honestly, that’s just how machine learning models predict future outcomes.
No magic, just math.
I call it "Sugar."