At first, I prepared the whole article using pure Rust with Plotters and Polars, but the result wasn’t the best, and the workflow was difficult. I don’t think anyone should spend so much time on data visualization. So, I created a new version where I use D3.js for visualization but still use Polars for data […]
Numbers can be boring, but they don’t have to be. They’re beautiful when they start to look like something. Details create the big picture, and to build that picture, we need plots. Plots are still essential tools in data science and anything involving data. In this article, I’ve prepared implementations of the three most commonly […]
Welcome to another statistics article, probably the most frustrating category for me to prepare. Why? Because there’s no code, no practice problems… just plain old text. However, today we have a special guest, and it happens to be the favorite of all machine learning models: Correlation. Machine learning models are essentially algorithms designed to predict […]
The Curse of Dimensionality is something that catches a lot of people off guard. More data sounds like a dream: better models, more accurate results, and the potential for big wins. But the reality is that piling on more features and data can backfire. Instead of making things better, it can actually make everything worse, […]
Confidence intervals are everywhere in statistics. They are meant to show how sure we are about a number, like an average or a proportion. But here is the catch: they do not actually tell you how confident you should be about the specific interval you have right now. That misunderstanding creates what I call The […]
“Numbers don’t lie, but they sure can mislead.” You’ve probably heard this before, and in the world of statistics, it couldn’t be more accurate. People often hail averages as the go-to statistic for summarizing data, but here’s the catch: if you rely on averages without digging deeper, you might miss the true story or, worse, […]
Hello you! Okay, today I decided to drop the formal language because we have delicious content. Have you ever heard of hexbins? If not, it’s fine; if yes, it’s also fine. Today, I’ll try to show the advantages of hexbins over scatter plots (which you are familiar with, I suppose) for large datasets. Why Use Hexbins […]
In the world of statistics and data science, the term “p-value” often comes up, but many people find it confusing. The definition is simple; grasping what it actually means is the tricky part. This article aims to make it easy to understand. You don’t need to be a data scientist or a genius university student. […]