Categories
Data analysis & Visualization

Keep It Simple: Exploring a CSV Entirely with Bash

I was scrolling through my inbox the other day when a random email caught my eye. Some small online store I vaguely remember talking to months ago had attached their full sales CSV: 50k rows, no warning, no context. Just “here’s the data, thought you might find it interesting.” (This is all made up, obviously. […]

Categories
Machine Learning & AI

In the Long Run, Is AI Really a Friend?

Long time no see. I am writing another article after a long break, and honestly, I have no idea how many people will read this. Do you know why? Because no one searches on Google anymore. People just ask ChatGPT questions like “Is AI really a friend in the long run?” and they get a […]

Categories
Data analysis & Visualization

Ruby on Steroids for Data Science

So, I’m back with another blog post and guess what, this one’s about Ruby again. Yes, I know this blog is supposed to be about Rust. But let’s be honest: bending rules is fun, and exploring different ecosystems is how we stay flexible, creative, and happy. I’ve been exploring the Red Data Tools project recently, and I have to […]

Categories
Data analysis & Visualization

Visualizing Global CO₂ Emissions with Rust and D3.js

At first, I prepared the whole article using pure Rust with Plotters and Polars, but the result wasn’t the best, and the workflow was difficult. I don’t think anyone should spend so much time on data visualization. So, I created a new version where I use D3.js for visualization but still use Polars for data […]

Categories
Data Engineering

Polars: A High-Performance DataFrame Library for Rust

I know I’m late, but let me talk about Polars! I want to discuss Polars in Rust because it’s really important: it greatly enhances Rust’s capabilities for data science, which is our main focus here. This blog is still pretty new, so I’m writing this a bit late. Nothing else stopped me! Polars is a […]

Categories
Data analysis & Visualization

Speeding Up Data Processing with Parallelism in Rust

Data can be called whatever you want (hype, cash, or trash), but it doesn’t change the fact that, in today’s world, data is precious. It might change in the future, but for now, you’ve got to adapt to it. I’m not saying you should add machine learning models to everything or process large datasets for no reason; […]

Categories
Data Engineering

Building Your First ETL Pipeline in Rust

Okay, we’re on a streak with Rust articles. This is my third Rust article, and now I’ll be giving a practical guide to complement my previous theoretical ETL article. I assume you already know what an ETL pipeline is, or at least have read my previous article on the topic, so I won’t go into […]

Categories
Machine Learning & AI

Loading, Shuffling, and Splitting Datasets in Rust

In the previous article, I created a simple linear regression model, but I skipped the dataset splitting part. One of the readers pointed out that it’s crucial, and I shouldn’t have skipped it, even for that example. So now, I’m preparing this support article to address that request. Later, I’ll connect this article to the […]

Categories
Machine Learning & AI

Rapid Machine Learning Prototyping in Rust

Rust? Yes, Rust. Not even I, the one writing this article, am familiar with the language, if we don’t count this past week. So why am I writing about it with such limited knowledge? Well, first of all, it’s my blog, and I can write whatever I want. Secondly, the Linfa library looks pretty sick! […]

Categories
Statistics and Math

Understanding Correlation: The Beloved One of ML Models

Welcome to another statistics article, probably the most frustrating category for me to prepare. Why? Because there’s no code, no practice problems… just plain old text. However, today we have a special guest, and it happens to be the favorite of all machine learning models: Correlation. Machine learning models are essentially algorithms designed to predict […]