Ruby on Steroids for Data Science

So, I’m back with another blog post and guess what, this one’s about Ruby again. Yes, I know this blog is supposed to be about Rust. But let’s be honest: bending rules is fun, and exploring different ecosystems is how we stay flexible, creative, and happy.

I’ve been exploring the Red Data Tools project recently, and I have to say, it’s one of those “hidden gem” communities doing solid work to bring data processing to Ruby in a serious way.

So today? I’ll be talking about some of the GitHub repos under the Red Data Tools umbrella. Buckle up. Okay, actually, I just picked a few gems I like, not all of them. Sorry!

Charty

Charty is a lightweight Ruby charting library. It’s kind of like Ruby’s version of Matplotlib or even a tiny D3.js (just not interactive). Think of it as Charts-rs’s chill cousin who just wants to get the job done.

You can spin up visualizations like these in seconds:

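require 'charty'

# `penguins` is assumed to be loaded already, e.g. the Penguins dataset from Red Datasets
Charty.bar_plot(data: penguins, x: :species, y: :body_mass_g)
Charty.box_plot(data: penguins, x: :species, y: :body_mass_g)
Charty.scatter_plot(data: penguins, x: :body_mass_g, y: :flipper_length_mm)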

If you’re not working with some NASA-scale data and don’t need drag-and-zoom interactivity, Charty is more than enough.

GR.rb

Okay, this one’s different: GR.rb provides Ruby bindings for the ultra-fast, C-based GR Framework. It’s not “pure” Ruby, but who cares? It’s fast.

require 'gr/plot'

x = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
y = [0.3, 0.5, 0.4, 0.2, 0.6, 0.7]

GR.plot(x, y)
GR.savefig("figure.png")

Yes, it needs native dependencies. Yes, it’s worth it.

RedAmber

RedAmber is the DataFrame library Ruby deserves: built on Apache Arrow, really fast, and surprisingly ergonomic.

require 'red_amber'
include RedAmber

require 'datasets-arrow' # for sample data
dataset = Datasets::Diamonds.new
diamonds = DataFrame.new(dataset)

Arrow inside, Ruby outside. You can filter, group, slice, dice, whatever you want. And you get performance that can go toe-to-toe with Pandas for many tasks.
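
To make “filter, group” concrete without installing anything, here is a plain-Ruby sketch of that kind of pipeline over an array of hashes. To be clear, this is not RedAmber’s API: the `rows` data, the `mean_price_by_cut` helper, and the 1.0-carat threshold are all made up for illustration, and RedAmber would run the equivalent steps over Arrow columns instead of Ruby objects.

```ruby
# Plain-Ruby stand-in for a filter -> group -> aggregate pipeline.
# NOTE: hypothetical data and helper; this is not RedAmber's API.
rows = [
  { cut: "Ideal",   carat: 1.2, price: 6000 },
  { cut: "Ideal",   carat: 0.5, price: 1500 },
  { cut: "Premium", carat: 1.5, price: 8000 },
  { cut: "Premium", carat: 1.1, price: 5000 },
  { cut: "Good",    carat: 0.9, price: 3000 },
]

def mean_price_by_cut(rows, min_carat:)
  rows
    .select { |row| row[:carat] > min_carat }   # filter: keep the larger stones
    .group_by { |row| row[:cut] }               # group: one bucket per cut
    .transform_values do |group|                # aggregate: mean price per bucket
      group.sum { |row| row[:price] } / group.size.to_f
    end
end

result = mean_price_by_cut(rows, min_carat: 1.0)
p result
```

The shape of the computation is the point here: a DataFrame library gives you the same three steps, just expressed on columns and executed much closer to the metal.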

Red Chainer

Deep learning in Ruby? Sounds cursed, but it’s real.

Red Chainer is a Ruby port of Chainer (from Python), and yes, it supports CUDA. You can build complex models, train them on GPUs, and do it all in Ruby.

require 'chainer'
# Define models, train, evaluate – all in Ruby

Honestly, this one isn’t production-level ML yet, but it’s fun and educational. It proves that Ruby can go deeper than we think.

Do you want production-level ML? Hey, Rust is your bu… uhm, okay, no Rust advertisement this time.

Red Datasets

This is the gem I wish every language had by default. Yeah, when I think about it, why don’t they?

I mean, at the very least, Python should have it, right? It’s the so-called main language for data science, so just provide us with ready-to-use datasets, dude!

require 'datasets'
iris = Datasets::Iris.new
iris.each do |row|
  puts row
end

Quick access to common ML datasets like Iris, MNIST, Titanic, etc., with a consistent API. For testing, benchmarking, or just playing around, this is a gem.
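
Because a dataset is just something you can call `each` on, small summaries take only a few lines. Here is a minimal sketch, assuming each record exposes a `label` method the way the Iris records do in the Red Datasets README. The `IrisRow` struct, its sample values, and the `label_counts` helper are stand-ins I made up so the snippet runs without downloading anything.

```ruby
# Stand-in for Red Datasets' Iris records (hypothetical rows, trimmed to two fields).
IrisRow = Struct.new(:sepal_length, :label)

sample = [
  IrisRow.new(5.1, "Iris-setosa"),
  IrisRow.new(4.9, "Iris-setosa"),
  IrisRow.new(7.0, "Iris-versicolor"),
  IrisRow.new(6.3, "Iris-virginica"),
]

# Counts records per label. Works on anything that responds to #each
# and yields objects with a #label method.
def label_counts(records)
  counts = Hash.new(0)
  records.each { |record| counts[record.label] += 1 }
  counts
end

counts = label_counts(sample)
counts.each { |label, n| puts "#{label}: #{n}" }
```

With the gem installed, passing `Datasets::Iris.new` instead of `sample` should behave the same way, as long as the record fields match what the README shows.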

Conclusion

Okay guys, let’s wrap this up.
Real talk happens here, just pure truth. No tricks (okay, maybe a few), no fancy words (well, there will be some), no corporate lies (tricks, yes; lies, never).

So here it is, honest review time:

  • Want performance? Ruby’s not quite there yet.
  • Want DataFrames? If you’re not working with huge datasets, RedAmber is actually pretty useful.
  • Want neural nets? Ruby’s really not great for that (I won’t lie).
  • Want to have fun? Hell yeah! Ruby is your best choice. Yes, yes, yes! (Writing that many “yes”es reminds me of the JoJo meme.)

If you love Ruby, you’ll definitely find something here that sparks joy. And if you’re a Rustacean reading this and thinking, “But… performance…”, well, maybe it’s time to chill a bit and let Ruby handle the early sketching before you port things to Rust.
