Ian's Digital Garden

Const Assertions

I'm currently working on a project which involves a lot of lower-level data structures. By lower-level I mean that things like layout, bit positions, and exact sizes matter. As such, I have a number of pedantic lints enabled. One of the lints ... read more →

Optimizing Rust Builds with Target Flags

Recently I've been working with Apache DataFusion on some high-throughput data pipelines. One of the interesting things I noticed in the user guide was the suggestion to set RUSTFLAGS='-C target-cpu=native'. This is actually a pretty common ... read more →

Ownership Benefits Beyond Memory Safety

Rust's ownership system is well known for the ways it enforces memory safety guarantees. For example, you can't use a value after it's been freed. Further, it also ensures that mutability is explicit, and it enforces some extra rules that make mo ... read more →

Unicode Normalization

Today I ran into an amusingly named place, thanks to some sharp eyes on the OpenStreetMap US Slack. The name of this restaurant is listed as "𝐊𝐄𝐁𝐀𝐁 𝐊𝐈𝐍𝐆 𝐘𝐀𝐍𝐆𝐎𝐍". That isn't some font trickery; it's a bunch of Unicode math symbols cleverly u ... read more →

Databases as an Alternative to Application Logging

In my work, I've been doing a lot of ETL pipeline design recently for our geocoding system. The system processes on the order of a billion records per job, and failures are part of the process. We want to log these. Most applications start by dumping ... read more →

The rust-toolchain.toml file

This isn't so much a TIL as a quick PSA. If you're a Rust developer and need to ensure specific things about your toolchain, the rust-toolchain.toml file is a real gem! I don't quite remember how, but I accidentally discovered this file a year or two ... read more →

Conserving Memory while Streaming from DuckDB

In the weeks since my previous post on Working with Arrow and DuckDB in Rust, I've found a few gripes that I'd like to address. Memory usage of query_arrow and stream_arrow: In the previous post, I used the query_arrow API. It's pretty straightforward ... read more →

How (and why) to work with Arrow and DuckDB in Rust

My day job involves wrangling a lot of data very fast. I've heard a lot of people raving about several technologies like DuckDB, (Geo)Parquet, and Apache Arrow recently. But despite being an "early adopter," it took me quite a while to figu ... read more →

Quadrupling the Performance of a Data Pipeline

Over the past two weeks, I've been focused on optimizing some data pipelines. I inherited some old ones which seemed especially slow, and I finally hit a limit where an overhaul made sense. The pipelines process and generate data on the order of hund ... read more →

Posts tagged with 'rust'