W.D. – r y x, r

Why does getting a job in tech suck right now? (Is it AI?!?)

August 17, 2024 (updated July 18, 2025) by W.D.

A lot of new CS grads have been noting that it is really hard to get a job. I’ve personally been contacted by a couple of people, including outside of Twitter, about the difficulty of finding a job. I’m sure if you’re reading this that you’ve heard some stories, too. Here I will attempt to provide some …

How to cut your Python Docker build times in half with uv

February 15, 2024 (updated March 4, 2024) by W.D.

March 4, 2024 update: This post has been updated to reflect additions to uv as of 0.1.12, specifically the --system flag. See the Addendum section for more information. When I heard that Charlie Marsh, the creator of Ruff, created a fast replacement for pip called uv, I dropped everything I was doing and added it to …

Should you ask data science job candidates this tricky math question?

June 26, 2023 (updated June 7, 2025) by W.D.

The question: Quantian1 on Twitter poses the following “junior data scientist” interview question: (You can check the replies for my answer!) This is a fun regression math / intuition question. In the spirit of appreciation for these types of interview questions, here are a few more. I leave solving these problems as an exercise to …

ChatGPT as a query engine on a giant corpus of text

March 28, 2023 (updated March 29, 2023) by W.D.

In the popular imagination, ChatGPT is an intelligent robot that you can talk to. However, it is a better first-order approximation to think of ChatGPT as a query engine on a giant corpus of text scraped off the internet. It is important to state explicitly that ChatGPT is like a query engine on a …

Intuitive Explanation of Arithmetic, Geometric, & Harmonic Mean

January 13, 2023 (updated June 22, 2023) by W.D.

If you Google for an explainer on the differences and use cases for the arithmetic mean vs geometric mean vs harmonic mean, I feel like everything you’ll find is pretty bad and won’t properly explain the intuition of what’s going on and why you’d ever do one or the other. In fact, you sometimes will …

Goodbye, Data Science

November 27, 2022 (updated November 28, 2022) by W.D.

This is more of a personal post than something intended to be profound. If you are looking for a point, you will not find one here. Frankly, I am not even sure who the target audience is for this (probably “data scientists who hate themselves”?). I had been a data scientist for the past few …

Caveats and Limitations of A/B Testing at Growth Tech Companies

November 7, 2022 (updated June 22, 2023) by W.D.

For non-tech industry folks, an “A/B test” is just a randomized controlled trial where you split users or other things into treatment and control groups, and then later compare key metrics across those groups and decide which one performed better, so you can learn whether the treatment or control group is preferable. For the context …

Multiple Linear Regression in SQL with Only SUM() and AVG()

September 15, 2022 (updated April 3, 2023) by W.D.

This post is inspired by someone dropping this in my mentions today: The technique the authors use is cute, but it’s not a true arbitrary multivariate regression. They cheat a little bit using dummy variables for the majority of their coefficients. I respect it, but it’s not an arbitrary regression. Fortunately, it is possible to …

Advent of Code 2021 in Google Sheets: First 4 Days

December 4, 2021 by W.D.

You don’t need to be a “coder” to solve coding problems. If you know Microsoft Excel or Google Sheets, then you can solve these problems, too. There’s tons of overlap between coding and working in spreadsheets. Sign up for the 2021 Advent of Code here. Take a stab at these problems with Microsoft Excel or …

Zillow, Prophet, Time Series, & Prices

November 6, 2021 by W.D.

So I made a mildly controversial tweet. Lots of people enjoyed it, but the LinkedIn-adjacent section of data science Twitter is not happy about it. I want to provide as much context for it as I can here, clarify a few things, and correct myself on a few things, including on some negative stuff I …