This post is inspired by someone dropping this in my mentions today: The technique the authors use is cute, but it’s not a true arbitrary multivariate regression. They cheat a little bit using dummy variables for the majority of their coefficients. I respect it, but it’s not an arbitrary regression. Fortunately, it is possible to … Continue reading Multiple Linear Regression in SQL with Only SUM() and AVG()
You don’t need to be a “coder” to solve coding problems. If you know Microsoft Excel or Google Sheets, then you can solve these problems, too. There’s tons of overlap between coding and working in spreadsheets. Sign up for the 2021 Advent of Code here. Take a stab at these problems with Microsoft Excel or … Continue reading Advent of Code 2021 in Google Sheets: First 4 Days
So I made a mildly controversial tweet. Lots of people enjoyed it, but the LinkedIn-adjacent section of data science Twitter is not happy about it. I want to provide as much context for it as I can here, clarify a few things, and correct myself on a few things, including on some negative stuff I … Continue reading Zillow, Prophet, Time Series, & Prices
I have deliberately avoided using this blog to engage in overt political discourse, but I’ve never barred myself from political metadiscourse. Political discourse is about our beliefs on governance; political metadiscourse is more about understanding how people arrive at those beliefs and how those beliefs are expressed. I’ve been posting on the internet for over … Continue reading Proximate Cause & Theories of Agency
Update: The code for these animations is available here. Another Update: I think some of the explanations on this page may be helped with more colors. I have some updated visuals here that include colors. The Frisch-Waugh-Lovell theorem states that within a multivariate regression on $x_1$ and $x_2$, the coefficient for $x_2$, which is $\beta_2$, will … Continue reading Frisch-Waugh-Lovell Theorem: Animated
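For anyone who wants to poke at the theorem numerically before clicking through, here is a minimal sketch in plain Python. It is not the animation code from the post. It uses the standard statement of Frisch-Waugh-Lovell: in a regression of y on two regressors plus an intercept, the coefficient on the second regressor equals the slope you get from regressing the residuals of y on the residuals of that regressor, after both have had the first regressor (and the intercept) partialled out. The simulated data, the true coefficients, and the helper names (`solve`, `ols`, `residuals`, `fwl_demo`) are all made up for illustration.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(columns, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(columns)
    XtX = [[sum(a * b for a, b in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    Xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    return solve(XtX, Xty)


def residuals(columns, y):
    """Residuals of y after regressing it on the given columns."""
    beta = ols(columns, y)
    return [yi - sum(b * col[i] for b, col in zip(beta, columns))
            for i, yi in enumerate(y)]


def fwl_demo(n=200, seed=0):
    """Return (coefficient from full regression, coefficient from FWL route)."""
    rng = random.Random(seed)
    ones = [1.0] * n
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]  # correlated with x1
    y = [1.0 + 2.0 * a - 3.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

    # Route 1: regress y on [1, x1, x2] and read off the x2 coefficient.
    beta_full = ols([ones, x1, x2], y)[2]

    # Route 2: partial [1, x1] out of both y and x2, then regress
    # residual on residual (no intercept needed; both have mean zero).
    resid_y = residuals([ones, x1], y)
    resid_x2 = residuals([ones, x1], x2)
    beta_fwl = ols([resid_x2], resid_y)[0]

    return beta_full, beta_fwl


if __name__ == "__main__":
    full, partial = fwl_demo()
    print(full, partial)
```

The two numbers should agree up to floating-point error, and both should land near the simulated true coefficient of -3.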
Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin Machine Learning. Bitcoin … Continue reading Bitcoin Machine Learning.
Oh how the tables have turned. I now interview candidates for data science jobs. I have a sense of humor and can appreciate the irony of going from complaining about job interviews to now being one of those interviewers. I recently deleted a Twitter thread discussing my interview strategy, partly because I agreed with the … Continue reading On Being An Interviewer
My old piece is getting traction thanks to a share on Hacker News, where some of the most insufferable tech guys in California try to dissect in the comments whether I have deep-seated psychological issues. Also, I was mentioned in this blog post at Win Vector LLC, which offers a fair and very good critique … Continue reading Retrospective on “On Moving From Statistics to Machine Learning”
“The U.S. Weather Service has always phrased rain forecasts as probabilities. I do not want a classification of ‘it will rain today.’ There is a slight loss/disutility of carrying an umbrella, and I want to be the one to make the tradeoff.” (Dr. Frank Harrell, https://www.fharrell.com/post/classification/) This is coming from personal experience and from multiple … Continue reading Why Do So Many Practicing Data Scientists Not Understand Logistic Regression?
Coding is computer science in the same way that buying something at the store is economics, or talking to your neighbor is sociology. Buying a widget at the store is governed by dynamics described by economics. We can use economics to answer questions like “why was the widget priced the way it is?” or “why … Continue reading Coding is Not Computer Science