There are endless blog posts out there describing the basics of linear regression and penalized regressions such as ridge and lasso. These are useful resources, and I’m happy they exist to level the playing field both for people not in college and for people who don’t have the time or fortitude to trudge through mountains … Continue reading Some Things You (Maybe) Didn’t Know About Linear Regression
This recent Tweet sparked a discussion about how logistic regression in Scikit-learn uses L2 penalization with a lambda of 1 by default. If you don’t care about data science, this sounds like the most incredibly banal thing ever. If you do care about data science, especially from the statistics side of things, well, have … Continue reading Scikit-learn’s Defaults are Wrong
The last time I wrote about a specific interview problem, I admired the problem for being really cool and challenging, but complained that the salary was not proportional to the difficulty of the problem. I have no such complaints this time around. This latest problem was for a job with a solid base salary … Continue reading A Cool SQL Problem: Avoiding For-Loops
I’ve been applying for a lot of jobs lately, so naturally I have thoughts about interview questions. I’ve already written about one of those questions, and I eventually want to write about what makes an interview question good or bad. That said, it’s impossible to do that without talking about FizzBuzz, the greatest and most … Continue reading FizzBuzz, Redux
I’ve spent the last few months preparing for and applying for data science jobs. It’s possible the data science world may reject me for my lack of both experience and a credential above a bachelor’s degree, in which case I’ll do something else. Regardless of what lies in store for my future, I think I’ve … Continue reading On Moving from Statistics to Machine Learning, the Final Stage of Grief
Stata code: Output: In related news, Andrew Gelman is right.
The Problem: Your office needs to figure out the minimum number of rooms required to organize meetings for any particular day. To accomplish this task, you have a table with the following information: a meeting ID, a start time, and an end time. For any table of that format, figure out the number of rooms … Continue reading A Cool SQL Problem (And Why It Is Also a Bullshit SQL Problem)
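For readers who want a feel for the room-counting problem before clicking through, here is a minimal Python sketch. It is not the SQL solution from the post. It assumes each row is a `(meeting_id, start, end)` tuple with comparable times (zero-padded `"HH:MM"` strings here), and that a meeting ending at a given time frees its room for one starting at that same time. The function name `min_rooms` and the sample rows are hypothetical.

```python
def min_rooms(meetings):
    """Return the peak number of simultaneous meetings (= rooms needed)."""
    events = []
    for _meeting_id, start, end in meetings:
        events.append((start, 1))   # a room is taken at the start time
        events.append((end, -1))    # a room is freed at the end time

    # Sorting tuples puts -1 before +1 at equal times, so a meeting that
    # ends at 10:00 releases its room before one that starts at 10:00.
    events.sort()

    in_use = peak = 0
    for _time, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return peak


# Hypothetical sample data, not taken from the post.
sample = [
    (1, "09:00", "10:00"),
    (2, "09:30", "11:00"),
    (3, "10:00", "10:30"),
    (4, "11:00", "12:00"),
]
print(min_rooms(sample))
```

The idea is a sweep over start and end events: the answer is the largest running count of meetings in progress. If back-to-back meetings should not share a room, the tie-breaking rule in the sort would need to be reversed.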