How I Learned Python

Every once in a while, people ask me for my recommendations on how to learn Python. I don’t think I’m a good source for this because I learned computer science in high school, dabbled in statistical coding (Stata and Matlab), and got back into “real” programming through Python. This is a much different path than someone … Continue reading How I Learned Python

Some Things You (Maybe) Didn’t Know About Linear Regression

There are endless blog posts out there describing the basics of linear regression and penalized regressions such as ridge and lasso. These are useful resources, and I’m happy they exist to level the playing field both for people not in college and for people who don’t have the time or fortitude to trudge through mountains … Continue reading Some Things You (Maybe) Didn’t Know About Linear Regression

Scikit-learn’s Defaults are Wrong

This recent Tweet sparked a discussion about how logistic regression in Scikit-learn uses L2 penalization with a lambda of 1 by default. If you don’t care about data science, this sounds like the most incredibly banal thing ever. If you do care about data science, especially from the statistics side of things, well, have … Continue reading Scikit-learn’s Defaults are Wrong

A Cool SQL Problem: Avoiding For-Loops

The last time I wrote about a specific interview problem, I admired the problem for being really cool and challenging, but complained about how the salary was not proportional to the difficulty of the problem. I have no such complaints this time around. This latest problem was for a job with a solid base salary … Continue reading A Cool SQL Problem: Avoiding For-Loops

FizzBuzz, Redux

I’ve been applying for a lot of jobs lately, so naturally I have thoughts about interview questions. I’ve already written about one of those questions, and I eventually want to write about what makes an interview question good or bad. That said, it’s impossible to do that without talking about FizzBuzz, the greatest and most … Continue reading FizzBuzz, Redux

On Moving from Statistics to Machine Learning, the Final Stage of Grief

I’ve spent the last few months preparing for and applying for data science jobs. It’s possible the data science world may reject me and my lack of both experience and a credential above a bachelor’s degree, in which case I’ll do something else. Regardless of what lies in store for my future, I think I’ve … Continue reading On Moving from Statistics to Machine Learning, the Final Stage of Grief

Chow tests with quadratic terms on random noise

Stata code: Output: In related news, Andrew Gelman is right.

A Cool SQL Problem (And Why It Is Also a Bullshit SQL Problem)

The Problem Your office needs to figure out the minimum number of rooms required to organize meetings for any particular day. To accomplish this task, you have a table with the following information: a meeting ID, a start time, and an end time: For any table of that format, figure out the number of rooms … Continue reading A Cool SQL Problem (And Why It Is Also a Bullshit SQL Problem)
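For a rough feel of the room-counting problem described above, here is a minimal Python sketch. It is not the SQL solution from the post, just one common way to think about it: turn every meeting into a "start" and an "end" event, sort the events, and track how many meetings are running at once. The function name `min_rooms` and the sample rows are made up for illustration, and the sketch assumes a meeting that ends at a given time frees its room for one starting at that same time.

```python
def min_rooms(meetings):
    """Return the peak number of simultaneous meetings.

    `meetings` is an iterable of (meeting_id, start, end) rows.
    Times only need to be sortable (zero-padded "HH:MM" strings work).
    """
    events = []
    for _meeting_id, start, end in meetings:
        events.append((start, 1))   # a room becomes occupied
        events.append((end, -1))    # a room is freed
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a room is freed before the next meeting claims it (assumption).
    events.sort()

    in_use = 0
    peak = 0
    for _time, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return peak


# Hypothetical sample table: (meeting ID, start time, end time)
sample = [
    (1, "09:00", "10:00"),
    (2, "09:30", "11:00"),
    (3, "10:00", "10:30"),
    (4, "11:00", "12:00"),
]
print(min_rooms(sample))
```

If back-to-back meetings should not share a room, the tie-breaking would have to flip so that starts are processed before ends.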

Stream Python into PowerPoint with pp_stream

I like Python, don’t get me wrong, but it’s not Microsoft PowerPoint™, which is clearly the superior software in every way. Unfortunately, my managers don’t agree with me. When I do things in PowerPoint, they ask things like, “why aren’t you working?” and “why did we ever hire you?” Alas, I’m stuck working in Python. … Continue reading Stream Python into PowerPoint with pp_stream

Execute Python from Excel

Python is great, Excel is great. How cool would it be to run Python from an Excel workbook? “But that’s stupid,” you protest. … Yes, and? Copy and paste this into a VBA module in your Excel workbook, modify the parameters, create a helloworld.py in the same directory you’ve saved the Excel file, and enjoy. … Continue reading Execute Python from Excel