This is the 1st article in a 2-part series on the use of AI in hiring. The 2nd part will be available on Wednesday, January 8th. Arvind Narayanan somewhat recently put out a presentation called “How to recognize AI snake oil.” It’s incredible, and I highly recommend reading it in full. He also has a … Continue reading AI Will Not Reduce Discrimination in Hiring Practices. Does the Public Agree?
Let’s say you wrote a really basic data import function that finds the latest .csv file in a directory (sorted alphanumerically, and the name equals the date), then imports it into a Pandas dataframe, and then does some light processing of the data. The processing is done in two steps: First, it formats the dates. … Continue reading Making Good Code Great
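The file-selection step described in this excerpt can be sketched with the standard library alone. This is a minimal sketch, not the post's own code: the function name `latest_csv` is hypothetical, and it assumes file names are ISO-style dates (such as `2020-01-02.csv`) so that alphanumeric order matches chronological order. Loading the result into a Pandas dataframe and the processing steps are left out.

```python
from pathlib import Path


def latest_csv(directory):
    """Return the path of the .csv file whose name sorts last.

    Assumption: file names are dates in a format where alphanumeric
    order equals chronological order (e.g. YYYY-MM-DD.csv).
    """
    # Sort by file name only, so the parent directory never affects the order.
    csv_files = sorted(Path(directory).glob("*.csv"), key=lambda p: p.name)
    if not csv_files:
        raise FileNotFoundError(f"no .csv files found in {directory}")
    return csv_files[-1]
```

The returned path could then be handed to whatever reader you use for the import step.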
Every once in a while, people ask me for my recommendations on how to learn Python. I don’t think I’m a good source for this because I learned computer science in high school, dabbled in statistical coding (Stata and Matlab), and got back into “real” programming through Python. This is a much different path than someone … Continue reading How I Learned Python
There are endless blog posts out there describing the basics of linear regression and penalized regressions such as ridge and lasso. These are useful resources, and I’m happy they exist to level the playing field both for people not in college and for people who don’t have the time or fortitude to trudge through mountains … Continue reading Some Things You (Maybe) Didn’t Know About Linear Regression
This recent Tweet sparked a discussion about how logistic regression in Scikit-learn uses L2 penalization with a lambda of 1 by default. If you don’t care about data science, this sounds like the most incredibly banal thing ever. If you do care about data science, especially from the statistics side of things, well, have … Continue reading Scikit-learn’s Defaults are Wrong
The last time I wrote about a specific interview problem, I admired the problem for being really cool and challenging, but complained about how the salary was not proportional to the difficulty of the problem. I have no such complaints this time around. This latest problem was for a job with a solid base salary … Continue reading A Cool SQL Problem: Avoiding For-Loops
I’ve been applying for a lot of jobs lately, so naturally I have thoughts about interview questions. I’ve already written about one of those questions, and I eventually want to write about what makes an interview question good or bad. That said, it’s impossible to do that without talking about FizzBuzz, the greatest and most … Continue reading FizzBuzz, Redux
I’ve spent the last few months preparing for and applying for data science jobs. It’s possible the data science world will reject me and my lack of both experience and a credential above a bachelor’s degree, in which case I’ll do something else. Regardless of what lies in store for my future, I think I’ve … Continue reading On Moving from Statistics to Machine Learning, the Final Stage of Grief
Stata code: Output: In related news, Andrew Gelman is right.
The Problem: Your office needs to figure out the minimum number of rooms required to organize meetings for any particular day. To accomplish this task, you have a table with the following information: a meeting ID, a start time, and an end time. For any table of that format, figure out the number of rooms … Continue reading A Cool SQL Problem (And Why It Is Also a Bullshit SQL Problem)
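One way to reason about the room-counting problem is a sweep over start and end events. This is a minimal Python sketch rather than the SQL solution the post discusses: the function name `min_rooms` and the `(meeting_id, start, end)` tuple layout are hypothetical, and it assumes a meeting that ends at time t frees its room for a meeting that starts at t.

```python
def min_rooms(meetings):
    """Return the peak number of simultaneous meetings.

    `meetings` is an iterable of (meeting_id, start, end) tuples, where
    start and end are comparable values such as minutes since midnight.
    """
    events = []
    for _meeting_id, start, end in meetings:
        events.append((start, 1))   # a room is taken
        events.append((end, -1))    # a room is freed

    # Sort by time; at equal times the -1 (end) sorts before the +1 (start),
    # which encodes the assumption that back-to-back meetings share a room.
    events.sort()

    in_use = 0
    peak = 0
    for _time, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return peak
```

For example, meetings running 9:00 to 10:00, 9:30 to 11:00, and 10:00 to 12:00 would need two rooms under that assumption, since the third meeting can reuse the room the first one vacates.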