How to Land a Job in Data Science
I was recently invited to give a talk at the PyData Amsterdam meetup[1]. There I talked about how you can land a job in data science.
The topic is not very surprising: data science jobs are "hot" on the market, and many people were attracted by the prospect of learning how to get one.
For them, the talk might have been a disappointment (see the slides), as the take-home message was that you need strong software development skills to be a good data scientist. And if you already have strong software development skills, you were probably not interested in the talk to begin with.
We can endlessly discuss what a good data scientist means, but in my opinion it means someone who is also able to think in terms of productionizing code, and the trade-offs that productionizing implies. This is more nuanced in real life (I mention that in the presentation) but it's a good starting point.
The audience had a lot of questions at the end, but one category of questions stood out:
If I have a background in software engineering or computer science, how do I get started with machine learning?
My answer to that was: don't try to become an expert in machine learning. Familiarize yourself with it, and then try to be a Machine Learning Engineer. This profile helps the data scientists put models into production, chooses the right data storage for the input and output of the model, advises on the APIs, etc.
These are all areas where software engineers are much stronger than the typical data scientist, and I cannot overemphasize their added value.
There is an extreme shortage of this profile: GoDataDriven, and many other companies, are continuously trying to hire them, more urgently than data scientists! Somehow, though, they're rarer than unicorns!
In case you're one of them, or aspire to be one, don't forget to stop by our career page.
Let me know what you think, especially if you disagree (I'm @gglanzani on Twitter if you want to reach out!).
[1] Ok, as the co-organizer, "invited" is a curious choice of words.