Go Data Driven BLOG!

Moving From Excel to R

13 Feb

At Data Driven Commerce, Vincent Warmerdam talked about the value of open source data tools in a modern-day workplace.



GoDataDriven Open Source Contribution: February 2017 Edition

03 Feb

I find that, for a service company, we are quite active in the open source world. However, this remains pretty hidden in practice. So I thought I could start to change that by publishing, every once in a while, the various contributions we make to open source projects, both old and new.



Monitoring HBase With Prometheus

29 Jan

This article shows how to monitor HBase with Prometheus by exposing an HTTP server that serves JMX beans in the Prometheus metric format, and how to visualize the metrics in Grafana.



How to Land a Job in Data Science

27 Jan

I was recently invited to give a talk at the PyData Amsterdam meetup. There I talked about how you can land a job in data science.



How to Write Code Using The Spark Dataframe API: A Focus on Composability And Testing

27 Jan

I was recently thinking about how we should write Spark code using the Dataframe API. In this post I'll guide you through the different options.



Join Us on February 23rd for Google Hashcode

23 Jan

A fun algorithms and optimization competition where you can solve real Google problems.



Use an SSH key to access your cloud resources with a SOCKS proxy

08 Jan

Securely access your cloud resources with a SOCKS proxy. An example of how to create an SSH key, use the public key to create a new Linux machine, and configure a SOCKS proxy to access remote websites.



Solving hard data problems with causal data science

29 Dec

It is tempting for organizations to find biased answers in their data and draw faulty conclusions, such as confusing correlation with causation. Adam Kelleher, lead data scientist at BuzzFeed, emphasizes that this is not without risk.



Bringing models into production

03 Dec

In data science, software quality is often an issue that prevents models from reaching production. How can you successfully bring data science models into production?



Devoxx 2016

26 Nov

Devoxx goes beyond Java with machine learning, streaming apps and data, and cloud.
