GoDataDriven Blog

Real time analytics: Divolte + Kafka + Druid + Superset

18 Aug

Divolte is our open source clickstream collector, which enables companies to gain insight into how their products are used. These insights are easily visualized with Superset, an interactive slice-and-dice tool, using Druid as a scalable backend.



Airflow Tutorial for Data Pipelines

11 Aug

This tutorial walks you through the basics of setting up and using Airflow and offers some practical tips.



GoDataDriven open source contribution: July 2017 edition

31 Jul

Welcome to the Open Source at GoDataDriven, July 2017 Edition.



Continuous Deployment of Python eggs with VSTS on Azure

28 Jul

Create a basic continuous deployment pipeline for Python code with Visual Studio Team Services.



Hadoop and LDAP, as seen through Venetian blinds

01 Jul

My wife recently asked me to mount new Venetian blinds in the kids' bathroom. I thought I'd be done in five minutes, but two hours later I had yet to drill a single hole.



GoDataDriven open source contribution: June 2017 edition

30 Jun

Welcome to the Open Source at GoDataDriven, June 2017 Edition.



Vendor Free Data Science

19 Jun

Clients often ask us which vendor solution(s) we propose for their data science needs. In this blog post, I try to summarize why the answer is (almost always) some open source tool.



ReveRse engineering BoardGameGeek

16 Jun

Reverse engineering the secret rating algorithm of BoardGameGeek.com



Don't be a lonely document

09 Jun

Emil Eifrem's famous quote "Don't be a lonely document" inspired me to find out who the most influential Twitter users at GraphConnect were.



GoDataDriven open source contribution: May 2017 edition

30 May

Welcome to the Open Source at GoDataDriven, May 2017 Edition. This month we begin by standing on the shoulders of giants!
