GoDataDriven open source contribution: April 2018 edition
Welcome to the Open Source at GoDataDriven, April 2018 edition.
Kris worked on his docker-kafka project, contributing PRs 6 and 7 to upgrade the Kafka version and use OpenJDK instead of Oracle Java. His docker-kafka project, self-contained with ZooKeeper, is invaluable if you want to get started with Kafka. The fact that you can find the instance by name instead of by IP also makes it a great candidate for a training environment (where not all students will be familiar with the intricacies of Docker and IPs).
Fokko, instead (he doesn't like to sit idle, does he?), contributed to Homebrew with PR 26793 to upgrade Scala to 2.11.12 (I mean, can you imagine Fokko running an outdated version of Scala? 😏). He went on to work on Airflow with PRs 3252 and 3201. To wrap up, he improved Divolte with PRs 203, 215, and 216.
To conclude the show, I also tried to make Airflow better. First, I open sourced hmsclient, a Python package to interact with the Hive metastore. Airflow was, in fact, using a deprecated client for all its interactions with the metastore. As a result, the part of Airflow interacting with the metastore was not Python 3 compatible. With PR 3239 (by me), that is now fixed.
That's it for this edition! Don't forget we're hiring! Especially if you are a software engineer who would like to move into the data space, get in touch!
And if you want more rambling throughout the month, follow me on Twitter: I'm gglanzani there!