GoDataDriven Open Source Contribution: February 2017 Edition
At GoDataDriven we have an Open Source First approach. Unless a shockingly good reason exists, we always advise using (and implementing) open source solutions.
It is therefore only natural that we tend to give back to the open source community. Some of these efforts are highly visible (such as Divolte), while others remain in the shadows.
I therefore thought I could start to change that by publishing, every once in a while, our various contributions to open source projects, both old and new.
- In Druid PR 3481 he fixed the INFO message logs;
- In Docker-Druid PR 30 he expanded on the logging section in the README;
- In Airflow PR 2038 he resolved session leakage;
- In Airflow PR 2042 (still open) he added a spark-submit operator/hook;
- In Flink PR 3077 he implemented Stochastic Outlier Selection (!);
- In Flink PR 3081 he cleaned up the Flink Machine Learning library (!!).
Next up is Vincent, who created a whole new project: Kadro, a friendly pandas wrapper with a more composable grammar. The goal of the library is to provide a minimal wrapper that makes most dataframe operations more expressive by making them chainable.
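To give a feel for what "chainable" means here, below is a minimal sketch in plain pandas. It does not use Kadro's API, which is not shown in this post; the sample data, the `add_bmi` helper, and the column names are made up for illustration. It only shows the style of composing dataframe operations as one readable pipeline, using pandas methods that each return a new dataframe.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "height": [1.60, 1.75, 1.82, 1.68],   # metres
    "weight": [55.0, 80.0, 95.0, 60.0],   # kilograms
})


def add_bmi(frame):
    """Return a copy of `frame` with a derived `bmi` column (hypothetical helper)."""
    return frame.assign(bmi=frame["weight"] / frame["height"] ** 2)


# Each step returns a new dataframe, so the steps read top to bottom
# as a single pipeline and the original `df` is left untouched.
result = (
    df
    .pipe(add_bmi)                          # derive a column
    .query("bmi > 22")                      # keep matching rows
    .sort_values("bmi", ascending=False)    # order the result
    .reset_index(drop=True)
)

print(result)
```

A wrapper with a composable grammar aims to make this style the default way of working with a dataframe, instead of a sequence of in-place mutations and intermediate variables.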
That's all for the first edition. As always, we're hiring Data Scientists and Data Engineers. Head over to our career page if you're interested. You'll get plenty of opportunities to give back to the community.