Interview with Cloudera CEO: Four developments that impact digital transformation
At the end of March, Strata + Hadoop World took place in San Jose. Forbes had an interesting interview with Tom Reilly, chief executive of Cloudera. In this post we highlight four developments that have a direct impact on digital transformation.
These developments, in no particular order, are:
- The hyper-connected world
- Shifting from technology to use cases
- Data science as competitive advantage
- Cloud storage
The hyper-connected world
In the past five to seven years the world has become hyper-connected. The opportunities this creates are massive. At the same time, the shift poses a threat to traditional organizations, which need to re-engineer their business for this new interconnected world. The first step is to embrace new sets of ideas and make a paradigm shift.
Tom Reilly gives the example of an auto manufacturer. “If you’re not building a connected car, you are in trouble. So auto manufacturers are quickly becoming data management companies.” The bottom line is that every enterprise becomes a technology company, no matter what business it is in.
In this new era of computing, both large enterprises and startups are adopting Hadoop. By implementing and supporting Hadoop, large organizations validate this technology, leading to an expansion of the market.
Shifting from technology to use cases
In 2014 everyone talked about technology, and the conversation then shifted to combining technologies to create a data hub. Some call this central data platform a data lake, to show CIOs where the data platform fits in the data landscape.
Now in 2016, Cloudera, much like us, sees boardroom use cases emerging. These use cases often come down to customer insight, Internet of Things data-driven products and services, and lowering business risk.
According to the Cloudera chief executive, the worst thing an enterprise could do is to turn to legacy solutions to solve these problems, because these use cases involve new data sets. Technology like Hadoop is designed for those data sets.
Data science as competitive advantage
The typical evolution of software products involves the development of features that replace custom services, such as data science. Tom Reilly, however, does not think that packaged applications will be developed on top of Cloudera.
The following is an example of usage-based insurance that Tom Reilly gave during the interview: “Algorithms that determine what your product will be when you say ‘pay when you drive’ will be the competitive advantage. Buying packaged algorithms doesn’t make sense, you want to have your own data scientists to create algorithms and use that as a competitive advantage.”
This is in line with our experience that enterprises can exchange technology but should protect their algorithms, as it is the latter that provides the real competitive advantage, just like in this use case of a Dutch e-retailer.
Cloudera aims to package as many of the building blocks as it can and then provide data scientists with the tools to determine their specific offerings. Partners like GoDataDriven are essential to provide the specific expertise needed to develop successful use cases for each vertical industry, such as financial services, retail, and healthcare.
Cloudera sees that projects involving ecosystem partners tend to be larger and to become successful faster.
Cloud storage
For Cloudera, public cloud is the fastest-growing environment. With the overall business roughly doubling, the cloud business is growing at twice that rate and accounts for 15 to 20 percent of the customer workload.
Cloudera sees that most of their customers are operating in a hybrid environment. For example, new data sets from mobile and social data, which are being generated outside the data center in the cloud, are integrated with data from the internal data center, like customer contracts, historical records, and customer support cases.
It’s hard for banks or healthcare companies to move their data to the cloud. Data that was created in the data center is especially difficult to move from local storage to the cloud. Mass is one of the reasons for this. But it’s not only locally created data that has mass; data that originated in the cloud also has mass. All new data is in the cloud. As more and more data is created in the cloud, it is only logical that cloud storage will become increasingly important.