GDPR: "May 25th is when it really begins"
The monitoring of compliance with the GDPR will start in just over a month. What does that mean for your work as a developer or data scientist? "You have to continuously ask yourself what data you really need, and how you can minimize the impact of its use."
The GDPR provides frameworks rather than clearly formulated rules. And that's precisely what makes the regulation complicated for organizations, according to Juliette van Balen, lawyer and privacy specialist with IP Advocaten: the GDPR, or AVG in Dutch, demands an organizational change. "The law therefore asks for a vision on how organizations want to handle their data and customers. Next, leadership is needed to make that vision concrete and implement it in the processes."
"The law isn't about privacy, as many people think. The law is about protecting personal information, and therefore about how we handle data. The law doesn't say what we can and can't do. In fact, a lot is allowed, as long as you can explain it, make the right privacy assessments and have taken mitigating measures." And this is precisely what developers and data scientists have to keep in mind when doing their work.
Clients will demand it, if only because the GDPR requires them to outsource their data only to service providers who can prove that they comply with the GDPR. It is therefore smart to take GDPR training as a developer or data scientist, so you can show your clients a certificate and advise them.
For developers, the GDPR isn't really a technological challenge, says Dave van Stein, self-proclaimed security & privacy geek with IT company Xebia. "More will be done with encryption and hashes, but we already have a lot of experience with that. The what is therefore much less interesting than the how. The mandatory privacy assessment has an impact on your software development; how do you deal with that as a professional, but also as an organization?"
When you work in an agile way and run short sprints all the time, the assessment can limit you, Van Stein warns. "If you always have to send a form to your privacy officer, that will be a huge delay."
A developer therefore needs a mandate and the trust of the privacy officer to be allowed to make their own assessments. They also need to be given the right tools to do such a self-assessment. How the privacy assessments are then arranged depends entirely on the type of organization.
There aren't many standard solutions, but according to Van Stein you can integrate the assessment fairly easily into the existing processes of developers. "Weave it into the Jira Flow user stories and appoint an owner. Then, you won't have to develop a new process."
Aside from the privacy assessment, another thing needs to be added to the existing processes: a processing register. Van Stein: "You have to be able to show, for each data element that you use, how it was collected and what its lifespan will be within your organization. Only then can you judge, when the data is processed further, whether that still fits within the context in which you collected it."
Not much tooling is available for such a register yet either. "You can build it in your own infrastructure, or have it built, and work with, for example, automatic contracts. Application A then has to get explicit consent from application B in order to gain access to the data within. A few progressive companies are already experimenting with a metadata register."
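To make the idea of a processing register with "automatic contracts" a bit more concrete, here is a minimal Python sketch. It is not taken from the interview or from any existing tool: the class `RegisterEntry`, its fields and the application names are all hypothetical, and a real register would live in shared infrastructure rather than in a single script. It only shows how a record of source, purpose and lifespan lets you check whether further processing still fits the original context.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class RegisterEntry:
    """One data element in a hypothetical processing register."""
    name: str              # the data element, e.g. "email_address"
    source: str            # how the element was collected
    purpose: str           # the context in which it was collected
    collected_on: date
    retention: timedelta   # intended lifespan within the organization
    allowed_apps: set = field(default_factory=set)  # apps with an explicit grant

    def expired(self, today: date) -> bool:
        # The element has outlived the lifespan recorded in the register.
        return today > self.collected_on + self.retention

    def may_process(self, app: str, purpose: str, today: date) -> bool:
        # The "contract": the app needs an explicit grant, the purpose must
        # match the collection context, and the data must not be expired.
        return (
            app in self.allowed_apps
            and purpose == self.purpose
            and not self.expired(today)
        )


# Illustrative entry; all values are made up for the example.
entry = RegisterEntry(
    name="email_address",
    source="webshop checkout form",
    purpose="order_confirmation",
    collected_on=date(2018, 1, 10),
    retention=timedelta(days=365),
    allowed_apps={"order_service"},
)

today = date(2018, 6, 1)
print(entry.may_process("order_service", "order_confirmation", today))
print(entry.may_process("marketing_service", "order_confirmation", today))
```

In this sketch the order service is granted access while the marketing service is refused, because it was never given an explicit grant for this data element.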
Data Scientists and The GDPR
For data scientists, the law isn't all that shocking, says Giovanni Lanzani, Chief Science Officer with GoDataDriven. "The current law is already pretty strict about personal information. Personal information has to be processed in accordance with the law. You already need permission, or at least a legitimate interest, to use certain data. There are no new rules for profiling, either. What is new is that when you make automated decisions based on data, your customers get the right to an explanation of how that decision was made, provided that the decision has a significant impact on people's lives." Think, for example, of automated decisions in recruitment or the granting of a loan. Profiling for direct marketing is still possible. However, the GDPR gives consumers the right to object to it.
This part of the law about profiling requires that data scientists know very well which data elements are used for a judgement and how they influence the algorithm's decision. "But it's also expected that you know this now as a data scientist, because that's the only way to develop good algorithms."
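As an illustration of knowing which data elements drive a decision, the following Python sketch breaks down the outcome of a simple linear (logistic) scoring model into per-feature contributions. The model, its weights and the feature names are invented for this example and do not come from the interview; more complex models need other explanation techniques, and whether such a breakdown satisfies the legal requirement is a separate question.

```python
import math

# Hypothetical weights of a linear (logistic) loan-scoring model.
WEIGHTS = {"income_k_eur": 0.04, "years_employed": 0.30, "missed_payments": -0.90}
BIAS = -2.0


def explain(applicant: dict) -> dict:
    """Score one applicant and report how much each data element contributed."""
    # In a linear model, each feature's contribution to the logit is weight * value.
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Sort so the most influential data element comes first.
    ranked = dict(
        sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    )
    return {
        "approved": probability >= 0.5,
        "probability": probability,
        "contributions": ranked,
    }


result = explain({"income_k_eur": 40, "years_employed": 2, "missed_payments": 3})
print(result["approved"])
for name, value in result["contributions"].items():
    print(f"{name}: {value:+.2f}")
```

For this made-up applicant the loan is refused, and the ranking shows that the missed payments weigh most heavily in that outcome.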
More Defensive Coding
Lanzani does expect that data scientists will start coding more defensively due to the GDPR, and will therefore build in many extra checks. "Imagine you have an online store and want to send out a marketing email. For that, you get a dump from the online store data. If things are as they should be, the people who deliver that dump have filtered out the people who have indicated that they don't want to receive emails, but in large organizations these processes don't always work perfectly. I can imagine that you will build in extra checks now. A metadata register can help you with this."
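The extra check Lanzani describes could look roughly like the Python sketch below. The field names `email` and `marketing_consent`, the separate opt-out list and the function `filter_recipients` are assumptions made for this example. The point is only that the code re-checks consent itself instead of trusting that the dump was filtered upstream, and that it leaves out any record whose consent is unclear.

```python
def filter_recipients(dump, opted_out):
    """Return only the addresses that may be emailed; skip anything doubtful."""
    blocked = {email.strip().lower() for email in opted_out}
    allowed = []
    for row in dump:
        email = (row.get("email") or "").strip().lower()
        if not email:
            continue  # no usable address
        if row.get("marketing_consent") is not True:
            continue  # consent missing or not explicitly given
        if email in blocked:
            continue  # customer opted out, even if the dump says otherwise
        allowed.append(email)
    return allowed


# Illustrative dump; all records are made up.
dump = [
    {"email": "Anna@example.com", "marketing_consent": True},
    {"email": "bob@example.com", "marketing_consent": True},  # on the opt-out list
    {"email": "carol@example.com"},                           # no consent flag
    {"email": None, "marketing_consent": True},               # no address
]
opted_out = ["BOB@example.com "]

recipients = filter_recipients(dump, opted_out)
print(recipients)
```

Only Anna's address survives the checks here. A metadata register, where one exists, could supply the opt-out list and the consent flags instead of hard-coded values.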
The right to data portability also means the GDPR offers opportunities for data scientists, Lanzani says. "If consumers can actually take their data from one company to another, we can serve them much better."
100 Percent Compliant
Let it be clear: when organizations say that they want to be 100 percent compliant, there's no point unless they make clear and deliberate choices and then restructure their processes accordingly. Many organizations aren't there yet.
The GDPR is a good reason for organizations to take a close look at their data storage and usage, says Van Balen. "In order to then start making the necessary changes within that organization. Because May 25th will be when it really begins."
Juliette van Balen, Dave van Stein and Giovanni Lanzani teach the GDPR & Data Privacy training at Xebia Academy. They will each discuss the most important aspects of the law from their own point of view.