Ethics in Data, Weekly Reflections
A series of six weekly installments covering current and relevant topics relating to ethics in data
Over the next six weeks, I will be writing about relevant topics pertaining to ethics in data. As a software developer and AI/ML practitioner, I need to put more effort into understanding the ethical implications of the industry I belong to and the work that I do.
In June 2021, GitHub released Copilot, a code completion tool that uses machine learning to write code for you. Marketed as an AI Pair Programmer, it's intended to help developers write code faster. Copilot has been trained on billions of lines of code from various sources, including public Open Source projects on GitHub and other public repositories.
Nearly a year to the day later, Amazon released a preview of Amazon CodeWhisperer, a tool similar to Copilot.
As a software developer, I am excited about the possibilities of these tools. I am always on the lookout for tools that can help me write better code. As awesome as these tools are, I am also concerned about their ethical implications. Given that they were trained on Open Source code, what does this mean for the source code they generate?
From the point of view of GitHub and Amazon, it makes great business sense, as they use open source code freely available on the web. It's akin to a mining company that extracts resources for free, then sells them for a huge profit.
I am almost certain that the original authors did not intend their code to be used in this manner. A class action has been filed against Copilot on behalf of the millions of GitHub users whose code was used to train the tool. The outcome of this litigation could have significant implications for these products, the millions of developers already using them, and the authors of the code used to train them.
Let's start this second week's reflection by watching this short video. It's about a company called Humanyze, which uses digital badges to track the movements of employees in the workplace. Calling their technology People Analytics, they claim that it can help companies improve productivity and employee engagement. The device hears and knows everything you are doing, for every second you spend in the office.
It is an older video; however, this technology is still being used by thousands of organisations today. It raises the question: is this ethical? There are a few ways to dissect this question, but in this instance let's use a simple framework to help us come up with an answer.
The framework we will use for this ethical dilemma is called the deontological framework. It is actually quite straightforward to apply, as it only requires people to follow the rules and simply do their duty. With this framework, we don't even have to think about the consequences of our decisions.
For example, in this scenario, as long as the company can prove that it has the consent of the workers using the device, and is not doing anything illegal, then it is ethical. Humanyze claims that none of the data collected is listened to for content; it is analysed only for patterns of interaction, and no identifiable information is collected at all. As for the organisations using these badges in their offices, they come from a good place: their aim is not to put their employees under surveillance, but rather to help them enjoy their work more.
All the data collected is 100% anonymous. It helps identify and diagnose issues in the workplace that can affect performance and employee engagement. It can then quantify the costs, opportunities and risks, so that the company can understand the impact of its changes and decisions.
Thousands of organisations have reported significant increases in productivity and employee retention across the board, so they must be doing something right.
What do you think: is it ethical?