Unlocking AI potential in e-commerce and online business – ethics

Ethics in AI

Welcome! Our “Unlocking AI potential in e-commerce and online business” series aims to provide basic guidance on applying AI (Artificial Intelligence) in businesses which use the internet as a primary source for delivering value to their customers. In this article, we focus on ethical aspects of training and using AI models.

  • AI is biased
  • AI is sexist
  • AI is racist
  • AI is unfair
  • AI is not reliable
  • Can we trust AI? Who is responsible for the decisions it makes?

We see headlines like these more and more often, especially in human resources, healthcare and almost every other field where personal data is used for machine learning. Our last article revealed what is happening behind the scenes of selected artificial intelligence models. Let’s put these insights into context. Enjoy your reading!


Are you new to AI and Machine Learning? If so, we recommend the first article in our “Unlocking AI potential in e-commerce and online business” series, which deals with basic concepts.


Black mirror

The Confederation of British Industry recently published a report recommending that companies “monitor” decisions made by artificial intelligence and audit how personal data is being used. Many officials and decision makers are aware of the disruption AI could bring to many industries and to everyday life.

AI is not a black box, AI is a black mirror

In the EU, for example, the General Data Protection Regulation (GDPR) could be considered as an official response to new, data-powered technologies. Similarly, the California Consumer Privacy Act (CCPA) sets the bar for U.S. companies. It is probably just a matter of time before lawmakers in other states and countries follow.

In this article, we will not comment on the political dimension of the whole thing. Instead, we will focus on practical information which could be used on a daily basis.

Last fall, there was much talk about Amazon scrapping its AI recruiting tool. The reason was a bias against CVs containing the word “women”. Similarly, Google Photos caused embarrassment when its machine learning model labeled a photo of a black couple as “gorillas”. How is something like that even possible? Data.

As you sow…

In a way, we can think of machine learning as a tool for building a mathematical representation of the training data. In her research, Dr. J.J. Bryson, an AI expert from the University of Bath, states that if we apply machine learning to historical text data, we will implicitly include historical biases as well. Likewise, if we train our NLP (Natural Language Processing) model on texts from current discussions, we need to deal with current stereotypes.



But here comes the question – who will decide what is still acceptable and what has crossed the line? Is it possible to develop “fair” artificial intelligence? If we want to try, we need to think about related issues:

  • What kinds of stereotypes could be present in the population that produces our data?
  • Is it possible to identify bias in our training data for machine learning?
  • Which features are connected to discrimination and could potentially cause the model to behave unfairly?

The “risky” variables are obviously race, gender and similar “human features”. The problem is that in some cases it is precisely this information that determines the right product or service for the customer. For instance, skin color and skin type could be quite important when training a recommender for cosmetic products.
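
One simple way to look for bias in training data is to compare how often the positive label occurs in each group of a “risky” variable. The sketch below is only an illustration under our own assumptions: the toy hiring records, the column names and the helper functions are hypothetical, and a large gap is a reason to investigate, not proof of discrimination.

```python
from collections import defaultdict

def positive_rate_by_group(records, group_key, label_key):
    """Return the share of positive labels for each value of group_key."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[label_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

def parity_gap(rates):
    """Difference between the highest and the lowest positive rate."""
    return max(rates.values()) - min(rates.values())

# Hypothetical toy data: past hiring decisions with a gender column.
history = [
    {"gender": "f", "hired": False},
    {"gender": "f", "hired": True},
    {"gender": "f", "hired": False},
    {"gender": "f", "hired": False},
    {"gender": "m", "hired": True},
    {"gender": "m", "hired": True},
    {"gender": "m", "hired": False},
    {"gender": "m", "hired": True},
]

rates = positive_rate_by_group(history, "gender", "hired")
gap = parity_gap(rates)
print(rates, gap)
```

A model trained on data like this would most likely learn the gap together with everything else, which is why the check is worth running before training.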


How to deal with data selection and knowledge representation in AI projects? You can find a few tips in the third article in our “Unlocking AI potential in e-commerce and online business” series!


When preparing the training data, it is good to consider not only the business domain (or the product category) but also the moral boundaries of the target customer segment which will be the source of our data or the end user of our model.

Training the “censor”

Another tough task for AI is to distinguish what is meant seriously from what is just humor, irony or sarcasm. Detecting such language constructions requires a deeper understanding of the context, and let’s face it, that is sometimes difficult not only for artificial intelligence. Joking aside, this becomes a real problem in online content moderation. The Verge nicely describes such troubles at companies like Facebook.

We can always prepare the training data by hand and label particular phrases as “right” or “wrong”. However, without capturing what exactly makes things funny, our AI will probably just memorize particular sentences without being able to generalize. It often makes more sense to build a rule-based tool which sends all suspicious content to a moderator for approval or, just to be sure, filters it out.
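
A rule-based moderation tool of this kind can be very small. The following is a minimal sketch, assuming two hypothetical pattern lists of our own invention: one for content that is filtered out immediately and one for content that goes to a human moderator.

```python
import re

# Hypothetical rule lists; a real deployment would maintain its own.
BLOCK_PATTERNS = [re.compile(r"\bbuy followers\b", re.IGNORECASE)]
REVIEW_PATTERNS = [
    re.compile(r"\bidiot\b", re.IGNORECASE),
    re.compile(r"\bhate\b", re.IGNORECASE),
]

def route_comment(text):
    """Return 'block', 'review' or 'publish' for a piece of user content."""
    if any(pattern.search(text) for pattern in BLOCK_PATTERNS):
        return "block"
    if any(pattern.search(text) for pattern in REVIEW_PATTERNS):
        return "review"
    return "publish"

print(route_comment("I hate waiting for delivery"))
```

Note that the harmless complaint about delivery still lands in the review queue: the rules cannot tell frustration from abuse, so the final call stays with a person.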

Falsifying data

Anyway, we need to decide to what extent we are willing to censor free speech. The fundamental question is:

  • Do we want to use AI to improve our products and services or as a tool for enforcing our own beliefs?

In the previously mentioned example of labeling a black couple as “gorillas” we can see how Google approached the problem. Two years after the embarrassment, Google Photos preferred to ignore gorillas completely.

Everything is relative, everything is subjective

We have already suggested that some problems are difficult not only for AI but for humans as well. Are we, humans, intelligent enough to create something which might one day replace our own decisions? Let’s think about how our thought process works and how to use this knowledge to design AI.

Illusion

Rorschach test

No matter whether we are reading a text or looking at a picture, we always base our decisions on our knowledge and experience but, most importantly, on our own subjective feelings. This principle inspired the Swiss psychologist Hermann Rorschach to create his famous inkblot test. A subject (human) is presented with a series of inkblots and is asked to describe what is in the picture. The individual’s thoughts and feelings are involuntarily projected into the vague shape of the inkblot.

Using a bit of imagination, we can say that artificial intelligence faces the Rorschach test each time it is presented with test data. The input is usually a feature vector – a particular combination of numbers or encoded information. Unfortunately, it often lacks context, and the model has no option to ask for more details. The output of the model is basically a “melt-down product” of the training samples which are most similar to the input according to the chosen metric.
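
The idea of an output blended from the most similar training samples is easiest to see in a k-nearest-neighbors classifier. This is a minimal sketch, not a description of any particular production model; the two-dimensional feature vectors and the labels are made up for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(training, query, k=3, metric=euclidean):
    """Majority label among the k training samples closest to the query."""
    nearest = sorted(training, key=lambda sample: metric(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (feature vector, label).
training = [
    ((1.0, 1.0), "duck"),
    ((1.2, 0.8), "duck"),
    ((0.9, 1.1), "duck"),
    ((5.0, 5.0), "rabbit"),
    ((5.2, 4.9), "rabbit"),
    ((4.8, 5.1), "rabbit"),
]

print(knn_predict(training, (1.1, 1.0)))
```

For an input halfway between the two clusters, the function still returns one of the labels. It has no way to say “I need more context”, which is exactly the inkblot situation.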

A duck or a rabbit?

Data with multiple meanings is tough for AI as well. This applies to images and language but also to other types of data, such as signals or online events.

Connecting the pieces

As mentioned earlier, our brain uses knowledge and context information in addition to historical data. Let’s illustrate this with the following two images. Which bear is alive?

Which bear is alive?

The correct answer is picture 1. What kind of knowledge and contextual information did we use to solve this puzzle?

  • The bear is moving: We know about the existence of such a thing as movement. We are aware that some kind of force is needed for movement to happen. We also know that this force needs some kind of source. We have a basic knowledge of a bear’s anatomy as well. The position of the bear’s body suggests that the source of the movement force is probably the bear itself. Only live bears are able to move on their own.
  • Water splashing and wet fur: We know about the existence of such a thing as water. We are able to detect it because it looks transparent. We know it can take the form of a river or lake. We have a basic knowledge of physics and we know that some kind of force must be applied to make the water splash. We also know that a bear’s fur looks different when it comes into contact with water. The bear’s fur has wet spots and the position of the bear’s body suggests that it is the source of the force which causes the water to splash. Only live bears are capable of such a thing.
  • The bear is fighting with another bear: We have some basic knowledge about bears and their natural habits such as fighting for food or territory. We know what a fight is and what it looks like. We also know that it could occur between two live individuals. Only live bears are capable of fighting.

How do we know that the bear in picture 2 is not alive? Apart from the water splashing and the fighting, we could apply several thoughts in the same way as in picture 1. The bear looks quite natural and it also seems to be moving. The key evidence is the small label stating “Grizzly Bear” which is placed in front of the bear. We know about the existence of such things as a museum and taxidermy. We also know that in the museum, exhibits are usually marked with labels. Live bears don’t use labels.

What did we just do?

  1. A decomposition of the problem into parts
  2. Each part of the problem was solved separately, using relevant domain knowledge and experience
  3. A synthesis of partial solutions into the final result

In this case, our decision-making process could be described as a fallback model which returns the most probable option and improves its accuracy with every further piece of evidence.
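
One possible way to formalize “improves with every further piece of evidence” is Bayesian updating. This is our own reading of the description, not the only one, and the likelihood values below are invented for illustration. The sketch also assumes the clues are independent of each other, which is a simplification.

```python
def update(prior, likelihood_if_alive, likelihood_if_not):
    """Bayes' rule for a binary hypothesis: returns P(alive | clue)."""
    numerator = likelihood_if_alive * prior
    evidence = numerator + likelihood_if_not * (1.0 - prior)
    return numerator / evidence

# Hypothetical likelihoods: (P(clue | alive), P(clue | not alive)).
clues = {
    "water splashing": (0.6, 0.05),
    "fighting another bear": (0.3, 0.01),
}

belief = 0.5  # start undecided
for name, (p_alive, p_not) in clues.items():
    belief = update(belief, p_alive, p_not)
    print(name, round(belief, 3))

# A museum label works in the opposite direction.
belief_with_label = update(0.5, 0.01, 0.9)
```

Each clue moves the estimate further towards “alive”, while a single strong clue such as the museum label pushes it close to zero.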

We can take a similar approach in machine learning when solving a complex problem. The final solution can be a combination of several models, each specializing in a particular domain. Besides image recognition, fake news detection is a good example: suspicious content can be decomposed into single statements, each statement can be validated, and the partial results can be synthesized into the final decision. Our success depends on how well we can capture the context information using the available data.
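
The decompose, validate and synthesize steps can be sketched as three small functions. Everything here is a placeholder: the sentence splitting is naive, and the `KNOWN_FACTS` lookup is a hypothetical stand-in for whatever specialized model or service would validate a statement in practice.

```python
def decompose(text):
    """Split content into single statements (naively, one per sentence)."""
    return [part.strip() for part in text.split(".") if part.strip()]

# Hypothetical lookup standing in for a real validation model.
KNOWN_FACTS = {
    "bears can swim": True,
    "bears use labels": False,
}

def validate(statement):
    """Return True or False for known statements, None when unknown."""
    return KNOWN_FACTS.get(statement.lower())

def synthesize(results):
    """Combine the partial results into one decision."""
    if any(result is False for result in results):
        return "suspicious"
    if results and all(result is True for result in results):
        return "ok"
    return "needs review"

def check(text):
    return synthesize([validate(statement) for statement in decompose(text)])

print(check("Bears can swim. Bears use labels."))
```

The design keeps the three steps separate so that each one can later be replaced by a better model without touching the others.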


How to successfully execute AI-related projects? You can find a few tips in this article in our “Unlocking AI potential in e-commerce and online business” series!


I know how to deal with ethical issues in AI, what next?

If you want to benefit from AI, you need a few things: a proper understanding of the AI project life cycle, the right data and the right people. We will cover all these aspects in our series “Unlocking AI potential in e-commerce and online business”. Stay tuned!

Unlocking AI potential in e-commerce and online business – basic concepts

Welcome! Our “Unlocking AI potential in e-commerce and online business” series aims to provide basic guidance on applying AI (Artificial Intelligence) in businesses which use the internet as a primary source for delivering value to their customers. In this article, we focus on explaining some terms and buzzwords which are often used in connection with AI. Enjoy your reading!

How can we profit from AI?

This is a question many CEOs and CMOs are asking us. No wonder: according to a recent article published by Google:

  • 85% of executives believe AI will allow their companies to obtain or sustain a competitive advantage
  • 66% of marketing leaders agree automation and machine learning will enable their team to focus more on strategic marketing activities

How to successfully execute AI-related projects? You can find a few tips in the second article in our “Unlocking AI potential in e-commerce and online business” series!


AI has been around for 60 years and has evolved from simple rule-based systems into advanced mathematical models capable of adapting to changing conditions in a real environment. Powerful hardware in combination with data availability enables machine learning solutions to be developed for almost every e-commerce and online business. Nevertheless, there are several questions each executive should ask before starting an AI-oriented project:

  • What exactly does AI mean in the context of our business?
  • Which phases will the AI project have and what kind of preparation is needed?
  • Should we build an in-house team? What tasks should we outsource?
  • How do we measure progress?
  • Will the investment into AI pay off?

The answers to these questions lead to a precise plan of how to get value from AI and machine learning; we will help you find them.

Is AI just a fancy name for statistics?


One issue with Artificial Intelligence is that there is no exact definition of what it really stands for. The most common approach is to relate AI to human behavior:

“Artificial Intelligence is the science of making machines do things that would require intelligence if done by men.” (Marvin Minsky, 1967)

But here comes the problem – is human behavior always intelligent? Can we expect “rational” behavior from our customers when developing a predictive machine learning solution? The most important part of each AI project is the initial business analysis, which should reveal general aspects of the problem we are trying to solve. We want experiments and simulations to be as realistic as possible; otherwise, we risk developing an inconsistent mathematical model that performs poorly in a real environment. The amount of customer irrationality and the level of abstraction of the business case indicate the requirements for the solution.

Human behavior vs. intelligent behavior

Based on the complexity, we can split AI projects into four categories:

  1. Rule-based systems without internal representation of the real world: simple IF-THEN rules; e.g. an e-mailing program which sends messages based on the day of the week and the customer demographic characteristics
  2. Rule-based systems with internal representation of the real world: similar to the previous one, but enhanced by definition of states and relations between the entities in the environment; e.g. a recommendation engine recommending products to the customer based on his or her recent online journey
  3. Machine learning systems without continuous learning: a mathematical model is trained on historical or prepared training data; no further learning is done after deployment; e.g. a random forest for churn prediction trained on the historical behavioral data
  4. Machine learning systems with continuous learning: a mathematical model is updated continuously, using recent or real-time data; e.g. a dynamic customer segmentation used by social media companies
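
To make the first category concrete, here is a minimal sketch of an e-mailing rule set driven only by the day of the week and customer demographics. The campaign names, the age threshold and the country code are hypothetical.

```python
import datetime

def pick_campaign(today, customer):
    """Return a campaign name or None, using fixed IF-THEN rules."""
    weekday = today.weekday()  # Monday is 0, Sunday is 6
    if weekday == 4 and customer["age"] < 30:
        return "weekend-party-offer"
    if weekday == 0 and customer["country"] == "DE":
        return "monday-free-shipping"
    return None

friday = datetime.date(2024, 1, 5)
print(pick_campaign(friday, {"age": 25, "country": "US"}))
```

There is no model of the world and no learning here: the behavior changes only when somebody edits the rules.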

Therefore, as a scientific discipline, AI covers everything from simple decision trees to advanced mathematical models.

Sometimes people ask us, “Hey, can we borrow your AI for a while and test it?” We should point out that AI is an approach to problem-solving rather than one particular computer program or algorithm. From a philosophical point of view, we can define two types of AI:

  • Weak AI: a simulation of the human decision-making process using mathematical models and available data – this basically covers all instances and implementations of AI nowadays
  • Strong AI: a machine or algorithm which realizes its own existence, has its own desires and goals and can reproduce itself (make another machine or algorithm with similar properties) – this is the subject of philosophical and scientific debates and the inspiration for sci-fi books and movies; nevertheless, we will probably not see anything like this in the near future

How to deal with data selection and knowledge representation in AI projects? You can find a few tips in the third article in our “Unlocking AI potential in e-commerce and online business” series!


How to leverage AI in e-commerce?

There are many drivers of success in e-commerce and internet companies. Based on data sources and the nature of the AI solution we can distinguish two main directions:

  1. Improvement of customer satisfaction
  2. Optimization of internal processes

1. Improvement of customer satisfaction

A happier customer leads to bigger profits. Better personalization and customer support lead to a happier customer. AI could be a real game changer in these areas.

Marketing persona

The most commonly known type of solution is probably a product recommendation engine which chooses products to show to the customer based on their transaction and browsing history. In some cases, that alone can boost online sales by tens of percent. Besides the product recommendation, almost every piece of online communication, including banners, navigation and messages, could be personalized in real time using the right data.
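
A very simple recommendation engine can be built from nothing more than counts of products bought together. The sketch below assumes hypothetical shopping baskets and uses plain co-occurrence counts; real engines typically add browsing signals, weighting and far more data.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(baskets):
    """Count how often each pair of products appears in the same basket."""
    counts = {}
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts

def recommend(counts, history, top_n=2):
    """Rank products that co-occur with items in the customer's history."""
    scores = Counter()
    for item in history:
        for other, n in counts.get(item, {}).items():
            if other not in history:
                scores[other] += n
    return [product for product, _ in scores.most_common(top_n)]

# Hypothetical transaction history.
baskets = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "sleeping bag", "lamp"],
    ["stove", "lamp"],
]

counts = build_cooccurrence(baskets)
suggestions = recommend(counts, ["tent"], top_n=1)
print(suggestions)
```

Products the customer already has in their history are skipped, so the suggestions are always something new to them.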

Advanced algorithms can dynamically define marketing personas or identify the ones you have already created. It is much easier to increase sales when you know which prospects are ready to buy. Customer segmentation is a big deal and can be a basis for the whole marketing strategy.

2. Optimization of internal processes

Another way to leverage AI in e-commerce is to use NLP (Natural Language Processing) techniques for semantic modeling and keyword optimization. This helps you create product tags and categories automatically. Many companies do this manually, employing several people to read product descriptions each time a new product comes out. This is not only time-consuming; it also creates lots of synonyms among tags and keywords, which makes further semantic analysis almost impossible and reduces the performance of the search engine. With NLP, you can standardize the whole process, save time and money and improve product search for your customers.
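
The synonym problem can be illustrated with a tiny tag normalizer. The synonym table below is hypothetical and written by hand; in an NLP-based solution it could instead be derived from the data, for example from word embeddings.

```python
# Hypothetical synonym table mapping tag variants to one canonical tag.
CANONICAL = {
    "tee": "t-shirt",
    "tshirt": "t-shirt",
    "t shirt": "t-shirt",
    "sneakers": "trainers",
    "running shoes": "trainers",
}

def normalize_tag(tag):
    """Lowercase, collapse whitespace and map known synonyms."""
    key = " ".join(tag.lower().split())
    return CANONICAL.get(key, key)

def normalize_tags(tags):
    """Normalize, de-duplicate and sort a list of product tags."""
    return sorted({normalize_tag(tag) for tag in tags})

print(normalize_tags(["Tee", "T  Shirt", "sneakers", "Trainers"]))
```

Four hand-written tags collapse into two canonical ones, which is what makes later search and semantic analysis workable.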

NLP techniques - word embedding

Many e-commerce companies use a BI (Business Intelligence) solution to support their strategic decisions. If you are one of them, it may interest you that more and more manual BI-related tasks are being replaced by machine-learning-driven automation. In fact, it is quite easy to delegate simple BI tasks to AI algorithms so that your experts can focus on the tough ones. Typical areas for this kind of optimization are pricing, buying, logistics and warehousing. You can also use smart algorithms as an alarm system that detects unusual developments in your data, whether they signal possible threats or interesting business opportunities.
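
As an illustration of such an alarm system, the sketch below flags days whose value lies far from the average, measured in standard deviations (a z-score). The daily order counts and the threshold of 2.0 are assumptions chosen for the example; a production system would also account for trends and seasonality.

```python
import statistics

def unusual_days(values, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / spread > threshold]

# Hypothetical daily order counts; the last day is a spike.
daily_orders = [100, 98, 103, 101, 99, 102, 100, 180]

alerts = unusual_days(daily_orders)
print(alerts)
```

Whether the spike on the last day is a threat (for example, fraudulent orders) or an opportunity (a campaign that worked) is still a question for a human analyst.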

Should we implement AI in our company?

If you want to benefit from AI, you need a few things: a proper understanding of the AI project life cycle, the right data and the right people. We will cover all these aspects in our series “Unlocking AI potential in e-commerce and online business”. Stay tuned!
