Welcome! Our “Unlocking AI potential in e-commerce and online business” series aims to provide basic guidance on applying AI (Artificial Intelligence) in businesses which use the internet as a primary source for delivering value to their customers. In this article, we focus on the processes and workflow in AI-related projects. Enjoy your reading!
AI project life cycle
A common beginner’s mistake is considering an AI project to be a regular software development project. That’s why we think it’s important to point out how these two differ in the workflow, goals and deliverables.
Are you new to AI and Machine Learning? We recommend the first article in our “Unlocking AI potential in e-commerce and online business” series, which covers the basic concepts.
AI development is not just software development
Sometimes we work with companies that are trying to use tools and procedures typical for engineering processes when it comes to prototyping an AI. In those cases, software engineers are in charge of analytical tasks. This often leads to a wrong goal metric, missed deadlines and time wasted on unnecessary meetings.
Software development is like building a house. The goal is obvious from the very beginning, defined via product features whose specification usually does not change during the whole development process. Milestones are clear, progress can be easily measured and each product feature can be tested using unit tests. You can easily split independent tasks among several development teams.
AI development, on the other hand, is iterative research which could be compared to solving a puzzle or finding a path in a maze. The primary goal can change repeatedly based on new findings from the data. Instead of product features, there is the precision and consistency of mathematical models, which can be tricky to measure correctly. Success or failure depends on the quality of the data and the ability to simulate reality. All these aspects mean you need to plan for greater uncertainty, caused by data variability and changing conditions. Even when the original goal shifts, you will usually gain valuable insights which help you with strategic decisions and push your business forward.
When it comes to creating a team, quality is much more important than quantity. Two or three experts who are familiar with the business domain and state-of-the-art technologies will usually outperform an army of newcomers and junior analysts. That’s because AI projects are about the right ideas and inventive thinking, not just about the number of man-days spent coding.
If you want to leverage AI in e-commerce and online business, you need the right tools. One of them is a good understanding of AI project processes.
CRISP-DM and its role in AI projects
Do you already have some experience with analytics? If you do, you might be familiar with CRISP-DM (Cross Industry Standard Process for Data Mining). It’s been around since 1996, originally crafted by five companies: Daimler AG, Integral Solutions Ltd (ISL), NCR Corporation, OHRA and Teradata. CRISP-DM is a widely used process model which describes the workflow typical for almost any data-related project. Thankfully, AI projects are in many ways similar to regular data-related projects.
- A good process for AI development enables quick orientation in the project life cycle and helps you not to forget about important things.
At pbi.ai we use a three-phase methodology inspired by CRISP-DM:
- Discovery phase – The first phase focuses on problem understanding and initial data analysis. The main purpose of this phase is to assess available options, make a decision regarding the overall strategy and define clear, quantifiable and measurable goals.
- Prototyping phase – Mathematical modeling, simulations and experiments with AI algorithms happen in the second phase. Most of the time is usually spent on feature engineering and optimization of machine learning hyperparameters.
- Deployment phase – In the final phase, the code of the AI prototype is optimized and deployed into the production environment. Depending on the situation, the final AI program is used as a stand-alone module or integrated into the streaming pipelines and ETL procedures (ETL – Extract, Transform, Load).
Real case scenario
Now that you know the basics, you might be wondering how this works in the real world. Imagine a fictional company named Fictional Online Fashion Store which would like to personalize its website content in real time and offer customers relevant products. Now let’s see how to approach the project from the management point of view.
The essential part of each AI project, and the core of the discovery phase, is understanding the problem. We’ve seen many companies underestimate this and dive head-first into coding complex mathematical models and torturing data to achieve performance slightly better than a random guess. Instead, the most valuable activity at the start of the project is brainstorming and thinking about knowledge representation.
One of the most important tasks is to create an ontology. An ontology is a description of entities and their relations. In our case, this could mean revising the product tagging, defining custom variables for online tracking and creating a nomenclature for sequences of customer online actions.
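To make the idea of an ontology more concrete, here is a minimal sketch of how entities and their relations might be written down in Python. All names in it (`Product`, `Action`, `ACTIONS`, `sequence_name`) are hypothetical and chosen only for illustration; a real store would derive them from its own catalogue and tracking setup.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical controlled vocabulary of tracked online actions.
ACTIONS = ("view_category", "view_product", "add_to_cart", "purchase")


@dataclass(frozen=True)
class Product:
    """A product entity with a category and free-form tags."""
    sku: str
    category: str
    tags: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Action:
    """A customer action, optionally related to a product via its SKU."""
    customer_id: str
    kind: str
    sku: Optional[str] = None

    def __post_init__(self):
        # Reject action kinds that are not part of the agreed vocabulary.
        if self.kind not in ACTIONS:
            raise ValueError(f"unknown action kind: {self.kind!r}")


def sequence_name(actions):
    """Name a sequence of actions by joining their kinds in order."""
    return ">".join(action.kind for action in actions)


dress = Product(sku="SKU1", category="dresses", tags=("summer", "floral"))
session = [
    Action("c1", "view_product", dress.sku),
    Action("c1", "add_to_cart", dress.sku),
]
print(sequence_name(session))
```

Validating the action kind at construction time is one possible design choice: it keeps the nomenclature consistent across tracking, analysis and modeling.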
Another part of this phase is crafting a data strategy. We need to analyze available data sources and decide whether we are able to get sufficient information about online customer behavior and the data context. In terms of content personalization, it would be good to check the availability and consistency of the historical online events data. It would also be useful to perform a basic correlation and clustering analysis to get initial insights. This will help determine the most promising direction for the next steps and define baseline goals. A well-defined goal in our case could be, for example, to increase the number of product detail views in the target customer segment, during a particular period of time, in the selected location and validated through A/B testing.
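As an illustration of the basic correlation analysis mentioned above, the following sketch computes the Pearson correlation between a few session attributes and the number of product detail views. The session records and field names (`pages_viewed`, `searches`, `detail_views`) are made up for the example and are not taken from any real data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-session aggregates (illustrative numbers only).
sessions = [
    {"pages_viewed": 3, "searches": 0, "detail_views": 1},
    {"pages_viewed": 8, "searches": 2, "detail_views": 4},
    {"pages_viewed": 5, "searches": 1, "detail_views": 2},
    {"pages_viewed": 12, "searches": 1, "detail_views": 6},
    {"pages_viewed": 2, "searches": 3, "detail_views": 0},
]

target = [s["detail_views"] for s in sessions]
for name in ("pages_viewed", "searches"):
    r = pearson([s[name] for s in sessions], target)
    print(f"{name}: r = {r:.2f}")
```

A quick check like this will not prove causality, but it can help decide which attributes deserve a closer look in the prototyping phase.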
How to deal with data selection and knowledge representation in AI projects? You can find a few tips in the third article in our “Unlocking AI potential in e-commerce and online business” series!
Once an overall strategy is crafted, we can proceed to the prototyping phase. Prototyping is an iterative, hypothesis-driven research process focused on finding the best performing combination of mathematical models and features.
In the case of our Fictional Online Fashion Store and the website content customization, we will probably dig into the transaction history in search of good features. Good features enable a simple and robust AI model to be developed. A simple model based on good features usually outperforms a complex model based on bad features. It is also much better in terms of future maintenance and adjustments.
In our case, we need to find patterns in the online shopping behavior and to identify customer and product attributes linked to the buying decision process. Besides features for our model, we will get valuable insights for other marketing tasks like session scoring or lead scoring.
When dealing with online data, make sure its context is examined as well. Common external factors include weather, what competitors offer or trends among celebrities. All these aspects may influence the customer in a particular season or time period and cause anomalies in the data. If we manage to represent this information in our feature vector, we are on the right track to a machine learning model with consistent performance.
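A minimal sketch of how such context could end up in a feature vector is shown below. The field names (`temperature_c`, `competitor_sale`), the choice of a one-hot encoded season and the northern-hemisphere season mapping are all assumptions made for the example.

```python
from datetime import date

SEASONS = ("winter", "spring", "summer", "autumn")


def season_of(day):
    """Map a date to a meteorological season (northern hemisphere assumed)."""
    return SEASONS[(day.month % 12) // 3]


def build_feature_vector(session, context):
    """Combine behavioral features with external context into one flat vector."""
    season = season_of(session["day"])
    season_one_hot = [1.0 if season == s else 0.0 for s in SEASONS]
    return [
        float(session["pages_viewed"]),              # behavior
        float(session["detail_views"]),              # behavior
        float(context["temperature_c"]),             # weather context
        1.0 if context["competitor_sale"] else 0.0,  # competitor context
        *season_one_hot,                             # seasonal context
    ]


# Hypothetical session and context records.
session = {"day": date(2024, 7, 15), "pages_viewed": 5, "detail_views": 2}
context = {"temperature_c": 28.5, "competitor_sale": True}
print(build_feature_vector(session, context))
```

Keeping behavior and context in the same vector lets the model learn, for example, that a drop in views during a competitor's sale is not a change in customer taste.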
The final phase covers implementation of the AI prototype into the production environment and its validation on real data. AI models, for the most part, work as stand-alone modules. Thus, the deployment phase is usually about building an API and a database schema to store the model values.
Another task typical for this phase is the development of ETL pipelines to clean data and prepare features for machine learning. It is useful to integrate some logging and define inputs for reporting so that performance can be measured and recorded continuously.
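The sketch below shows one possible shape of such a pipeline with logging built in. It works on in-memory data only; the event fields (`customer_id`, `action`) and the dictionary used as a feature store are hypothetical stand-ins for a real tracking source and database.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("etl")


def extract(raw_events):
    """Extract: a real pipeline would read from a tracking store here."""
    events = list(raw_events)
    log.info("extracted %d events", len(events))
    return events


def transform(events):
    """Transform: drop incomplete events and count actions per customer."""
    features = {}
    dropped = 0
    for event in events:
        customer = event.get("customer_id")
        action = event.get("action")
        if not customer or not action:
            dropped += 1  # incomplete record, skip it
            continue
        counts = features.setdefault(customer, {"views": 0, "purchases": 0})
        if action == "view":
            counts["views"] += 1
        elif action == "purchase":
            counts["purchases"] += 1
        # Other action types are kept out of the features in this sketch.
    log.info("kept %d events, dropped %d incomplete",
             len(events) - dropped, dropped)
    return features


def load(features, store):
    """Load: write the prepared features into the (in-memory) store."""
    store.update(features)
    log.info("loaded features for %d customers", len(features))
    return store


raw = [
    {"customer_id": "c1", "action": "view"},
    {"customer_id": "c1", "action": "purchase"},
    {"customer_id": None, "action": "view"},
    {"customer_id": "c2", "action": "view"},
]
feature_store = load(transform(extract(raw)), {})
print(feature_store)
```

Logging the number of kept and dropped records at every run gives reporting a simple data-quality signal alongside the model metrics.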
Once everything is ready, we can test the model, check whether it met the goals and decide on the next steps. Make sure you perform A/B testing from time to time to check the model performance. The external context should be reviewed on a regular basis as well in order to identify new factors which influence customers.
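One common way to evaluate an A/B test on a rate, such as the share of sessions with a product detail view, is a two-proportion z-test. The sketch below is a minimal version using only the standard library; the visitor and conversion counts are invented, and a real evaluation would also need a sample size and test duration fixed in advance.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference of two conversion rates."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one observation")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("test is undefined when the pooled rate is 0 or 1")
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: A = current website, B = personalized content.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be pure chance, which is the kind of evidence needed to say the model met its goal.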
I know how to manage an AI project. What now?
Are you interested in learning more about AI and how it can benefit your business? Now that you know the basics of AI project management, you might be interested in how to select the right data sources and machine learning algorithms for particular use cases. We will cover all these aspects in our series “Unlocking AI potential in e-commerce and online business”. Stay tuned!