An Enterprise Architect's Guide to Machine Learning Series: Part 3


In part 1 of this series, we defined machine learning and made the connection to enterprise architecture. In part 2, we covered the three types of machine learning algorithms. In our third installment, we will explain how to make machine learning algorithms in six steps.

HOW TO CREATE A MACHINE LEARNING ALGORITHM IN 6 STEPS

Enterprise Architects do not need to become junior data scientists, but those who want to bring measurable change to their enterprises need a working knowledge of current topics such as machine learning in order to advise teams on best practices. Below are the six steps to creating a machine learning algorithm.

1. Gather the appropriate data

Identify strong variables that you will later want to query, such as log-in frequency, number of distinct users, number of power users, time since the last contact, and net promoter score from the most recent feedback. Get creative here and think about your business. For example, if your company creates multimedia content for customers, consider incorporating vital statistics about the content, such as word counts and post reach. If your company produces marmalade, include the history of the types of jam available and the average amount purchased in one transaction.
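As a minimal sketch of what such variables might look like once gathered, the following derives a few of them from raw records. The `CustomerFeatures` class, the `build_features` helper, the 30-day window, and all sample values are assumptions made for illustration, not part of any particular system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CustomerFeatures:
    """Hypothetical per-customer input variables for a later model."""
    customer_id: str
    logins_last_30_days: int
    distinct_users: int
    days_since_last_contact: int
    last_nps: int


def build_features(customer_id, login_dates, user_ids, last_contact, last_nps, today):
    """Derive input variables from raw records (illustrative helper)."""
    # Count only log-ins that fall inside the assumed 30-day window.
    recent_logins = [d for d in login_dates if 0 <= (today - d).days < 30]
    return CustomerFeatures(
        customer_id=customer_id,
        logins_last_30_days=len(recent_logins),
        distinct_users=len(set(user_ids)),
        days_since_last_contact=(today - last_contact).days,
        last_nps=last_nps,
    )


# Made-up sample data for one customer.
example = build_features(
    customer_id="C-001",
    login_dates=[date(2024, 1, 2), date(2024, 1, 20), date(2023, 11, 5)],
    user_ids=["anna", "ben", "anna"],
    last_contact=date(2024, 1, 10),
    last_nps=8,
    today=date(2024, 1, 31),
)
print(example)
```

Passing `today` explicitly, rather than reading the system clock, keeps the derived values reproducible when the same data is processed again later.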

2. Create interfaces between your connected systems that store your data

To extract meaningful value from large sets of data, your enterprise needs a range of tools and capabilities: analytics, algorithms, and big data processing. Consider the following: a microservice framework, cloud-based servers, platform as a service (PaaS), and containerization. It is important to have access to the most up-to-date information from all relevant systems in order to extract the most meaningful features. You also need to establish a common key for each customer across all the systems. See the lessons learned section for more information.
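The idea of a common customer key can be sketched as follows. The two source systems (`crm` and `usage`), the field names, and the normalization rule in `normalize_key` are all hypothetical; real systems will usually need a more careful mapping between their identifiers.

```python
def normalize_key(raw_id):
    """Map system-specific spellings such as ' c-001 ' to one canonical form.

    The rule used here (trim whitespace, upper-case) is an assumption for
    illustration only.
    """
    return raw_id.strip().upper()


def merge_by_customer(*systems):
    """Combine records from several systems into one record per customer key."""
    merged = {}
    for records in systems:
        for record in records:
            key = normalize_key(record["customer_id"])
            target = merged.setdefault(key, {"customer_id": key})
            # Copy every field except the system-specific ID itself.
            target.update({k: v for k, v in record.items() if k != "customer_id"})
    return merged


# Made-up exports from two connected systems.
crm = [{"customer_id": "c-001", "last_nps": 8}, {"customer_id": "C-002", "last_nps": 3}]
usage = [{"customer_id": " C-001 ", "logins_last_30_days": 12}]

customers = merge_by_customer(crm, usage)
print(customers["C-001"])
```

Customers that appear in only one system, such as `C-002` here, end up with missing fields, which is worth checking before the data is used for training.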

3. Start simple

A simple database management system will suffice for most projects in the beginning (e.g. Amazon RDS, PostgreSQL, or MySQL). You will need such a database, and it should be independent of your production environment.
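To illustrate how little is needed at the start, the sketch below uses SQLite from the Python standard library as a stand-in for the databases named above. This is a substitution made only so that the example runs without a server. The table name, columns, and rows are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the experiment fully separate from production.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_features (
        customer_id TEXT PRIMARY KEY,
        logins_last_30_days INTEGER,
        last_nps INTEGER
    )
    """
)

# Made-up feature rows, one per customer.
rows = [("C-001", 12, 8), ("C-002", 1, 3), ("C-003", 7, 9)]
conn.executemany("INSERT INTO customer_features VALUES (?, ?, ?)", rows)
conn.commit()

# Query the features later, for example all customers with regular log-ins.
active = conn.execute(
    "SELECT customer_id FROM customer_features "
    "WHERE logins_last_30_days >= ? ORDER BY customer_id",
    (5,),
).fetchall()
print(active)
```

The same schema and queries carry over to PostgreSQL or MySQL with only minor changes, so starting small does not lock in the choice.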

4. Prepare and transform the data

Many algorithms assume that the input variables are not strongly dependent on one another, so you may need to transform your input data. Methods for this include Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA); Quadratic Discriminant Analysis (QDA) is a related technique, although it is typically used as a classifier rather than as a transformation. Eliminating dependent variables helps preserve the quality of your data, which is a prerequisite for many methods. This step can improve the accuracy of your model in later stages.

If you prefer not to transform your data, you can still use Random Forests, which do not require uncorrelated inputs (although they tend to perform better on uncorrelated data).
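Before choosing a transformation, it can help to check which inputs are actually dependent. The sketch below is not PCA or LDA; it is a simpler preliminary check that computes the Pearson correlation between each pair of variables and flags pairs above a threshold. The 0.9 threshold, the column names, and the sample values are assumptions, and the check only detects linear dependence.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return pairs of column names whose absolute correlation meets the threshold."""
    names = sorted(columns)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]


# Made-up input variables; 'sessions' closely tracks 'logins'.
columns = {
    "logins": [1, 2, 3, 4, 5],
    "sessions": [2, 4, 6, 8, 11],
    "last_nps": [9, 3, 7, 4, 8],
}

flagged = correlated_pairs(columns)
print(flagged)
```

A flagged pair is a candidate for dropping one of the two variables or for combining them with a transformation such as PCA.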

5. Choose a suitable machine learning algorithm

Learn about the available machine learning algorithms and which of them suits your challenge. The following commonly used algorithms can be applied to a wide range of data problems:

Linear Regression
Logistic Regression
Decision Tree
Support Vector Machine (SVM)
Naive Bayes
K-Nearest Neighbors (KNN)
K-Means
Random Forest
Dimensionality Reduction Algorithms
Gradient Boosting and AdaBoost

There are numerous methods and algorithms to choose from.

A word of advice: Start simple and get more complicated step by step.

6. Train, test, and re-evaluate the models

This step involves dividing the data into three sets for training, testing, and validation. The training set is used to train the initial machine learning model. The test set is used to evaluate the trained model: how does the model perform on data it has not yet seen? During the testing phase, it is important to calculate accuracy, precision, and recall.
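A three-way split can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the `three_way_split` helper are assumptions for illustration; the appropriate proportions depend on how much data is available.

```python
import random


def three_way_split(records, train=0.6, test=0.2, seed=42):
    """Shuffle the records and split them into training, test, and validation sets.

    Whatever remains after the training and test shares becomes the
    validation set.
    """
    shuffled = list(records)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],
    )


# 500 made-up customer indices stand in for real records.
train_set, test_set, validation_set = three_way_split(range(500))
print(len(train_set), len(test_set), len(validation_set))
```

Shuffling before splitting avoids accidental ordering effects, for example when records are sorted by sign-up date.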

Use a confusion matrix (also called an error matrix), a table layout that visualizes the performance of a supervised learning algorithm by comparing predicted classes against actual classes.
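For a binary problem, the confusion matrix and the three metrics named above can be computed with a few lines of code. This is a minimal sketch. The labels are invented, with 1 as the assumed positive class (for example, "customer churned").

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def metrics(cm):
    """Derive accuracy, precision, and recall from the confusion matrix counts."""
    total = sum(cm.values())
    predicted_positive = cm["tp"] + cm["fp"]
    actual_positive = cm["tp"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total,
        # Guard against division by zero when a class never occurs.
        "precision": cm["tp"] / predicted_positive if predicted_positive else 0.0,
        "recall": cm["tp"] / actual_positive if actual_positive else 0.0,
    }


# Made-up labels for ten customers in a test set.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

cm = confusion_matrix(actual, predicted)
scores = metrics(cm)
print(cm)
print(scores)
```

Accuracy alone can be misleading when one class is rare, which is why precision and recall are reported alongside it.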

Validation stage: if you have trained different models using different machine learning algorithms, you can pit them against each other by performing the same analysis of accuracy, precision, and recall on the validation set.

To run these accuracy tests, you need sufficient data to analyze; as a rough rule of thumb, a few hundred customers is the minimum.
