Machine learning is an application of artificial intelligence comprising a collection of models, methods, and algorithms that find patterns in large data sets and help organizations make better business decisions. Decisions made with machine learning are driven by data rather than hunches or intuition. This is the first certificate in a series of applied certificates on machine learning for artificial intelligence. Throughout the series, students gain a practical understanding of the tools and techniques used in machine learning. In this course, students are introduced to the supervised machine learning classification methods and algorithms used in artificial intelligence. Students will learn to use AWS cloud technologies, Python, and Jupyter Notebook to train and deploy supervised machine learning models built with Decision Trees, Ensembles, Nearest Neighbor, and Support Vector Machines, applying them to tasks such as identifying customer marketing profiles, detecting fraudulent transactions, and recognizing images, while gaining a better understanding of the importance of specific features within data. Students must complete all 5 modules to earn the certificate.
Module 1: Gaining Inference from Decision Trees
Students will use machine learning to build an inference model for discovering relationships in customer demographic data. This information will be used to build a marketing segmentation model for identifying potential customers earlier in the sales process.
1. Environment Setup
2. Introduction to Supervised Machine Learning and the Machine Learning Process
3. Data Preprocessing and EDA
4. Introduction to scikit-learn, Jupyter and SageMaker
5. Storytelling with Decision Trees
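As a taste of the module's material, here is a minimal scikit-learn sketch of decision-tree inference. It uses synthetic data with hypothetical demographic feature names (not the course's actual dataset) and shows how feature importances support the "storytelling" step.

```python
# Hedged sketch: decision-tree inference on synthetic "customer demographic"
# data. Feature names are illustrative assumptions, not from the course.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)
feature_names = ["age", "income", "tenure_years", "num_purchases"]  # hypothetical

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Feature importances reveal which attributes drive the tree's splits,
# i.e., which demographic signals matter for segmentation.
for name, imp in zip(feature_names, tree.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

A shallow tree (`max_depth=3`) keeps the model small enough to read and explain, which is the point of an inference (rather than purely predictive) model.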
Module 2: Building predictive classification models with decision trees
In this task, students will use decision trees and machine learning to build a predictive model that distinguishes fraudulent bank transactions from non-fraudulent ones, assisting a local financial institution with its fraud detection and security services.
1. Introduction to Classification, the Accuracy Paradox and Unbalanced data
2. Feature Selection
3. Bias / Variance Tradeoff
4. Assessing models with Precision and Confusion Matrices
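The topics above can be sketched in a few lines of scikit-learn. This example uses a synthetic, unbalanced "fraud" dataset (an assumption, not the course's data) to show why precision and a confusion matrix are evaluated alongside accuracy.

```python
# Hedged sketch: a decision tree on a synthetic, unbalanced fraud problem.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, confusion_matrix

# Roughly 95% of transactions legitimate, 5% fraudulent.
X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = DecisionTreeClassifier(max_depth=4, class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)

# The accuracy paradox: always predicting "not fraud" already scores ~95%
# here, so precision on the fraud class is a more honest measure.
acc = accuracy_score(y_te, pred)
prec = precision_score(y_te, pred, zero_division=0)
cm = confusion_matrix(y_te, pred)
print("accuracy :", acc)
print("precision:", prec)
print(cm)
```

Setting `class_weight="balanced"` is one simple way to counteract class imbalance during training; the confusion matrix then shows exactly where the remaining errors fall.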
Module 3: Classifying potential customers and prioritizing valuable customer information with ensemble learning methods
In this task, students use Random Forests and Gradient Boosted Trees to analyze various attributes of existing bank customers and determine whether predictive modeling can assist stakeholders with building a new process for identifying potential customers. In addition, predictive modeling will be used to better understand how individual customer attributes should be prioritized when marketing the new process.
1. Introduction to ensembles, boosting and bagging
2. Feature Engineering
3. Gradient Descent
4. Assessing ensemble methods
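A minimal sketch of the module's two ensemble methods, again on synthetic stand-in data (an assumption). Ranking a forest's feature importances mirrors the module's goal of prioritizing which customer attributes matter most.

```python
# Hedged sketch: bagging (Random Forest) and boosting (Gradient Boosted
# Trees) on synthetic data standing in for customer attributes.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
gb = GradientBoostingClassifier(random_state=0).fit(X, y)

# Sort feature indices by importance: the forest's view of which
# attributes should be prioritized.
ranked = sorted(enumerate(rf.feature_importances_), key=lambda t: -t[1])
gb_acc = gb.score(X, y)
print("random forest top features:", ranked[:3])
print("gradient boosting train accuracy:", gb_acc)
```

Random Forests average many trees trained on bootstrap samples (bagging), while Gradient Boosted Trees fit each new tree to the previous ensemble's residual errors via gradient descent, which connects to the module's Gradient Descent topic.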
Module 4: Building an image classification system for handwriting recognition using Lazy Learning
In this task, students are asked by a local manufacturer to assess the viability of building an image classification system that can identify handwriting from images of written digits. Students will use K-Nearest Neighbor to find relevant patterns in data derived from actual handwriting images and their labeled outcomes.
1. Introduction to Lazy Learning
2. Majority Voting and Euclidean Distance
3. Estimating K
4. Model Deployment
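The module's approach can be sketched with scikit-learn's bundled digits dataset (a small stand-in for the handwriting images described above):

```python
# Hedged sketch: K-Nearest Neighbor on scikit-learn's built-in digit images.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # 8x8 grayscale images of handwritten digits 0-9
X_tr, X_te, y_tr, y_te = train_test_split(digits.data, digits.target,
                                          random_state=0)

# KNN is "lazy": fit() only stores the training set; the Euclidean-distance
# comparisons and majority vote happen at predict time.
knn = KNeighborsClassifier(n_neighbors=5)  # K=5 chosen for illustration
knn.fit(X_tr, y_tr)
acc = knn.score(X_te, y_te)
print("test accuracy:", acc)
```

Estimating K, the module's third topic, typically means sweeping `n_neighbors` over a range and picking the value with the best cross-validated score.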
Module 5: Building an early cancer detection system using Support Vector Machines
In this task, students classify cancerous tumors as either malignant or benign using data and features derived from images of biopsied cells. If successful, such a system could detect cancers earlier and improve the chances of a timely, potentially life-saving intervention.
1. Understanding Support Vector Machines
2. Introduction to Kernelling
3. Working with AWS S3 and SageMaker
4. ROC/AUC Scores
5. Model Deployment
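A local sketch of the module's core ideas, using scikit-learn's bundled breast cancer dataset (biopsy-derived features) rather than the AWS pipeline the module itself teaches:

```python
# Hedged sketch: an RBF-kernel Support Vector Machine on biopsy-derived
# features, scored with ROC/AUC.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

data = load_breast_cancer()  # malignant vs. benign tumor features
X_tr, X_te, y_tr, y_te = train_test_split(data.data, data.target,
                                          stratify=data.target, random_state=0)

# Scaling matters: SVMs are distance-based, so unscaled features with large
# ranges would dominate the kernel. The RBF kernel handles non-linear
# boundaries ("kernelling" from the module's second topic).
svm = make_pipeline(StandardScaler(),
                    SVC(kernel="rbf", probability=True, random_state=0))
svm.fit(X_tr, y_tr)
scores = svm.predict_proba(X_te)[:, 1]
auc = roc_auc_score(y_te, scores)
print("ROC AUC:", auc)
```

ROC/AUC summarizes the trade-off between true-positive and false-positive rates across all thresholds, which is especially relevant when the cost of missing a malignant tumor far exceeds the cost of a false alarm.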
Learning Outcomes
1. Using machine learning tools to investigate patterns in complex data sets
2. Preprocessing data for machine learning tasks
3. Using on-prem and cloud methods for supervised learning
4. Using decision tree classifiers to investigate classification problems
5. Using Ensemble Learning to build predictive models using Boosting and Bagging
6. Assessing the predictive performance of models using Cost Functions and Confusion Matrices
7. Examining relationships between measured features and learner performance to help understand model behavior
8. Conducting feature selection to investigate the correlation between different features in a dataset
9. Deploying machine learning models to production
10. Presenting machine learning results to a non-technical audience