Start Here With Machine Learning
The most common question I’m asked is: “how do I get started?”
My best advice for getting started in machine learning is broken down into a 5-step process:
Step 1: Adjust Mindset. Believe you can practice and apply machine learning.
What is Holding you Back From Your Machine Learning Goals?
Why Machine Learning Does Not Have to Be So Hard
How to Think About Machine Learning
Find Your Machine Learning Tribe
Step 2: Pick a Process. Use a systematic process to work through problems.
Applied Machine Learning Process
Step 3: Pick a Tool. Select a tool for your level and map it onto your process.
Beginners: Weka Workbench.
Intermediate: Python Ecosystem.
Advanced: R Platform.
Best Programming Language for Machine Learning
Step 4: Practice on Datasets. Select datasets to work on and practice the process.
Practice Machine Learning with Small In-Memory Datasets
Tour of Real-World Machine Learning Problems
Work on Machine Learning Problems That Matter To You
Step 5: Build a Portfolio. Gather results and demonstrate your skills.
Build a Machine Learning Portfolio
Get Paid To Apply Machine Learning
Machine Learning For Money
Many of my students have used this approach to go on and do well in Kaggle competitions and get jobs as
Machine Learning Engineers and Data Scientists.
The benefits of machine learning are the predictions and the models that make predictions.
To have skill at applied machine learning means knowing how to consistently and reliably deliver high-quality
predictions on problem after problem. You need to follow a systematic process.
Below is a 5-step process that you can follow to consistently achieve above average results on predictive modeling
problems:
Probability is the mathematics of quantifying and harnessing uncertainty. It is the bedrock of many fields of
mathematics (like statistics) and is critical for applied machine learning.
Below is a 3-step process that you can use to get up to speed with probability for machine learning, fast.
You can see all of the tutorials on probability here. Below is a selection of some of the most popular tutorials.
Probability Foundations
Introduction to Joint, Marginal, and Conditional Probability
Intuition for Joint, Marginal, and Conditional Probability
Worked Examples of Different Types of Probability
Probability Distributions
A Gentle Introduction to Probability Distributions
Discrete Probability Distributions for Machine Learning
Continuous Probability Distributions for Machine Learning
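As a small illustration of the joint, marginal, and conditional probabilities named in the tutorial titles above, the sketch below computes all three from a table of counts. The weather and "played" counts, and the helper function names, are invented for illustration and are not taken from the tutorials.

```python
from fractions import Fraction

# Hypothetical counts of (weather, played) observations, invented for illustration.
counts = {
    ("sunny", "yes"): 3,
    ("sunny", "no"): 1,
    ("rainy", "yes"): 1,
    ("rainy", "no"): 3,
}
total = sum(counts.values())


def joint(weather, played):
    """P(weather, played): share of all observations with both values."""
    return Fraction(counts[(weather, played)], total)


def marginal_weather(weather):
    """P(weather): sum the joint probability over every value of played."""
    return sum(joint(w, p) for (w, p) in counts if w == weather)


def conditional_played(played, weather):
    """P(played | weather) = P(weather, played) / P(weather)."""
    return joint(weather, played) / marginal_weather(weather)


print(joint("sunny", "yes"))               # 3/8
print(marginal_weather("sunny"))           # 1/2
print(conditional_played("yes", "sunny"))  # 3/4
```

Exact fractions are used here only so the three quantities are easy to check by hand.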
Statistical methods are an important foundation area of mathematics required for achieving a deeper understanding of
the behavior of machine learning algorithms.
Below is a 3-step process that you can use to get up to speed with statistical methods for machine learning, fast.
You can see all of the statistical methods posts here. Below is a selection of some of the most popular tutorials.
Summary Statistics
Introduction to the 5 Number Summary
Introduction to Data Visualization
Correlation to Understand the Relationship Between Variables
Introduction to Calculating Normal Summary Statistics
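To make the idea of a five-number summary concrete, here is a minimal sketch using only Python's standard library. The sample data and the function name are invented for illustration, and the "inclusive" quartile method is an assumption; other tools may use a different quartile convention and give slightly different values.

```python
import statistics

# Hypothetical sample, invented for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


print(five_number_summary(data))
```

The five values give a quick picture of the spread of a variable without assuming that it follows any particular distribution.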
Linear algebra is an important foundation area of mathematics required for achieving a deeper understanding of
machine learning algorithms.
Below is a 3-step process that you can use to get up to speed with linear algebra for machine learning, fast.
You can see all linear algebra posts here. Below is a selection of some of the most popular tutorials.
Optimization is the core of all machine learning algorithms. When we train a machine learning model, it is doing
optimization with the given dataset.
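One way to picture training as optimization is the sketch below, which fits a one-parameter model y = w * x by repeatedly reducing its mean squared error on a tiny dataset. The dataset, the learning rate, the step count, and all function names are assumptions chosen for illustration; real libraries use more sophisticated optimizers.

```python
# Hypothetical dataset generated from y = 2x, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def mse(w):
    """Mean squared error of the model y = w * x on the dataset."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train(w=0.0, learning_rate=0.01, steps=500):
    """Repeatedly step w against the gradient to reduce the error."""
    for _ in range(steps):
        w -= learning_rate * mse_gradient(w)
    return w


w = train()
print(round(w, 3), round(mse(w), 6))
```

The fitted weight ends up very close to 2, the value used to generate the data, because that is where the error is smallest.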
You can get familiar with optimization for machine learning in 3 steps, fast.
You can see all optimization posts here. Below is a selection of some of the most popular tutorials.
Calculus is the hidden driver for the success of many machine learning algorithms. When we talk about the
gradient descent optimization part of a machine learning algorithm, the gradient is found using calculus.
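As a hedged illustration of how calculus supplies the gradient, the sketch below compares a derivative worked out by hand with a numerical estimate, then takes a single gradient descent step. The function f, the step size, and the helper names are assumptions chosen for illustration.

```python
def f(x):
    """Example function to minimize: f(x) = (x - 3)^2."""
    return (x - 3) ** 2


def f_prime(x):
    """Derivative from calculus (chain rule): f'(x) = 2 * (x - 3)."""
    return 2 * (x - 3)


def numerical_derivative(func, x, h=1e-5):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


x = 5.0
print(f_prime(x))                  # 4.0
print(numerical_derivative(f, x))  # very close to 4.0

# One gradient descent step: move against the gradient to reduce f.
step_size = 0.1
x_new = x - step_size * f_prime(x)
print(f(x), f(x_new))
```

The analytic derivative and the numerical estimate agree closely, and the step against the gradient lowers the value of f.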
You can get familiar with calculus for machine learning in 3 steps.
You can see all calculus posts here. Below is a selection of some of the most popular tutorials.
Python is the lingua franca of machine learning projects. Not only are many machine learning libraries written in
Python, but the language also helps us finish machine learning projects quickly and neatly. Having good Python
programming skills lets you get more done in less time!
You can get familiar with Python for machine learning in 3 steps.
You can see all Python posts here. But don’t miss Python for Machine Learning (my book). Below is a selection of
some of the most popular tutorials.
Basic Language
Some Language Features in Python
More Special Features in Python
Python Classes and Their Use in Keras
Language Techniques
Command Line Arguments for Your Python Script
A Gentle Introduction to Decorators in Python
Techniques to Write Better Python Code
Troubleshooting
Python Debugging Tools
Profiling Python Code
Static Analyzers in Python
Libraries
Multiprocessing in Python
A Guide to Obtaining Time Series Datasets in Python
Web Frameworks for Your Python Projects
You need to know what algorithms are available for a given problem, how they work, and how to get the most out of
them.
You can see all machine learning algorithm posts here. Below is a selection of some of the most popular tutorials.
Weka is a platform that you can use to get started in applied machine learning.
It has a graphical user interface, meaning that no programming is required, and it offers a suite of
state-of-the-art algorithms.
You can see all Weka machine learning posts here. Below is a selection of some of the most popular tutorials.
You can use the same tools, like pandas and scikit-learn, in both the development and the operational deployment
of your model.
Below are the steps that you can use to get started with Python machine learning:
You can see all Python machine learning posts here. Below is a selection of some of the most popular tutorials.
R is a platform for statistical computing and is the most popular platform among professional data scientists.
It’s popular because of the large number of techniques available, and because of excellent interfaces to these
methods such as the powerful caret package.
You can see all R machine learning posts here. Below is a selection of some of the most popular tutorials.
You can learn a lot about machine learning algorithms by coding them from scratch.
Learning via coding is the preferred learning style for many developers and engineers.
Here’s how to get started with machine learning by coding everything from scratch.
You can see all of the Code Algorithms from Scratch posts here. Below is a selection of some of the most popular
tutorials.
Many datasets contain a time component, but the topic of time series is rarely covered in much depth from a
machine learning perspective.
You can see all Time Series Forecasting posts here. Below is a selection of some of the most popular tutorials.
The performance of your predictive model is only as good as the data that you use to train it.
As such, data preparation may be the most important part of your applied machine learning project.
Here’s how to get started with Data Preparation for machine learning:
You can see all Data Preparation tutorials here. Below is a selection of some of the most popular tutorials.
XGBoost is popular because it is used by some of the best data scientists in the world to win machine learning
competitions.
You can see all XGBoost posts here. Below is a selection of some of the most popular tutorials.
Imbalanced Classification
Imbalanced classification refers to classification tasks where there are many more examples for one class than
another class.
These types of problems often require the use of specialized performance metrics and learning algorithms as the
standard metrics and methods are unreliable or fail completely.
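The sketch below illustrates why a standard metric such as accuracy can mislead on imbalanced data. The 990-to-10 class split and the always-predict-majority "model" are assumptions invented for illustration, and the metric functions are simple hand-written versions rather than a library API.

```python
# Hypothetical labels: 990 negative (0) and 10 positive (1) examples.
y_true = [0] * 990 + [1] * 10

# A naive model that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def recall(actual, predicted, positive=1):
    """Fraction of true positive examples that the model found."""
    true_pos = sum(a == positive and p == positive
                   for a, p in zip(actual, predicted))
    actual_pos = sum(a == positive for a in actual)
    return true_pos / actual_pos


print(accuracy(y_true, y_pred))  # 0.99
print(recall(y_true, y_pred))    # 0.0
```

The model looks excellent by accuracy yet never finds a single minority-class example, which is why metrics that focus on the minority class are often preferred for these problems.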
You can see all Imbalanced Classification posts here. Below is a selection of some of the most popular tutorials.
State-of-the-art results are coming from the field of deep learning and it is a sub-field of machine learning that
cannot be ignored.
You can see all deep learning posts here. Below is a selection of some of the most popular tutorials.
You can see all PyTorch deep learning posts here. Below is a selection of some of the most popular tutorials.
OpenCV is the most popular library for image processing but its machine learning module is less well-known.
If you are already using OpenCV, adding machine learning to your project should come at no additional cost. You can
apply the experience you gained with scikit-learn or Keras to take your image processing project to the next
level.
Below are the steps that you can use to get started with machine learning in OpenCV:
You can see all OpenCV machine learning posts here. Below is a selection of some of the most popular tutorials.
Although it is easy to define and fit a deep learning neural network model, it can be challenging to get good
performance on a specific predictive modeling problem.
There are standard techniques that you can use to improve the learning, reduce overfitting, and make better
predictions with your deep learning model.
Here’s how to get started with getting better deep learning performance:
You can see all better deep learning posts here. Below is a selection of some of the most popular tutorials.
Ensemble Learning
Predictive performance is the most important concern on many classification and regression problems. Ensemble
learning algorithms combine the predictions from multiple models and are designed to perform better than any
contributing ensemble member.
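As a minimal sketch of combining predictions, the example below uses majority voting, which is only one of several ways to build an ensemble. The labels and the three sets of model predictions are invented for illustration, and they are deliberately arranged so that the models make their mistakes on different examples; real ensembles are not guaranteed to improve on their members.

```python
from collections import Counter

# Hypothetical true labels and predictions from three models.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]
model_b = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
model_c = [0, 1, 1, 1, 0, 1, 0, 0, 1, 0]


def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def majority_vote(*predictions):
    """For each example, return the label predicted by most models."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


ensemble = majority_vote(model_a, model_b, model_c)

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c)]:
    print(name, accuracy(y_true, preds))
print("ensemble", accuracy(y_true, ensemble))
```

Each model is wrong on two examples, but never the same two, so the vote corrects every individual error in this contrived case.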
Here’s how to get started with getting better ensemble learning performance:
You can see all ensemble learning posts here. Below is a selection of some of the most popular tutorials.
Long Short-Term Memory (LSTM) Recurrent Neural Networks are designed for sequence prediction problems and
are a state-of-the-art deep learning technique for challenging prediction problems.
You can see all LSTM posts here. Below is a selection of some of the most popular tutorials using LSTMs in Python
with the Keras deep learning library.
Working with text data is hard because of the messy nature of natural language.
Text is not “solved”, but to get state-of-the-art results on challenging NLP problems, you need to adopt deep learning
methods.
Here’s how to get started with deep learning for natural language processing:
You can see all deep learning for NLP posts here. Below is a selection of some of the most popular tutorials.
Working with image data is hard because of the gulf between raw pixels and the meaning in the images.
Computer vision is not solved, but to get state-of-the-art results on challenging computer vision tasks like object
detection and face recognition, you need deep learning methods.
Here’s how to get started with deep learning for computer vision:
Step 1: Discover what deep learning for Computer Vision is all about.
What is Computer Vision?
What is the Promise of Deep Learning for Computer Vision?
Step 2: Discover standard tasks and datasets for Computer Vision.
9 Applications of Deep Learning for Computer Vision
How to Load and Visualize Standard Computer Vision Datasets With Keras
How to Develop and Demonstrate Competence With Deep Learning for Computer Vision
Step 3: Discover how to work through problems and deliver results.
How to Get Started With Deep Learning for Computer Vision (7-Day Mini-Course)
Deep Learning for Computer Vision (my book)
You can see all deep learning for Computer Vision posts here. Below is a selection of some of the most popular
tutorials.
Deep learning neural networks are able to automatically learn arbitrarily complex mappings from inputs to outputs
and support multiple inputs and outputs.
Methods such as MLPs, CNNs, and LSTMs offer a lot of promise for time series forecasting.
Here’s how to get started with deep learning for time series forecasting:
Step 1: Discover the promise (and limitations) of deep learning for time series.
The Promise of Recurrent Neural Networks for Time Series Forecasting
On the Suitability of Long Short-Term Memory Networks for Time Series Forecasting
Results From Comparing Classical and Machine Learning Methods for Time Series Forecasting
Step 2: Discover how to develop robust baseline and defensible forecasting models.
Taxonomy of Time Series Forecasting Problems
How to Develop a Skillful Machine Learning Time Series Forecasting Model
Step 3: Discover how to build deep learning models for time series forecasting.
How to Get Started with Deep Learning for Time Series Forecasting (7-Day Mini-Course)
Deep Learning for Time Series Forecasting (my book)
You can see all deep learning for time series forecasting posts here. Below is a selection of some of the most
popular tutorials.
Generative Adversarial Networks, or GANs for short, are an approach to generative modeling using deep learning
methods, such as convolutional neural networks.
GANs are an exciting and rapidly changing field, delivering on the promise of generative models in their ability to
generate realistic examples across a range of problem domains, most notably in image-to-image translation tasks.
Here’s how to get started with deep learning for Generative Adversarial Networks:
You can see all Generative Adversarial Network tutorials listed here. Below is a selection of some of the most
popular tutorials.
Attention mechanisms were invented to mitigate the problem that recurrent neural networks fail to work well with
long input sequences. It was later found that the attention mechanism itself can serve as a building block of
neural networks, which led to the transformer architecture.
Attention mechanisms and transformer models have been shown to deliver amazing results, especially in natural
language processing. Transformer models have been used in one way or another to make computers understand human
language and perform tasks such as translating or summarizing a paragraph with human-like quality.
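To give a rough sense of what an attention mechanism computes, the sketch below implements one common formulation, scaled dot-product attention, for a single query using only the standard library. The choice of this formulation, the toy key and value vectors, and the function names are assumptions for illustration; the tutorials may present attention differently.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Weighted average of the value vectors, one output per value dimension.
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output


# Hypothetical toy inputs: three positions, two dimensions each.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
print(weights)
print(output)
```

Positions whose keys are more similar to the query receive larger weights, so the output leans toward their values regardless of how far apart the positions are in the sequence.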
You can see all Attention and Transformer tutorials listed here. Below is a selection of some of the most popular
tutorials.
If you still have questions and need help, you have some options:
Ebooks: I sell a catalog of Ebooks that show you how to get results with machine learning, fast.
Machine Learning Mastery EBook Catalog
Blog: I write a lot about applied machine learning on the blog, try the search feature.
Machine Learning Mastery Blog
Frequently Asked Questions: The most common questions I get and their answers.
Machine Learning Mastery FAQ
Contact: You can contact me with your question, but one question at a time please.
Machine Learning Mastery Contact