DhananjayKumar

Principal consultant and full stack developer with 15+ years of experience in developing, deploying and maintaining enterprise applications.

ML with Python: Part-1

Published Sep 21, 2019

Now that we are comfortable with Python, we are ready to get started with Machine Learning (ML) projects. But where do we go next? Can we dive directly into coding ML projects? Follow along to find the answer.

You might struggle if you don't understand the basic concepts. So, I strongly suggest spending some time on a few common questions -

What is Machine Learning?
What problems can we solve with ML?
What steps are involved in a Machine Learning project?
Where does ML fit compared to AI and Deep Learning?

You can find numerous good posts and videos that answer these questions. After spending a few hours digging into them, you will understand the terms used in ML, and the sample apps will start making sense. You will understand terms like Choosing a Model, Training a Model, Loss Function, Making Predictions, etc. And then you will start wondering what the complete picture looks like.

Here, I am summarizing everything I have found so far: the processes involved in ML projects, the types of ML algorithms, some popular algorithms of each type, and the difference between ML and Deep Learning.

In a nutshell, AI, ML and Deep Learning can be described as below -

Important points worth mentioning here!

Don't get hung up on learning the details of every algorithm, such as how it works under the hood. A basic idea is sufficient. For example, in the simple linear regression model y=mx+b, training finds the best-fit values of m and b so that we can predict y for a given x. Instead, you should know which algorithm/model is likely to best fit the problem you are solving.
The “learning” part of machine learning means that ML algorithms attempt to optimize along a certain dimension; i.e. they usually try to minimize error or maximize the likelihood of their predictions being true. This has three names: an error function, a loss function, or an objective function… When someone says they are working with a machine-learning algorithm, you can get to the gist of its value by asking: What’s the objective function?
Deep learning is a subset of machine learning. Usually, when people use the term deep learning, they are referring to deep artificial neural networks. Deep is a technical term: it refers to the number of layers in a neural network. Multiple hidden layers allow deep neural networks to learn features of the data in a so-called feature hierarchy. Being computationally intensive is one of the hallmarks of deep learning, and it is one reason why GPUs are in demand for training deep-learning models. It is also well suited to handling large amounts of unstructured data such as blobs of pixels or text.
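To make the idea of an objective function concrete, here is a minimal sketch of one common choice, the mean squared error, for the line y=mx+b mentioned above. The function name `mse_loss` and the toy data are assumptions made for illustration only.

```python
def mse_loss(m, b, xs, ys):
    """Mean squared error of the line y = m*x + b on the given points."""
    errors = [(m * x + b - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

# Hypothetical toy data that lies exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

perfect = mse_loss(2.0, 1.0, xs, ys)  # the true line, so no error
worse = mse_loss(1.0, 0.0, xs, ys)    # a poorer guess, so a larger error
print(perfect, worse)
```

Training, in this picture, is the search for the m and b that make this number as small as possible.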

Processes involved in ML

Data Gathering
Data Pre-Processing
Choose a Model
Train the Model
Evaluate the Model
Parameter Tuning
Make Predictions

Data Gathering

This step is very important because the quality and quantity of data that you gather will directly determine how good your predictive model can be.

Data Pre-Processing

In this step we clean the dataset by removing duplicates, correcting errors, dealing with missing values, normalizing, converting data types, etc. We also randomize the data, which erases the effects of the particular order in which we collected and/or otherwise prepared it. We visualize the data to help detect relevant relationships between variables or class imbalances (bias alert!), or perform other exploratory analysis. Finally, we split the data into training and test sets.
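The cleaning steps above can be sketched with the standard library alone. This is a rough illustration, assuming each row is a hypothetical `(feature, label)` pair with `None` marking a missing feature; real projects usually reach for a library such as pandas instead.

```python
import random

def preprocess(rows, test_fraction=0.2, seed=0):
    """Deduplicate, fill missing values, normalize, shuffle, and split."""
    # 1. Remove exact duplicates while keeping the original order.
    unique = list(dict.fromkeys(rows))

    # 2. Replace missing feature values (None) with the mean of the known ones.
    known = [x for x, _ in unique if x is not None]
    mean = sum(known) / len(known)
    filled = [(mean if x is None else x, y) for x, y in unique]

    # 3. Min-max normalize the feature into [0, 1]
    #    (assumes the feature is not constant, so hi != lo).
    lo = min(x for x, _ in filled)
    hi = max(x for x, _ in filled)
    scaled = [((x - lo) / (hi - lo), y) for x, y in filled]

    # 4. Shuffle so the collection order does not influence the split.
    rng = random.Random(seed)
    rng.shuffle(scaled)

    # 5. Hold out the last part of the shuffled data as the test set.
    n_test = max(1, int(len(scaled) * test_fraction))
    return scaled[:-n_test], scaled[-n_test:]

# Hypothetical raw data with one duplicate row and one missing value.
rows = [(10.0, 0), (20.0, 1), (20.0, 1), (None, 0), (40.0, 1), (30.0, 0)]
train, test = preprocess(rows)
print(len(train), len(test))
```

For simplicity the sketch scales before splitting; in practice the scaling statistics are often computed on the training split only, so that nothing about the test set leaks into training.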

Choose a Model

The next step in our workflow is choosing a model. There are many models that researchers and data scientists have created over the years. Some are very well suited for image data, others for sequences (like text, or music), some for numerical data, others for text-based data. You should therefore have a working knowledge of the available algorithms so that you can choose the right one for your task.

Train the Model

The goal of training is to answer a question or make a prediction correctly as often as possible. In the linear regression example, the algorithm needs to learn values for m (or W) and b (x is the input, y is the output). Each iteration of this process is a training step.
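As a rough illustration of training steps, the sketch below learns m and b with plain gradient descent on the mean squared error. Gradient descent, the learning rate, the step count and the toy data are all assumptions chosen for the example; the post does not prescribe a particular training method.

```python
def train_linear(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = m*x + b by gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):  # each pass through this loop is one training step
        # Gradients of the mean squared error with respect to m and b.
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        # Move both parameters a small step against their gradients.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, b = train_linear(xs, ys)
print(round(m, 3), round(b, 3))
```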

Evaluate the Model

Once training is complete, it’s time to see if the model is any good, using evaluation. This is where the dataset that we set aside earlier comes into play. Evaluation allows us to test our model against data that has never been used for training. A good rule of thumb I use for a training-evaluation split is somewhere on the order of 80/20 or 70/30.
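Here is a small sketch of an 80/20 split followed by evaluation on the held-out part. The synthetic data, the closed-form least-squares fit in `fit_line`, and the choice of mean squared error as the metric are assumptions made to keep the example self-contained.

```python
import random

def fit_line(points):
    """Closed-form least-squares fit of y = m*x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    m = cov / var
    return m, mean_y - m * mean_x

def mse(model, points):
    """Mean squared error of the fitted line on the given points."""
    m, b = model
    return sum((m * x + b - y) ** 2 for x, y in points) / len(points)

# Hypothetical noisy data around y = 3x + 2.
rng = random.Random(42)
data = [(x, 3.0 * x + 2.0 + rng.uniform(-1.0, 1.0)) for x in range(50)]
rng.shuffle(data)

# 80/20 split: the model never sees the test part during fitting.
split = int(len(data) * 0.8)
train, test = data[:split], data[split:]

model = fit_line(train)
print("train MSE:", mse(model, train))
print("test MSE:", mse(model, test))
```

A test error far above the training error would suggest the model has memorized the training data rather than learned the underlying pattern.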

Parameter Tuning

Once you’ve done evaluation, you may want to see if you can further improve your training in any way. We can do this by tuning our parameters. There were a few parameters we implicitly assumed when we did our training, and now is a good time to go back, test those assumptions and try other values. These parameters are typically referred to as “hyperparameters”. There are many considerations at this phase of training, and it’s important that you define what makes a model “good enough”, otherwise you might find yourself tweaking parameters for a very long time. The adjustment, or tuning, of these hyperparameters remains a bit of an art, and is more of an experimental process that heavily depends on the specifics of your dataset, model, and training process.
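One simple way to tune a hyperparameter is to try a handful of candidate values and keep the one that scores best on held-out data. The sketch below does this for the learning rate of a small gradient-descent trainer; the candidate values, the step budget and the toy data are assumptions for illustration.

```python
def train_linear(xs, ys, learning_rate, steps=200):
    """Fit y = m*x + b by gradient descent with a fixed step budget."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

def mse(m, b, xs, ys):
    """Mean squared error of the line y = m*x + b."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical training and validation data from y = 2x + 1.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
val_x, val_y = [4.0, 5.0], [9.0, 11.0]

# Try each candidate learning rate and record its validation error.
results = {}
for lr in [0.001, 0.01, 0.1]:
    m, b = train_linear(train_x, train_y, lr)
    results[lr] = mse(m, b, val_x, val_y)

best_lr = min(results, key=results.get)
print(results)
print("best learning rate:", best_lr)
```

With the same step budget, the smaller learning rates simply have not converged yet, which is why the largest stable value wins here.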

Make Predictions

We can finally use our model to predict.

Types of ML Algorithms

Supervised Learning

Supervised learning is when the model is trained on a labelled dataset. A labelled dataset is one which has both input and output parameters. Supervised machine learning includes two major processes: classification and regression.

Classification is the process where incoming data is labeled based on past data samples: the algorithm is trained on labelled examples to recognize certain types of objects and categorize them accordingly. Examples -

Fraud Detection
Email Spam Detection
Image Classification

Regression is the process of identifying patterns and calculating predictions of continuous outcomes. The system has to understand the numbers, their values, and their grouping. Examples -

Weather Forecasting
Risk Assessment
Score Prediction

Popular Supervised Learning Algorithms:

Linear Regression
Logistic Regression
Decision Tree
Random Forest
KNN (K Nearest Neighbor)
Support Vector Machine
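To give a feel for supervised classification, here is a minimal K Nearest Neighbor sketch in plain Python. The two-feature points and the "ham"/"spam" labels are made-up illustration data, and `knn_predict` is a hypothetical helper, not a library function.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labelled dataset: (features, label) pairs.
train = [
    ((1.0, 1.0), "ham"), ((1.5, 2.0), "ham"), ((2.0, 1.5), "ham"),
    ((6.0, 6.5), "spam"), ((7.0, 6.0), "spam"), ((6.5, 7.0), "spam"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.8, 6.2)))
```

Note that KNN has no separate training phase: the labelled data itself is the model, and the work happens at prediction time.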

Unsupervised Learning

In the case of unsupervised machine learning algorithms, the desired results are unknown and yet to be defined. Unsupervised learning problems are further grouped into clustering and association problems.

Clustering is an important concept when it comes to unsupervised learning. It mainly deals with finding a structure or pattern in a collection of uncategorized data. Clustering algorithms will process your data and find natural clusters (groups) if they exist in the data. You can also modify how many clusters your algorithms should identify, which allows you to adjust the granularity of these groups. Examples -

Medical Research
City Planning
Targeted Marketing

Association rules allow you to establish associations amongst data objects inside large databases. This unsupervised technique is about discovering interesting relationships between variables in large databases. For example, people who buy a new home are most likely to buy new furniture. Examples -

Market Basket Analysis
Text Mining
Face Recognition
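Association rules are usually judged by two numbers: support (how often the items appear together) and confidence (how often the rule holds when its left-hand side is present). The sketch below computes both for a made-up set of shopping baskets; the helper names and the data are assumptions for illustration.

```python
def count(transactions, items):
    """Number of transactions that contain every item in items."""
    items = set(items)
    return sum(1 for t in transactions if items <= t)

def support(transactions, items):
    """Fraction of transactions that contain all the given items."""
    return count(transactions, items) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)

# Hypothetical market baskets.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
    {"bread", "butter", "jam"},
]

print(support(transactions, {"bread", "butter"}))       # 3 of the 5 baskets
print(confidence(transactions, {"bread"}, {"butter"}))  # 3 of the 4 bread baskets
```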

Popular Unsupervised Learning Algorithms:

K Means Clustering
Hierarchical Clustering
Apriori Algorithm
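As an illustration of clustering, here is a bare-bones K Means sketch. It assumes the starting centroids are passed in explicitly (here, two hand-picked points) so the result is repeatable; real implementations typically choose them randomly or with a smarter seeding scheme.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled 2-D data with two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
```

The number of starting centroids is the number of clusters the algorithm will look for, which is the knob for adjusting granularity mentioned earlier.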

Reinforcement Learning

Reinforcement learning is about taking suitable actions to maximize reward in a particular situation. It is employed by various software and machines to find the best possible behavior or path to take in a specific situation. Reinforcement learning differs from supervised learning: in supervised learning the training data comes with the answer key, so the model is trained on the correct answers, whereas in reinforcement learning there is no answer key and the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its experience. Examples -

Gaming
Robot Navigation
Stock Trading
Assembly Line Processes

Popular Reinforcement Learning Algorithms:

Q-learning
SARSA
DQN
DDPG
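To show what learning from experience looks like, here is a small tabular Q-learning sketch. The environment is a made-up corridor of five cells where the agent starts on the left and earns a reward of 1 for reaching the rightmost cell; the environment and all parameter values are assumptions for illustration.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a 1-D corridor; reward 1 for reaching the last cell."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 moves left, action 1 moves right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy choice: explore sometimes (and on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The corridor has walls, so the agent cannot leave it.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: nudge the estimate toward
            # reward + discounted best value of the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
policy = ["left" if left > right else "right" for left, right in q[:-1]]
print(policy)
```

Nobody tells the agent that moving right is correct; it discovers that from the rewards it collects.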

Conclusion

In Part-1 of the "ML with Python" series we summarized ML concepts, its types and popular algorithms. We will cover popular ML algorithms with examples and implementations using Python in subsequent posts.

