GLOBAL RESEARCH SYNDICATE
No Result
View All Result
  • Login
  • Latest News
  • Consumer Research
  • Survey Research
  • Marketing Research
  • Industry Research
  • Data Collection
  • More
    • Data Analysis
    • Market Insights
  • Latest News
  • Consumer Research
  • Survey Research
  • Marketing Research
  • Industry Research
  • Data Collection
  • More
    • Data Analysis
    • Market Insights
No Result
View All Result
globalresearchsyndicate
No Result
View All Result
Home Data Analysis

A Complete Tutorial On Implementing Lasso Regression In Python With MachineHack Data Science Hackathon

globalresearchsyndicate by globalresearchsyndicate
November 29, 2019
in Data Analysis
0
A Complete Tutorial On Implementing Lasso Regression In Python With MachineHack Data Science Hackathon
0
SHARES
6
VIEWS
Share on FacebookShare on Twitter


When we talk about Machine Learning or Data Science or any process that involves predictive analysis using data — regression, overfitting and regularization are terms that are often used. Understanding regularization and the methods to regularize can have a big impact on a Predictive Model in producing reliable and low variance predictions.



In this article, we will learn to implement one of the key regularization techniques in Machine Learning using scikit learn and python.

What is Regularization?

Overfitting is one of the most annoying things about a Machine Learning model. After all those time-consuming processes that took to gather the data, clean and preprocess it, the model is still incapable to give out an optimised result.  There can be lots of noises in data which may be the variance in the target variable for the same and exact predictors or irrelevant features or it can be corrupted data points. The ML model is unable to identify the noises and hence uses them as well to train the model. This can have a negative impact on the predictions of the model. This is called overfitting.


W3Schools


In simple words, overfitting is the result of an ML model trying to fit everything that it gets from the data including noises.

Why regularization?

Regularization is intended to tackle the problem of overfitting. Overfitting becomes a clear menace when there is a large dataset with thousands of features and records. Ridge regression and Lasso regression are two popular techniques that make use of regularization for predicting.

Both the techniques work by penalising the magnitude of coefficients of features along with minimizing the error between predictions and actual values or records. The key difference however, between Ridge and Lasso regression is that Lasso Regression has the ability to nullify the impact of an irrelevant feature in the data, meaning that it can reduce the coefficient of a feature to zero thus completely eliminating it and hence is better at reducing the variance when the data consists of many insignificant features. Ridge regression, however, can not reduce the coefficients to absolute zero. Ridge regression performs better when the data consists of features which are sure to be more relevant and useful.

Lasso Regression

Lasso stands for Least Absolute Shrinkage and Selection Operator. Let us have a look at what Lasso regression means mathematically:

Residual Sum of Squares + λ * (Sum of the absolute value of the magnitude of coefficients)

Where,

  • λ denotes the amount of shrinkage
  • λ = 0 implies all features are considered and it is equivalent to the linear regression where only the residual sum of squares are considered to build a predictive model
  • λ = ∞ implies no feature is considered i.e, as λ closes to infinity it eliminates more and more features
  • The bias increases with increase in λ
  • variance increases with decrease in λ

Implementing Lasso Regression In Python

For this example code, we will consider a dataset from Machinehack’s Predicting Restaurant Food Cost Hackathon.

Consider going through the following article to help you with Data Cleaning and Preprocessing:

A Complete Guide to Cracking The Predicting Restaurant Food Cost Hackathon By MachineHack

After completing all the steps till Feature Scaling(Excluding) we can proceed to building a Lasso regression. We are avoiding feature scaling as the lasso regressor comes with a parameter that allows us to normalise the data while fitting it to the model.

Lets Code!

import numpy as np

Creating a New Train and Validation Datasets

from sklearn.model_selection import train_test_split
data_train, data_val = train_test_split(new_data_train, test_size = 0.2, random_state = 2)

Classifying Predictors and Target

#Classifying Independent and Dependent Features
#_______________________________________________
#Dependent Variable
Y_train = data_train.iloc[:, -1].values
#Independent Variables
X_train = data_train.iloc[:,0 : -1].values
#Independent Variables for Test Set
X_test = data_val.iloc[:,0 : -1].values

Evaluating The Model With RMLSE

def score(y_pred, y_true):
error = np.square(np.log10(y_pred +1) - np.log10(y_true +1)).mean() ** 0.5
score = 1 - error
return score

See Also


actual_cost = list(data_val['COST'])
actual_cost = np.asarray(actual_cost)

Building the Lasso Regressor

######################################################################
#Lasso Regression
############################################################################
from sklearn.linear_model import Lasso

#Initializing the Lasso Regressor with Normalization Factor as True
lasso_reg = Lasso(normalize=True)

#Fitting the Training data to the Lasso regressor
lasso_reg.fit(X_train,Y_train)

#Predicting for X_test
y_pred_lass =lasso_reg.predict(X_test)

#Printing the Score with RMLSE
print("nnLasso SCORE : ", score(y_pred_lass, actual_cost))

Output:

0.7335508027883148

The Lasso Regression attained an accuracy of 73% with the given Dataset

Also, check out the following resources to help you more with this problem:

  1. Guide To Implement StackingCVRegressor In Python With MachineHack’s Predicting Restaurant Food Cost Hackathon
  2. Model Selection With K-fold Cross Validation — A Walkthrough with MachineHack’s Food Cost Prediction Hackathon
  3. Flight Ticket Price Prediction Hackathon: Use These Resources To Crack Our MachineHack Data Science Challenge
  4. Hands-on Tutorial On Data Pre-processing In Python
  5. Data Preprocessing With R: Hands-On Tutorial
  6. Getting started with Linear regression Models in R
  7. How To Create Your first Artificial Neural Network In Python
  8. Getting started with Non Linear regression Models in R
  9. Beginners Guide To Creating Artificial Neural Networks In R

Enjoyed this story? Join our Telegram group. And be part of an engaging community.


FEATURED VIDEO

Provide your comments below

comments

Amal Nair

Amal Nair

A Computer Science Engineer who is passionate about AI and all related technologies. He is someone who loves to stay updated with the Tech-revolutions that AI brings in.
Contact: [email protected]

Related Posts

How Machine Learning has impacted Consumer Behaviour and Analysis
Consumer Research

How Machine Learning has impacted Consumer Behaviour and Analysis

January 4, 2024
Market Research The Ultimate Weapon for Business Success
Consumer Research

Market Research: The Ultimate Weapon for Business Success

June 22, 2023
Unveiling the Hidden Power of Market Research A Game Changer
Consumer Research

Unveiling the Hidden Power of Market Research: A Game Changer

June 2, 2023
7 Secrets of Market Research Gurus That Will Blow Your Mind
Consumer Research

7 Secrets of Market Research Gurus That Will Blow Your Mind

May 8, 2023
The Shocking Truth About Market Research Revealed!
Consumer Research

The Shocking Truth About Market Research: Revealed!

April 25, 2023
market research, primary research, secondary research, market research trends, market research news,
Consumer Research

Quantitative vs. Qualitative Research. How to choose the Right Research Method for Your Business Needs

March 14, 2023
Next Post
Global Data Sockets Market Research Report 2019 – Doug Mockett, KOMTECH Kommunikationstechnik, Simon, Z.S.E. Ospel, MELJAC

Global Data Sockets Market Research Report 2019 – Doug Mockett, KOMTECH Kommunikationstechnik, Simon, Z.S.E. Ospel, MELJAC

Categories

  • Consumer Research
  • Data Analysis
  • Data Collection
  • Industry Research
  • Latest News
  • Market Insights
  • Marketing Research
  • Survey Research
  • Uncategorized

Recent Posts

  • Ipsos Revolutionizes the Global Market Research Landscape
  • How Machine Learning has impacted Consumer Behaviour and Analysis
  • Market Research: The Ultimate Weapon for Business Success
  • Privacy Policy
  • Terms of Use
  • Antispam
  • DMCA

Copyright © 2024 Globalresearchsyndicate.com

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Cookie settingsACCEPT
Privacy & Cookies Policy

Privacy Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may have an effect on your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Non-necessary
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.
SAVE & ACCEPT
No Result
View All Result
  • Latest News
  • Consumer Research
  • Survey Research
  • Marketing Research
  • Industry Research
  • Data Collection
  • More
    • Data Analysis
    • Market Insights

Copyright © 2024 Globalresearchsyndicate.com