
Hands-on Linear Regression Using Sklearn

globalresearchsyndicate by globalresearchsyndicate
November 6, 2020
in Data Analysis

In today’s article, we will look at how to predict the rating of cereals. The problem statement is to predict each cereal’s rating from columns that give the exact amounts of its ingredients. The link to the dataset is mentioned below.

We will prepare the data, fit a simple linear model to it, and then regularise the model to see how well it can perform.

#import necessary libraries
import pandas as pd
import numpy as np

Now you can download the dataset from here.

It is advisable to read the description of the dataset before proceeding, as it will help you understand the problem better.

Extract the data and pass the path of the CSV file to read_csv.

df=pd.read_csv(r'D:\Data Sets\cereal.csv') #reading the file (raw string so the backslashes are kept)
df.head() #for printing the first five rows of the dataset

Output

Since the rating column holds continuous values, this is a regression problem.

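#dropping the column that is redundant
data=df.drop(['name'],axis=1)
#to see if there’s any missing data
data.isnull().sum() #no missing values
#encoding the data
from sklearn.preprocessing import LabelEncoder
le=LabelEncoder()
#label encoding the first two columns (categorical)
for col in data.columns[:2]:
  data[col]=le.fit_transform(data[col])
#features are all columns except the last; the target (rating) is the last column
x=data.iloc[:,:-1].values
y=data.iloc[:,-1].values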

Output
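To make the label encoding step concrete, here is a minimal sketch on a handful of made-up category codes. The `mfr` list is a hypothetical stand-in, not data taken from the cereal file.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical category codes, used only for illustration
mfr = ["K", "G", "K", "N", "G"]

le = LabelEncoder()
encoded = le.fit_transform(mfr)

# classes_ holds the sorted unique labels; each label is replaced by its position
print(list(le.classes_))
print(encoded.tolist())
```

Keep in mind that the integer codes imply an ordering between categories that may not exist, so for linear models one-hot encoding is often worth trying as well.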

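from scipy.stats import pearsonr
corelation=[]
#Pearson correlation of every feature column with the rating
for i in range(len(data.columns)-1):
  col_x=x[:,i].astype(float) #cast in case x is an object array
  col_y=y
  corr,_=pearsonr(col_x,col_y)
  corelation.append(corr)
  print(corr)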

Next, we take the index values of the columns whose correlation with the rating is greater than 0.29 or less than -0.29.

If you don’t know what correlation is, you can read about it here.
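The selection step described above could be sketched as follows. This is only an assumption about how the `index` list is built, demonstrated on synthetic data rather than the cereal file; the names `features`, `target` and `threshold` are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 200
target = rng.normal(size=n)

# Three synthetic feature columns with known relationships to the target
features = np.column_stack([
    target + 0.1 * rng.normal(size=n),   # strongly positively correlated
    rng.normal(size=n),                  # unrelated noise
    -target + 0.1 * rng.normal(size=n),  # strongly negatively correlated
])

threshold = 0.29
correlations = []
for i in range(features.shape[1]):
    r, _ = pearsonr(features[:, i], target)
    correlations.append(r)

# Keep the positions whose correlation is above 0.29 or below -0.29
index = [i for i, r in enumerate(correlations) if abs(r) > threshold]
print(index)
```

Using `abs(r) > threshold` covers both the positive and the negative side of the condition in one comparison.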

drop_col=[]
#collecting the names of the columns whose position is in index,
#the list of positions that satisfy the condition above
for i in index:
  #print(data.columns[i])
  drop_col.append(data.columns[i])

Now the independent variables.

x=data.iloc[:,:-1].values
#Splitting the dataset
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2)

Here the test size is 0.2 and the train size is 0.8.

from sklearn.linear_model import LinearRegression
regressor=LinearRegression()
regressor.fit(x_train,y_train)
regressor.score(x_test,y_test) #R² on the test set, no regularization

Output

0.9943613024056396

This score is suspiciously high for such a small dataset and the model may be overfitted, so we will regularize it and compare the results.
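One common way to check for overfitting is to compare the score on the training set with the score on the test set: a large gap between the two is the usual warning sign. The sketch below illustrates this on synthetic data from `make_regression`, which is an assumed stand-in for the cereal features; the fixed `random_state` values are only there to make the run repeatable.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic regression data, used only for illustration
X, y = make_regression(n_samples=80, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

train_r2 = model.score(X_train, y_train)  # R² on data the model has seen
test_r2 = model.score(X_test, y_test)     # R² on held-out data
gap = train_r2 - test_r2

print("train R2:", train_r2)
print("test R2:", test_r2)
print("gap:", gap)
```

If the two scores are close, the high test score is more likely a sign that the target is almost a linear function of the features than a sign of overfitting.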

You can read about regularisation from here.

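y_pred=regressor.predict(x_test)
#regularizing the linear model
#Ridge(normalize=True) was removed in scikit-learn 1.2, so the features are
#scaled with StandardScaler in a pipeline instead; the scaling differs, so
#the scores will not match the old normalize=True results exactly
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
ridge_reg_1=make_pipeline(StandardScaler(),Ridge(alpha=1))
ridge_reg_1.fit(x_train,y_train)
ridge_reg_1.score(x_test,y_test)   #alpha =1
ridge_reg_05=make_pipeline(StandardScaler(),Ridge(alpha=0.5))
ridge_reg_05.fit(x_train,y_train)
ridge_reg_05.score(x_test,y_test)   #alpha =0.5
ridge_reg_2=make_pipeline(StandardScaler(),Ridge(alpha=2))
ridge_reg_2.fit(x_train,y_train)
ridge_reg_2.score(x_test,y_test)    #alpha =2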

Output
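The three ridge fits can also be written as a loop, which makes it easier to try more alpha values. This is a minimal sketch on synthetic data, not the cereal file. It assumes a `StandardScaler` pipeline for feature scaling, since the `normalize` argument of `Ridge` is no longer available from scikit-learn 1.2 onwards.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data, used only for illustration
X, y = make_regression(n_samples=80, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

scores = {}
for alpha in (0.5, 1, 2):
    # Scale the features, then fit ridge regression with this penalty strength
    model = make_pipeline(StandardScaler(), Ridge(alpha=alpha))
    model.fit(X_train, y_train)
    scores[alpha] = model.score(X_test, y_test)  # R² on the test set

for alpha, r2 in scores.items():
    print("alpha =", alpha, "test R2 =", r2)
```

A larger alpha shrinks the coefficients more strongly, so comparing the test scores across alpha values shows how much regularization the data can take before the fit starts to degrade.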

Conclusion

This article discussed the problem of predicting cereal ratings. We prepared the data for training by label encoding the categorical columns, trained a linear regression model, and then regularised it. To understand the problem better, you can also try different algorithms on the same data; you may get better results as well as a deeper understanding.

Hope you liked the article.


Bhavishya Pandit


Understanding and building fathomable approaches to problem statements is what I like the most. I love talking about conversations whose main plot is machine learning, computer vision, deep learning, data analysis and visualization.

Apart from them, my interest also lies in listening to business podcasts, use cases and reading self help books.
