
Easy Step-by-step Tutorial for Beginners

globalresearchsyndicate by globalresearchsyndicate
January 6, 2020
in Data Analysis

  • Machine Learning
  • Python Machine Learning: Easy Step-by-step Tutorial for Beginners

by: Juliet | 6 January 2020

Photo by Shahadat Rahman

Machine learning has become an essential part of success in both business and research, where its algorithms can automate many routine processes. In this article, we will learn what machine learning is and explore its main algorithms, its applications, and how to use the Python programming language to get started.

For beginners, let’s start with the theoretical background. Machine learning is not new in computing; rather, it is one of the fastest-evolving technologies. In 1943, mathematician Walter Pitts and neurophysiologist Warren McCulloch introduced the first model of a neural network. In 1958, Frank Rosenblatt created the Perceptron, the first artificial neural network, to recognize patterns and shapes. Fast forward to the 21st century: led by big players such as Google, Amazon, and Facebook, the field has seen more and more algorithms, research, and development, which in turn encourages further innovation and invention in computing.

 

What is Machine Learning?

 

Machine learning is one of the hot topics in Artificial Intelligence: it gives computers the ability to learn without being explicitly programmed. Its algorithms allow applications to predict outcomes more accurately without hand-written rules. A machine learning algorithm typically takes input data, applies statistical analysis to predict an output, and updates that output as new data becomes available. The methods involved are closely related to data mining and predictive modeling: both require searching through data to find patterns and adjusting actions accordingly.

One familiar example is social media, which learns from your browsing behavior. You can notice this in the ads that appear on your account: recommendation engines use machine learning to personalize online ads in real time. Aside from recommendation engines, machine learning is also used for spam filtering, network threat detection, and predictive maintenance.

 

Algorithms in Machine Learning

 

Machine learning has a lot of algorithms to explore. These algorithms fall into four groups:

 

  • Supervised Learning
  • Unsupervised Learning
  • Semi-supervised Learning
  • Reinforcement Learning

 

In supervised learning, the computer learns to recognize data from the samples provided, where each sample comes with its correct label. The computer learns from this original data and improves its ability to understand new data.

 

Algorithms under supervised learning include the following:

 

  • Linear Regression
  • Support-vector Machine (SVM)
  • Decision Trees
  • Naïve Bayes classifier
  • K-Nearest Neighbor
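
To make the idea of learning from labeled samples concrete, here is a minimal sketch of K-Nearest Neighbor from the list above, written in plain Python. The function name knn_predict and the height/weight samples are made up for illustration; they are not from any dataset used in this article.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labeled example
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest examples and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up labeled samples: (height_cm, weight_kg) -> label
train_points = [(150, 50), (155, 53), (160, 58), (180, 85), (185, 90), (190, 95)]
train_labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(train_points, train_labels, (158, 55)))
print(knn_predict(train_points, train_labels, (183, 88)))
```

The labeled samples play the role of the "original data": a new point simply takes the label that most of its closest neighbors carry.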

 

In unsupervised learning, a computer program is trained with unlabeled data. The machine itself determines how the input data relate to one another, looking for patterns and relationships within the data sets. This makes it very useful for pattern detection and descriptive modeling, especially when a human doesn’t know what to look for in the data set. An unsupervised algorithm doesn’t determine a right output; instead, it explores the data and draws inferences that describe hidden structure in the unlabeled data.

 

The following algorithms fall under unsupervised learning:

 

  • K-means Clustering
  • Association Rules
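
As a rough illustration of how K-means Clustering finds groups without any labels, the sketch below implements the basic assign-then-update loop with NumPy. The starting centroids (simply the first k points), the fixed iteration count, and the six sample points are simplifying assumptions for this example; library implementations choose their starting points more carefully.

```python
import numpy as np

def kmeans(points, k, n_iter=10):
    """Very small k-means: returns (centroids, cluster index of each point)."""
    # Assumption: start from the first k points
    centroids = points[:k].copy()
    for _ in range(n_iter):
        # Distance of every point to every centroid, shape (n_points, k)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assignment = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it
        for j in range(k):
            members = points[assignment == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assignment

# Made-up, unlabeled points that form two obvious groups
points = np.array([[1.0, 1.0], [9.0, 9.0], [1.2, 0.8], [8.8, 9.1], [0.9, 1.1], [9.2, 8.9]])
centroids, assignment = kmeans(points, k=2)
print(assignment)
print(centroids)
```

No labels are passed in: the two groups emerge only from the distances between the points.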

 

Semi-supervised learning emerged to address the limitations of both supervised and unsupervised learning, and it borrows from both. In some cases, labeling data is expensive because it requires the skills of experts. Semi-supervised learning can therefore also use unlabeled data for training. Usually, this type of machine learning combines a small amount of labeled data with a large amount of unlabeled data.

 

Semi-supervised learning methods rely on one or more of the following assumptions about the data:

 

  • Continuity Assumption
  • Cluster Assumption
  • Manifold Assumption

 

Reinforcement learning, on the other hand, is a method in which a program learns by interacting with its environment: it takes actions and discovers errors or rewards. It is typically characterized by trial-and-error search and delayed rewards. This allows machines and software applications to automatically determine the best behavior within a specific context in order to maximize performance. Reinforcement learning applications include robotic hands, self-driving cars, and computer games.

 

The following algorithms fall under reinforcement learning:

 

  • Temporal Difference(TD)
  • Q-Learning
  • Deep Adversarial Networks
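
To show what trial and error with delayed rewards looks like in code, here is a small sketch of tabular Q-Learning. The environment is hypothetical: a corridor of five positions where only reaching the rightmost one gives a reward. The step function, the constants, and the purely random exploration are all assumptions chosen to keep the example short.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode with reward 1
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor

def step(state, action):
    """Hypothetical environment: a short corridor with the reward at the right end."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

rng = random.Random(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(500):
    state, done = 0, False
    while not done:
        action = rng.choice(ACTIONS)   # explore at random; Q-learning is off-policy
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) toward reward + discounted best next value
        target = reward + (0.0 if done else GAMMA * max(q_table[next_state]))
        q_table[state][action] += ALPHA * (target - q_table[state][action])
        state = next_state

# Greedy policy read off the learned table (the terminal state is skipped)
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

The reward arrives only at the very end of the corridor, yet the update rule passes its value backwards step by step, so the learned table ends up preferring "right" in every position.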

 

How to start Machine learning in Python?

 

Python is one of the best programming languages for machine learning applications. Before we go deeper into Python, you first need to install the following tools to get started. Make sure to install the latest versions.

 

List of tools

 

  • Python Programming Language
  • TensorFlow
  • Keras
  • Shogun
  • Scikit-Learn
  • Theano
  • NLTK
  • Numpy
  • Pandas
  • Anaconda
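
If you are not sure which of these tools are already on your machine, a short check like the one below can help. It is only a sketch: the helper name installed_versions is made up, and the package names are assumed to match the names published on PyPI, so adjust the list to your own setup.

```python
from importlib import metadata

# Distribution names as published on PyPI (an assumption; adjust to your setup)
packages = ["numpy", "pandas", "matplotlib", "seaborn",
            "scikit-learn", "tensorflow", "keras", "nltk"]

def installed_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(packages).items():
    print(f"{name:15s} {version or 'not installed'}")
```

This uses only the standard library (Python 3.8 or later), so it runs even before any of the tools are installed.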

 

As a personal preference, I installed only Anaconda on my machine, since the most basic and relevant frameworks and IDEs are already included. Here’s how Anaconda Navigator looks when you open it.

 

© Screenshot of Anaconda Navigator from the Application

 

Anaconda lets you choose which IDE to use. In my case, I prefer the Spyder IDE over the others. Here’s a sample screenshot of Spyder running a simple “hello world” in Python.

 

© Screenshot of Code Editor from Spyder IDE Python 3.7

 

Python’s Basic Syntax

 

After installing and setting up all the needed tools, let’s start with some basics! Machine learning involves a series of steps and processes. This article focuses only on the mathematical aspect and on data visualization using Python. In the example below, I will be using a supervised learning algorithm.

 

DATA

 

In data science, as in any machine learning program, data is the essential ingredient. Beginners who want to explore the world of data science can download data from various data banks and repositories; I downloaded some of mine from www.kaggle.com. Another option is to personally collect the data you need. For today’s example, I downloaded a mental health dataset from Kaggle containing reported suicide rates from 1985 to 2016.

 

LIBRARIES

 

Python has a lot of libraries to offer for machine learning. In my sample code, I’m using the following libraries.

 

  • NumPy is one of the primary packages in Python used for scientific computation.
  • Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive statistical charts.
  • pandas is an open-source Python library that provides high-performance, easy-to-use data structures and data analysis tools.
  • Matplotlib is a 2D plotting library in Python.

 

To use the functions of these libraries, import them at the top of your code. Here’s a good way to start:

 

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os

 

LOADING DATASETS

 

In Python, if your dataset is in CSV format, you can read it with a predefined pandas function. Reading and displaying data from a CSV file looks like this:

 

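# Build the path to master.csv in the current working directory
curDir = os.getcwd()
data = pd.read_csv(os.path.join(curDir, 'master.csv'))
print(data.head())   # first five rows
print(data.tail())   # last five rows

print(data.sample(5))          # five random rows
print(data.sample(frac=0.1))   # a random 10% of the rows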

 

© Screenshot of Dataset in Python
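
If you don’t have the Kaggle file at hand, you can try the same loading step on a few rows typed directly into the script. The rows below are made up for illustration and only mimic some of the dataset’s column names; they are not real figures.

```python
import io
import pandas as pd

# Made-up rows that mimic a few of the dataset's columns (not real figures)
csv_text = """country,year,sex,suicides_no,population
A,1990,male,10,100000
A,1990,female,4,110000
B,1991,male,7,90000
B,1991,female,2,95000
"""

# read_csv accepts a file-like object as well as a file path
sample = pd.read_csv(io.StringIO(csv_text))
print(sample.shape)      # (rows, columns)
print(sample.head(2))    # first two rows
print(sample.dtypes)     # column types inferred by pandas
```

Working on a tiny table like this makes it easier to see what head(), sample() and the other display functions return before running them on the full dataset.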

 

Here are some other ways to inspect your data, rename its columns, and check for missing values:

 

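print(data.iloc[:,1:5].describe())   # summary statistics for columns 1 to 4
data.info()                          # prints column types and non-null counts by itself
print(data.columns)
# Rename the columns to more readable names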
data=data.rename(columns={'country':'Country','year':'Year','sex':'Gender','age':'Age','suicides_no':'SuicidesNo','population':'Population','suicides/100k pop':'Suicides 100k Population','country-year':'CountryYear','HDI for year':'HDIForYear',' gdp_for_year ($) ':'GdpForYearMoney','gdp_per_capita ($)':'Gdp_Per_Capita','generation':'Generation'})
print(data.columns)
print(data.isnull().any())
print(data.isnull().values.any())
print(data.isnull().sum())
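
Once isnull() shows where the gaps are, you have to decide what to do with them. The sketch below uses a made-up frame with a sparsely filled column to show two common options: dropping the column, or filling the gaps with the column mean. Which one is appropriate depends on your data; the values here are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame with gaps, standing in for a sparsely filled column
df = pd.DataFrame({
    "Year": [1990, 1991, 1992, 1993],
    "HDIForYear": [0.70, np.nan, np.nan, 0.74],
    "Population": [100, 110, 120, 130],
})

missing_per_column = df.isnull().sum()
print(missing_per_column)

# Option 1: drop a column that has too many gaps
without_hdi = df.drop(columns=["HDIForYear"])

# Option 2: keep the column and fill the gaps with its mean
filled = df.fillna({"HDIForYear": df["HDIForYear"].mean()})
print(filled)
```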

 

Simple Linear Regression Algorithm

 

In this example, I’m fitting a simple linear regression between GDP per capita and population. As you can see in the graph, gender is also included in the analysis as the color grouping. Below is the code implemented in Python.

 

sns.lmplot(x="Population", y="Gdp_Per_Capita", hue="Gender", data=data, palette="Set1")
plt.show()

© Screenshot of Linear Regression Algorithm in Python
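
The plot draws the regression line but does not print its numbers. As a minimal sketch of what a simple linear regression computes, the example below fits a line to a handful of made-up points with NumPy and reports the slope, the intercept, and R². The data is invented for illustration and is not taken from the Kaggle dataset.

```python
import numpy as np

# Made-up points lying close to the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# A degree-1 polynomial fit is an ordinary least-squares line
slope, intercept = np.polyfit(x, y, 1)
predicted = slope * x + intercept

# R^2: share of the variance in y explained by the line
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

The same call works on real columns, for example by passing two numeric columns of a DataFrame as x and y.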

 

DATA VISUALIZATION

 

There are many ways to show your data, but my own preference is to present the relevant data as graphs. Python has good tools for this, so here’s how!

The first graph shows suicide rates per 100k population for each year, using a simple scatterplot-style graph (seaborn’s stripplot). Here’s how to implement it in Python.

 

plt.figure(figsize=(10,10))

sns.stripplot(x="Year",y='Suicides 100k Population',data=data, palette="Set1")

plt.xticks(rotation=45)

plt.show()

© Screenshot of Scatterplot Graph in Python

 

The second and last example represents the same values in two ways for educational purposes: as a pie graph and as a bar graph. Notice how the Python code differs for each.

 

f,ax=plt.subplots(1,2,figsize=(18,8))
data['Generation'].value_counts().plot.pie(explode=[0.1,0.1,0.1,0.1,0.1,0.1],autopct='%1.1f%%',ax=ax[0],shadow=True)

© screenshot of Pie graph in Python

 

# Continue on the figure created above: the bar graph goes in the second panel
ax[0].set_title('Generations Summary')
ax[0].set_ylabel('Count')
sns.countplot(x='Generation',data=data,ax=ax[1], palette="Set1")
ax[1].set_title('Generations Count')
plt.show()

 

© Screenshot of Pie graph in Python
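
Both charts are drawn from the same counts: the bar graph shows them directly, while the pie graph shows each count as a share of the total. The short sketch below makes that relationship visible on a made-up list of labels; with the real data you would use data['Generation'] instead.

```python
import pandas as pd

# Made-up category labels; the real column would be data['Generation']
generation = pd.Series(
    ["Boomers", "Silent", "Boomers", "Generation X", "Boomers", "Silent"]
)

counts = generation.value_counts()                 # heights of the bars
shares = generation.value_counts(normalize=True)   # fractions behind the pie labels

print(counts)
print((shares * 100).round(1))                     # percentages, as autopct would show
```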

 

Here’s the entire code implementation in Python:

 

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Nov 4 13:42:48 2019

@author: juliet mendez

- Mental Health Evaluation
- Data from www.kaggle.com

"""

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Load master.csv from the current working directory
curDir = os.getcwd()
data=pd.read_csv(os.path.join(curDir, 'master.csv'))
print(data.head())
print(data.tail())

print(data.sample(5))
print(data.sample(frac=0.1))

# Rename the columns to more readable names
data=data.rename(columns={'country':'Country','year':'Year','sex':'Gender','age':'Age','suicides_no':'SuicidesNo','population':'Population','suicides/100k pop':'Suicides 100k Population','country-year':'CountryYear','HDI for year':'HDIForYear',' gdp_for_year ($) ':'GdpForYearMoney','gdp_per_capita ($)':'Gdp_Per_Capita','generation':'Generation'})

# Check for missing values
print(data.isnull().any())
print(data.isnull().values.any())

print(data.isnull().sum())

print("\n \n \n")

# Simple linear regression plot
sns.lmplot(x="Population", y="Gdp_Per_Capita", hue="Gender", data=data, palette="Set1")
plt.show()
print("\n \n \n")
# Suicide rates per 100k population by year
plt.figure(figsize=(10,10))
sns.stripplot(x="Year",y='Suicides 100k Population',data=data, palette="Set1")
plt.xticks(rotation=45)
plt.show()
print("\n \n \n")
data=data.drop(['HDIForYear','CountryYear'],axis=1)

min_year=min(data.Year)
max_year=max(data.Year)
print('Min Year :',min_year)
print('Max Year :',max_year)

data_country=data[(data['Year']==min_year)]

print("\n \n \n")
f,ax=plt.subplots(1,2,figsize=(18,8))
#pie chart
data['Generation'].value_counts().plot.pie(explode=[0.1,0.1,0.1,0.1,0.1,0.1],autopct='%1.1f%%',ax=ax[0],shadow=True)

#bar graph
ax[0].set_title('Generations Summary')
ax[0].set_ylabel('Count')
sns.countplot(x='Generation',data=data,ax=ax[1], palette="Set1")
ax[1].set_title('Generations Count')
plt.show()
