Gradient descent is one of the most widely used optimization algorithms in machine learning and deep learning. It is an iterative algorithm for finding a local minimum of a function: at each iteration, it takes a step proportional to the negative of the gradient of the function at the current point.
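
To make the update rule concrete, here is a minimal sketch in plain Python. The function name `gradient_descent`, the example function, and the chosen learning rate are illustrative assumptions, not part of the tutorial's code.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move in the direction of the negative gradient, scaled by the learning rate.
        x = x - learning_rate * grad(x)
    return x


# Example: f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

Each iteration shrinks the distance to the minimum by a constant factor here, so the result ends up very close to 3.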

In this tutorial we will use the gradient descent optimization algorithm. Our example data is in CSV format with the columns "height", "weight", "age", "projects", and "salary". Assuming there is a correlation between projects and salary, we will try to predict salary from the number of projects completed. You can download the data using this link: https://drive.google.com/file/d/1Gx0riTlJHt9o_VyokrKNbj384AhwXpAW/view?usp=sharing
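
The correlation assumption can be checked before any training. The sketch below computes the Pearson correlation coefficient in plain Python. The helper name `pearson_r` and the sample values are made up for illustration and are not rows from the CSV. With a pandas DataFrame, `dataframe["projects"].corr(dataframe["salary"])` should give the same kind of number.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, not taken from the tutorial's dataset.
projects = [100, 200, 300, 400, 500]
salary = [50.0, 61.0, 68.0, 82.0, 90.0]
r = pearson_r(projects, salary)
print(r)
```

A value near 1 or -1 suggests a strong linear relationship. A value near 0 suggests that a single-feature linear model will struggle.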

## Initial Setup

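First, we need to load the necessary libraries. Note that the code in this tutorial uses the TensorFlow 1.x API (`tf.logging`, `tf.train.GradientDescentOptimizer`, `tf.contrib`); these names are not available at the same paths in TensorFlow 2.x.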

```
from __future__ import print_function

import math  # Basic mathematical operations

from IPython import display  # Display helpers for IPython/Colab notebooks
from matplotlib import cm  # Colormap reference
from matplotlib import gridspec  # Plot layout
from matplotlib import pyplot as plt  # Plotting
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

from google.colab import drive  # Loading data directly from Google Drive
drive.mount('/content/gdrive')  # Mount the drive

tf.logging.set_verbosity(tf.logging.ERROR)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format
```

## Loading Dataset

Load the dataset as a pandas DataFrame and inspect the first rows and the summary statistics.

```
dataframe = pd.read_csv("/content/gdrive/My Drive/Colab Notebooks/TENSOR_FLOW/train_dataset.csv", sep=",")
dataframe.head()
```

| | height | weight | age | projects | salary |
|---|---|---|---|---|---|
| 0 | -114.3 | 34.2 | 15 | 1015 | 66900 |
| 1 | -114.5 | 34.4 | 19 | 1129 | 80100 |
| 2 | -114.6 | 33.7 | 17 | 333 | 85700 |
| 3 | -114.6 | 33.6 | 14 | 515 | 73400 |
| 4 | -114.6 | 33.6 | 20 | 624 | 65500 |

```
dataframe.describe()
height weight age projects salary
count 17000.0 17000.0 17000.0 17000.0 17000.0
mean -119.6 35.6 28.6 1429.6 207300.9
std 2.0 2.1 12.6 1147.9 115983.8
min -124.3 32.5 1.0 3.0 14999.0
25% -121.8 33.9 18.0 790.0 119400.0
50% -118.5 34.2 29.0 1167.0 180400.0
75% -118.0 37.7 37.0 1721.0 265000.0
max -114.3 42.0 52.0 35682.0 500001.0
```

```
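# Shuffle the rows so that any ordering in the file does not bias training.
dataframe = dataframe.reindex(np.random.permutation(dataframe.index))
# Express salary in thousands so the label has a smaller numeric range.
dataframe["salary"] /= 1000.0
dataframe.head()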
```

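| | height | weight | age | projects | salary |
|---|---|---|---|---|---|
| 11381 | -121.2 | 38.9 | 19 | 1206 | 192.6 |
| 4865 | -118.1 | 34.1 | 50 | 636 | 500.0 |
| 3442 | -117.9 | 33.8 | 35 | 1435 | 200.8 |
| 14934 | -122.2 | 37.8 | 52 | 409 | 189.6 |
| 14925 | -122.2 | 37.8 | 52 | 1659 | 107.9 |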

## Build our First Model

We wish to predict **salary**, which will be our label. We’ll use **projects** as our input feature. To train our model, we’ll use the **LinearRegressor** interface provided by the TensorFlow Estimator API. This API takes care of a lot of the low-level model plumbing and exposes convenient methods for performing model training, evaluation, and inference.

### Step 1: Define Features and Configure Feature Columns

In TensorFlow, we indicate a feature’s data type using a construct called a feature column. Feature columns store only a description of the feature data; they do not contain the feature data itself.

To start, we’re going to use just one numeric input feature, **projects**.

```
my_feature = dataframe[["projects"]]
feature_columns = [tf.feature_column.numeric_column("projects")]
feature_columns
[NumericColumn(key='projects', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]
```

### Step 2: Define the Target

Next, we’ll define our target, which is `salary`. Again, we can pull it from our `dataframe`:

```
targets = dataframe["salary"]
targets
11381   192.6
4865    500.0
3442    200.8
14934   189.6
14925   107.9
         …
7869    269.2
3770    192.9
11859   194.6
10158   167.7
14422   500.0
Name: salary, Length: 17000, dtype: float64
```

### Step 3: Configure the **LinearRegressor**

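Next, we’ll configure a linear regression model using **LinearRegressor**. We’ll train this model using the `GradientDescentOptimizer`, which implements Mini-Batch Stochastic Gradient Descent (SGD). The `learning_rate` argument controls the size of the gradient step. We also wrap the optimizer with `clip_gradients_by_norm`, which caps the norm of the gradients (here at 5.0) so that a single very large gradient cannot destabilize training.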

```
# Plain gradient descent with a very small learning rate.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
# Clip gradients so their norm never exceeds 5.0.
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
```
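
To show what the learning rate and gradient clipping do, here is a framework-free sketch of one mini-batch SGD step for a one-feature linear model. The helpers `clip_by_norm` and `sgd_step`, the toy data, and the hyperparameter values are assumptions for illustration. This is not how TensorFlow implements its optimizer internally.

```python
import math
import random


def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        return [g * max_norm / norm for g in grads]
    return list(grads)


def sgd_step(w, b, batch, learning_rate, max_norm=5.0):
    """One mini-batch step for y = w * x + b with mean squared error loss."""
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    grad_w, grad_b = clip_by_norm([grad_w, grad_b], max_norm)
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Toy, noise-free data on the line y = 2x + 1 (made up for illustration).
random.seed(0)
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b = 0.0, 0.0
for _ in range(5000):
    batch = random.sample(data, 2)  # a mini-batch of two examples
    w, b = sgd_step(w, b, batch, learning_rate=0.01)
print(w, b)
```

On this toy data the weight and bias should approach 2 and 1. Early in training the gradients are large, so clipping limits each step to at most `learning_rate * max_norm`.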

### Step 4: Define the Input Function

To import our salary data into our LinearRegressor, we need to define an input function, which instructs TensorFlow how to preprocess the data, as well as how to batch, shuffle, and repeat it during model training.

First, we’ll convert our pandas feature data into a dict of NumPy arrays. We can then use the TensorFlow Dataset API to construct a dataset object from our data, and then break our data into batches of batch_size, to be repeated for the specified number of epochs (num_epochs).

NOTE: When the default value of num_epochs=None is passed to repeat(), the input data will be repeated indefinitely.

Next, if shuffle is set to True, we’ll shuffle the data so that it’s passed to the model randomly during training. The buffer_size argument specifies the size of the dataset from which shuffle will randomly sample.

Finally, our input function constructs an iterator for the dataset and returns the next batch of data to the **LinearRegressor**.

```
def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    # Convert pandas data into a dict of np arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # Construct a dataset, and configure batching/repeating.
    ds = Dataset.from_tensor_slices((features, targets))  # warning: 2GB limit
    ds = ds.batch(batch_size).repeat(num_epochs)

    # Shuffle the data, if specified.
    if shuffle:
        ds = ds.shuffle(buffer_size=10000)

    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels
```
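
The batching, repeating, and shuffling behaviour described above can be sketched without TensorFlow. The generator below is a simplified stand-in with a hypothetical name, `batches`. Unlike the Dataset API, it shuffles individual examples once per epoch instead of sampling from a fixed-size buffer, so treat it as an illustration of the idea only.

```python
import itertools
import random


def batches(features, targets, batch_size=1, shuffle=True, num_epochs=None, seed=None):
    """Yield (feature_batch, target_batch) pairs; num_epochs=None repeats forever."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            chosen = indices[start:start + batch_size]
            yield [features[i] for i in chosen], [targets[i] for i in chosen]


xs = [10, 20, 30, 40, 50]
ys = [1, 2, 3, 4, 5]

# One unshuffled pass over the data, as used for prediction.
one_pass = list(batches(xs, ys, batch_size=2, shuffle=False, num_epochs=1))

# With num_epochs=None the generator never ends, so take only the first 7 batches.
first_seven = list(itertools.islice(batches(xs, ys, batch_size=2, seed=0), 7))
```

The endless variant matches how training uses the input function: the `steps` argument of `train()`, not the input data, decides when training stops.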

### Step 5: Train the Model

We can now call `train()` on our `linear_regressor` to train the model. We’ll wrap `my_input_fn` in a lambda so we can pass in `my_feature` and `targets` as arguments (see this TensorFlow input function tutorial for more details), and to start, we’ll train for 100 steps.

    _ = linear_regressor.train(
        input_fn=lambda: my_input_fn(my_feature, targets),
        steps=100
    )

### Tweak the Model Hyperparameters and Optimize the Model

For this exercise, we’ve put all the above code in a single function for convenience. You can call the function with different parameters to see the effect.

In this function, we’ll proceed in 10 evenly divided periods so that we can observe the model improvement at each period.

For each period, we’ll compute and graph training loss. This may help you judge when a model has converged, or if it needs more iterations.

We’ll also plot the feature weight and bias term values learned by the model over time. This is another way to see how things converge.

```
def train_model(learning_rate, steps, batch_size, input_feature="projects"):
    periods = 10
    # Integer division so that `steps` passed to train() is a whole number.
    steps_per_period = steps // periods

    my_feature = input_feature
    my_feature_data = dataframe[[my_feature]]
    my_label = "salary"
    targets = dataframe[my_label]

    # Create feature columns.
    feature_columns = [tf.feature_column.numeric_column(my_feature)]

    # Create input functions.
    training_input_fn = lambda: my_input_fn(my_feature_data, targets, batch_size=batch_size)
    prediction_input_fn = lambda: my_input_fn(my_feature_data, targets, num_epochs=1, shuffle=False)

    # Create a linear regressor object.
    my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
    linear_regressor = tf.estimator.LinearRegressor(
        feature_columns=feature_columns,
        optimizer=my_optimizer
    )

    # Set up to plot the state of our model's line each period.
    plt.figure(figsize=(15, 6))
    plt.subplot(1, 2, 1)
    plt.title("Learned Line by Period")
    plt.ylabel(my_label)
    plt.xlabel(my_feature)
    sample = dataframe.sample(n=300)
    plt.scatter(sample[my_feature], sample[my_label])
    colors = [cm.coolwarm(x) for x in np.linspace(-1, 1, periods)]

    # Train the model inside a loop so that we can periodically assess loss.
    print("Training model...")
    print("RMSE (on training data):")
    root_mean_squared_errors = []
    for period in range(0, periods):
        # Train the model, starting from the prior state.
        linear_regressor.train(
            input_fn=training_input_fn,
            steps=steps_per_period
        )
        # Compute predictions on the full training set.
        predictions = linear_regressor.predict(input_fn=prediction_input_fn)
        predictions = np.array([item['predictions'][0] for item in predictions])

        # Compute and record the loss for this period.
        root_mean_squared_error = math.sqrt(
            metrics.mean_squared_error(predictions, targets))
        print(" period %02d : %0.2f" % (period, root_mean_squared_error))
        root_mean_squared_errors.append(root_mean_squared_error)

        # Track the weight and bias, and clip the line to the sampled data range.
        y_extents = np.array([0, sample[my_label].max()])
        weight = linear_regressor.get_variable_value('linear/linear_model/%s/weights' % input_feature)[0]
        bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')
        x_extents = (y_extents - bias) / weight
        x_extents = np.maximum(np.minimum(x_extents,
                                          sample[my_feature].max()),
                               sample[my_feature].min())
        y_extents = weight * x_extents + bias
        plt.plot(x_extents, y_extents, color=colors[period])
    print("Model training finished.")

    # Output a graph of loss metrics over periods.
    plt.subplot(1, 2, 2)
    plt.ylabel('RMSE')
    plt.xlabel('Periods')
    plt.title("Root Mean Squared Error vs. Periods")
    plt.tight_layout()
    plt.plot(root_mean_squared_errors)

    # Output a table with calibration data.
    calibration_data = pd.DataFrame()
    calibration_data["predictions"] = pd.Series(predictions)
    calibration_data["targets"] = pd.Series(targets)
    display.display(calibration_data.describe())

    print("Final RMSE (on training data): %0.2f" % root_mean_squared_error)
```
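
The loss reported for each period is the root mean squared error (RMSE). As a reference for what that number means, here is a plain-Python sketch. The helper name `rmse` and the prediction values are hypothetical; the targets are the first three salaries (in thousands) from the shuffled table earlier.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predictions compared against three target salaries (in thousands).
value = rmse([190.0, 480.0, 210.0], [192.6, 500.0, 200.8])
print(value)
```

Because the errors are squared before averaging, a few large misses dominate the RMSE. The result is in the same units as the label.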

### Training: Achieve an RMSE of 180 or Below

Tweak the model hyperparameters to improve loss and better match the target distribution. If, after 5 minutes or so, you’re having trouble beating an RMSE of 180, check the solution below for a possible combination.

```
train_model(
    learning_rate=0.00002,
    steps=500,
    batch_size=3
)
```
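
When choosing a learning rate, it helps to see how sensitive gradient descent is to it. The sketch below uses a made-up one-parameter problem, not the salary data, and a hypothetical helper `final_loss` to compare a rate that is too small, one that works, and one that is too large.

```python
def final_loss(learning_rate, steps=50):
    """Run gradient descent on f(w) = (w - 4)^2 from w = 0; return the final loss."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * 2.0 * (w - 4.0)  # gradient of (w - 4)^2 is 2 * (w - 4)
    return (w - 4.0) ** 2


for lr in [0.001, 0.1, 1.1]:
    print(lr, final_loss(lr))
```

The smallest rate barely moves in 50 steps, the middle rate converges, and the largest overshoots further on every step and diverges. The same pattern is worth watching for in the RMSE curve that `train_model` plots.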

    Training model...
    RMSE (on training data):
     period 00 : 0.27
     period 01 : 0.27
     period 02 : 0.27
     period 03 : 0.24
     period 04 : 0.27
     period 05 : 0.27
     period 06 : 0.27
     period 07 : 0.18
     period 08 : 0.18
     period 09 : 0.18
    Model training finished.

| | predictions | targets |
|---|---|---|
| count | 17000.0 | 17000.0 |
| mean | 0.1 | 0.2 |
| std | 0.1 | 0.1 |
| min | 0.0 | 0.0 |
| 25% | 0.0 | 0.1 |
| 50% | 0.1 | 0.2 |
| 75% | 0.1 | 0.3 |
| max | 2.2 | 0.5 |