- Importing the Relevant Libraries
- Loading the Data
- Declaring the Dependent and the Independent variables
- Linear Regression Model
- Creating a Linear Regression
- Reshaping x into a matrix (2D object)
- Fitting The Model
- Calculating the R-squared
- Finding the intercept
- Finding the coefficients
- Making predictions
  1. Adding New Apartments
  2. Predicting Price of New Apartments
  3. Creating Summary Table
- Plotting a Scatter Plot
- Plotting Regression Line
  - Finding Coefficient & Intercept
  - Calculating yhat
  - Plotting Regression Line
Note: the dependent variable is 'price' and the independent variable is 'size'.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from sklearn.linear_model import LinearRegression
url = "https://datascienceschools.github.io/real_estate_price_size.csv"
df = pd.read_csv(url)
df.head()
- x: independent variable (input, or feature)
- y: dependent variable (output, or target)
x = df['size']
y = df['price']
print(x.shape)
print(y.shape)
model = LinearRegression()
- Reshape the input into a matrix (two-dimensional array) before fitting the model; scikit-learn expects features with shape (n_samples, n_features)
x = x.values.reshape(-1,1)
x.shape
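A minimal sketch of what `reshape(-1, 1)` does, using made-up sizes rather than the dataset values (the name `sizes` is only for illustration):

```python
import numpy as np

# Illustrative values only, not taken from the dataset
sizes = np.array([650.0, 800.0, 1200.0])
print(sizes.shape)      # 1D array: (3,)

# -1 lets NumPy infer the number of rows; 1 is the single feature column
sizes_2d = sizes.reshape(-1, 1)
print(sizes_2d.shape)   # 2D array: (3, 1)
```

The values are unchanged; only the shape moves from one dimension to a single-column matrix.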
model.fit(x,y)
RSquared = model.score(x,y)
print('R-Squared is:', RSquared)
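For a regressor, `model.score` returns the coefficient of determination, R-squared. As a rough check, it can be recomputed by hand as 1 - SS_res / SS_tot. The sketch below uses synthetic data (an assumption, not the real-estate dataset), and the names `x_demo`, `y_demo` and `demo_model` are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data for illustration only (not the real-estate dataset)
rng = np.random.default_rng(0)
x_demo = rng.uniform(500, 1500, size=50).reshape(-1, 1)
y_demo = 100_000 + 220 * x_demo.ravel() + rng.normal(0, 20_000, size=50)

demo_model = LinearRegression().fit(x_demo, y_demo)
y_pred = demo_model.predict(x_demo)

ss_res = np.sum((y_demo - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)   # total sum of squares
r2_manual = 1 - ss_res / ss_tot

# The two numbers should agree up to floating-point error
print(r2_manual, demo_model.score(x_demo, y_demo))
```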
Intercept = model.intercept_
print('Intercept is:', Intercept)
Coefficient = model.coef_
print('Coefficient is:', Coefficient)
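With a single feature, the fitted intercept and coefficient can be cross-checked against the closed-form least-squares formulas. This is a sketch on small made-up numbers (hypothetical, not the dataset), so only the agreement between the two methods matters, not the values themselves:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up sizes and prices for illustration only
x_demo = np.array([600.0, 750.0, 900.0, 1100.0, 1300.0])
y_demo = np.array([230_000.0, 260_000.0, 300_000.0, 340_000.0, 390_000.0])

# Closed-form simple linear regression:
# slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
x_centered = x_demo - x_demo.mean()
slope = np.sum(x_centered * (y_demo - y_demo.mean())) / np.sum(x_centered ** 2)
intercept = y_demo.mean() - slope * x_demo.mean()

demo_model = LinearRegression().fit(x_demo.reshape(-1, 1), y_demo)

# coef_ is an array with one entry per feature, so index it to get the slope
print(slope, demo_model.coef_[0])
print(intercept, demo_model.intercept_)
```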
- What should the price of an apartment be for sizes of 500, 750 and 1000 sq. ft?
new_apartment = pd.DataFrame({'size': [500,750,1000]})
new_apartment
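# The model was fitted on a NumPy array, so pass the 'size' column as a plain 2D array
model.predict(new_apartment[['size']].values)
# Selecting 'size' explicitly keeps this cell safe to re-run after the new column exists
new_apartment['predicted_price'] = model.predict(new_apartment[['size']].values)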
new_apartment
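One alternative, sketched here under the assumption that the model is fitted on a DataFrame column selection instead of a reshaped array, is to keep the feature name throughout so that new data can be passed as a DataFrame as well. The data and the names `demo_df`, `demo_model` and `new_demo` are hypothetical:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the real dataset
demo_df = pd.DataFrame({
    'size': [600, 750, 900, 1100, 1300],
    'price': [230_000, 260_000, 300_000, 340_000, 390_000],
})

# Double brackets keep 'size' as a one-column DataFrame (already 2D)
demo_model = LinearRegression().fit(demo_df[['size']], demo_df['price'])

new_demo = pd.DataFrame({'size': [500, 750, 1000]})
# Select the feature column explicitly so the summary table can be rebuilt safely
new_demo['predicted_price'] = demo_model.predict(new_demo[['size']])
print(new_demo)
```

With this approach no `reshape` step is needed, because `df[['size']]` is two-dimensional from the start.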
- The scatter plot shows a positive linear relationship between size and price
plt.scatter(x,y)
plt.xlabel('Size',fontsize=20)
plt.ylabel('Price',fontsize=20)
plt.show()
1. Finding Coefficient & Intercept
2. Calculating yhat
3. Plotting Regression Line
Coefficient = model.coef_
Intercept = model.intercept_
print("Coefficient is:", Coefficient, '\nIntercept is:', Intercept)
yhat = Coefficient * x + Intercept
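The manual formula gives the same fitted values as `model.predict`, only with a different shape. A small sketch on made-up numbers (hypothetical, not the dataset) illustrates this:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up sizes and prices for illustration only
x_demo = np.array([600.0, 750.0, 900.0, 1100.0, 1300.0]).reshape(-1, 1)
y_demo = np.array([230_000.0, 260_000.0, 300_000.0, 340_000.0, 390_000.0])

demo_model = LinearRegression().fit(x_demo, y_demo)

# coef_ has shape (1,), so it broadcasts against the (5, 1) input
yhat_manual = demo_model.coef_ * x_demo + demo_model.intercept_
yhat_predict = demo_model.predict(x_demo)

print(yhat_manual.shape, yhat_predict.shape)
print(np.allclose(yhat_manual.ravel(), yhat_predict))
```

Either form can be passed to `plt.plot` as the y-values of the regression line.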
plt.scatter(x,y)
plt.xlabel('Size', fontsize = 20)
plt.ylabel('Price', fontsize = 20)
plt.plot(x, yhat, lw = 4, c ='red', label ='regression line')
plt.legend()  # without this call the 'regression line' label is never shown
plt.show()