Case Study (Human Resources Retention):

Checking and Replacing Missing Values

In [3]:
import pandas as pd
import numpy as np

df = pd.read_csv('hr_satisfaction.csv')

df.head()
Out[3]:
   employee_id  number_project  average_montly_hours  time_spend_company  Work_accident  left  promotion_last_5years  department  salary  satisfaction_level  last_evaluation
0         1003               2                   157                   3              0     1                      0       sales     low                0.38             0.53
1         1005               5                   262                   6              0     1                      0       sales  medium                0.80             0.86
2         1486               7                   272                   4              0     1                      0       sales  medium                0.11             0.88
3         1038               5                   223                   5              0     1                      0       sales     low                0.72             0.87
4         1057               2                   159                   3              0     1                      0       sales     low                0.37             0.52

Checking Missing Values

In [4]:
df.isnull().sum()
Out[4]:
employee_id               0
number_project            0
average_montly_hours      0
time_spend_company        0
Work_accident             0
left                      0
promotion_last_5years     0
department                0
salary                    0
satisfaction_level       27
last_evaluation          27
dtype: int64
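
Only satisfaction_level and last_evaluation have gaps here. When a frame has many columns, it can help to list just the affected columns and the share of rows they are missing. The following is a minimal sketch on a small made-up frame (the name toy and its values are illustrative assumptions, not rows from hr_satisfaction.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for hr_satisfaction.csv
toy = pd.DataFrame({
    "number_project": [2, 5, 7, 5],
    "salary": ["low", "medium", "medium", "low"],
    "satisfaction_level": [0.38, np.nan, 0.11, 0.72],
    "last_evaluation": [0.53, 0.86, np.nan, np.nan],
})

# Count of missing values per column, as in the cell above
missing_counts = toy.isnull().sum()

# Keep only the columns that actually contain missing values
missing_only = missing_counts[missing_counts > 0]

# Fraction of rows that are missing in each column
missing_share = toy.isnull().mean()

print(missing_only)
print(missing_share[missing_only.index])
```

The same two expressions should work unchanged on df, since they rely only on isnull().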

Replacing Missing Values with the Average Values

In [5]:
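# numeric_only=True skips the text columns (department, salary); without it, pandas 2.x raises a TypeError
df.fillna(df.mean(numeric_only=True), inplace=True)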

df.isnull().sum()
Out[5]:
employee_id              0
number_project           0
average_montly_hours     0
time_spend_company       0
Work_accident            0
left                     0
promotion_last_5years    0
department               0
salary                   0
satisfaction_level       0
last_evaluation          0
dtype: int64
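
A variation on the cell above is to impute only the columns known to have gaps, which leaves every other column untouched. The sketch below uses a small made-up frame (toy, cols, and the values are illustrative assumptions, not the notebook's data) and writes to a copy instead of modifying in place:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not rows from hr_satisfaction.csv
toy = pd.DataFrame({
    "department": ["sales", "sales", "hr", "hr"],
    "satisfaction_level": [0.40, np.nan, 0.80, 0.60],
    "last_evaluation": [0.50, 0.90, np.nan, np.nan],
})

# Columns to impute (the two that had missing values in the case study)
cols = ["satisfaction_level", "last_evaluation"]

# Column means; NaN entries are skipped by default
col_means = toy[cols].mean()

# Fill each column's gaps with that column's own mean, on a copy
filled = toy.copy()
filled[cols] = filled[cols].fillna(col_means)

print(filled)
print(filled.isnull().sum())
```

Mean imputation keeps each column's mean unchanged but reduces its variance, so it is worth keeping in mind when the imputed columns feed into later analysis.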