Case Study (autos): Checking Missing Values

In [126]:
import pandas as pd
import numpy as np

# Load the autos dataset into a DataFrame
df = pd.read_csv('carsinfo.csv')

Columns with missing values:

Show every column that contains at least one missing value (first six rows):

In [143]:
# The following columns have at least one missing value

df.loc[:, df.isnull().any()][:6]
Out[143]:
   normalized_losses  bore  stroke  horsepower  peak_rpm    price
0                NaN  3.47    2.68       111.0    5000.0  13495.0
1                NaN  3.47    2.68       111.0    5000.0  16500.0
2                NaN  2.68    3.47       154.0    5000.0  16500.0
3              164.0  3.19    3.40       102.0    5500.0  13950.0
4              164.0  3.19    3.40       115.0    5500.0  17450.0
5                NaN  3.19    3.40       110.0    5500.0  15250.0
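
To see why this selection works, it can help to split it into steps. The sketch below uses a small made-up frame (the name toy and its values are illustrative only, not taken from carsinfo.csv): isnull().any() gives one boolean per column, and .loc[:, mask] keeps only the columns where that boolean is True.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame for illustration; not the carsinfo.csv data
toy = pd.DataFrame({
    "make": ["audi", "bmw", "volvo"],
    "bore": [3.19, np.nan, 3.62],
    "price": [13950.0, 16430.0, np.nan],
})

# One boolean per column: True if the column holds at least one NaN
mask = toy.isnull().any()

# Keep all rows, but only the columns flagged by the mask
cols_with_missing = toy.loc[:, mask]
print(cols_with_missing.columns.tolist())
```

The same mask can be reused elsewhere, for example as mask[mask].index to get just the column names.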

Number of missing values in each column:

In [134]:
def missing_values(dataframe):
    # Count NaNs per column and print the counts, largest first
    print(dataframe.isnull().sum().sort_values(ascending=False))

missing_values(df)
normalized_losses    41
price                 4
stroke                4
bore                  4
peak_rpm              2
horsepower            2
length                0
make                  0
fuel_type             0
aspiration            0
num_of_doors          0
body_style            0
drive_wheels          0
engine_location       0
wheel_base            0
height                0
width                 0
highway_mpg           0
curb_weight           0
engine_type           0
num_of_cylinders      0
engine_size           0
fuel_system           0
compression_ratio     0
city_mpg              0
symboling             0
dtype: int64
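
Raw counts are easier to judge next to the share of rows they represent. As a possible extension, the sketch below returns both the count and the percentage of missing values per column. The function name missing_summary and the toy frame are assumptions made for illustration; they are not part of the original notebook.

```python
import numpy as np
import pandas as pd

def missing_summary(dataframe):
    """Return count and percentage of missing values per column, largest first."""
    counts = dataframe.isnull().sum()
    # The mean of a boolean column is the fraction of True values
    percent = (dataframe.isnull().mean() * 100).round(2)
    summary = pd.DataFrame({"missing": counts, "percent": percent})
    return summary.sort_values("missing", ascending=False)

# Hypothetical toy frame for illustration; not the carsinfo.csv data
toy = pd.DataFrame({
    "make": ["audi", "bmw", "volvo", "saab"],
    "bore": [3.19, np.nan, 3.62, 3.54],
    "price": [np.nan, 16430.0, np.nan, 15040.0],
})
summary = missing_summary(toy)
print(summary)
```

Returning a DataFrame instead of printing lets the result be filtered afterwards, for example summary[summary["missing"] > 0] to hide the complete columns.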