1. Renaming Columns¶
new_column_name = { 'ano' : 'year' , 'mes': 'month', 'estado': 'state',
'numero': 'fire_numbers','encontro': 'date'}
df.rename(columns= new_column_name , inplace=True)
2. Re-arranging Column Order¶
new_column_order = [4,1,0,2,3,]
df = df[df.columns[new_column_order]]
3. Checking data types of specific columns¶
df['fire_numbers'].str.isdigit()
df['fire_numbers'].str.isnumeric()
-It returns AttributeError: Can only use .str accessor with string values!
-To fix this error -> we need to convert column 'fire_numbers' datatype to String
df['fire_numbers'].astype(str).str.isdigit()
df['fire_numbers'].astype(str).str.isnumeric()
4. Removing Text from column values¶
df['fire_numbers'] = df['fire_numbers'].str.strip("Fires")
5. Dealing with Missing Data¶
1. Checking Missing Values -> df.isnull().sum()
2. Decide what to do with Missing Values :
- A. Remove them using -> .dropna(axis=0)
- axis {0 or ‘index’, 1 or ‘columns’}, default 0
0, or ‘index’ : Drop rows which contain missing values
1, or ‘columns’ : Drop columns which contain missing value
- B. Replace them with average of column
DataFrame.fillna(method=None)
- C. Replace them with zeros, or Forward Fill (ffill) or Back Fill (backfill)
1.Replace with zeros
f['number_of_fires'].fillna(0)
2. Replace with Forward Fill (ffill)
Forward Fill df.ffill(axis = 0) -> any missing value is filled with value in the previous row
Forward Fill df.ffill(axis = 1) -> any missing value is filled with value in the previous column
3. Replace with Back Fill (backfill)
Back Fill df.bfill(axis = 0) -> any missing value is filled with value in the next row
Back Fill df.bfill(axis = 1) -> any missing value is filled with value in the next column
- Here, we dropped Missing Values:
df.dropna(axis=0)
6. Changing Datatype of specified column¶
df['fire_numbers'] = df['fire_numbers'].astype(float)
7. Replacing a column value with a new value¶
Month_in_English = {'Janeiro': 'January',
'Fevereiro': 'February',
'Março': 'March',
'Abril': 'April',
'Maio': 'May',
'Junho': 'June',
'Julho': 'July',
'Agosto': 'August',
'Setembro': 'September',
'Outubro': 'October',
'Novembro': 'November',
'Dezembro': 'December'}
df["month"] = df["month"].map(Month_in_English)
8. Converting characters of column values to¶
.str.lower: Converts all characters to lowercase
.str.upper: Converts all characters to uppercase
.str.title: Converts first character of each word to uppercase and remaining to lowercase
.str.capitalize: Converts first character to uppercase and remaining to lowercase
.str.swapcase: Converts uppercase to lowercase and lowercase to uppercase
.str.casefold: Removes all case distinctions in the string
df['state'] = df['state'].str.title()
9. Removing Columns¶
df = df.drop(["month" , "year"], axis=1)
10. Removing Rows¶
- After removing rows you can reset indexes with -> df = df.reset_index(drop=True)
1. Dropping a Row with specified index
df = df.drop(df.index[1])
2. Dropping Multiple Rows with specified indexes
df = df.drop(df.index[[2,3]])
3. Drop A Range of Rows with specified range
df = df.drop(df.index[[1,4]])