import pandas as pd
import os
Import the required libraries
Create a simple customer information data-frame
= {
data 'given name': ['Josiah', 'Fred', 'Julie'],
'family name': ['Maangi', 'Juma', 'Musau'],
'phone':[25471234, 25472345,25473456],
'email': ['josiah@gml.com', 'fred@gml.com','julie@gml.com'],
'sex': ['M','M','F'],
'town': ['Nairobi', 'Kisumu', 'Mombasa']
}
= pd.DataFrame(data) customer_df
Here is how our small data-frame looks:
customer_df
given name | family name | phone | sex | town | ||
---|---|---|---|---|---|---|
0 | Josiah | Maangi | 25471234 | josiah@gml.com | M | Nairobi |
1 | Fred | Juma | 25472345 | fred@gml.com | M | Kisumu |
2 | Julie | Musau | 25473456 | julie@gml.com | F | Mombasa |
Make a copy of customer_df
= customer_df.copy() customer_df_copy
How I want to Rename my columns
I will rename my columns as below:
- given name to First Name
- family name to Last Name
- town to City
Rename a single column
To rename columns, we use the rename function from pandas as follows.
The column we want to rename(given name) will be on the left, and the new name we want to assign to this column(Family Name) will be on the right. The two names should be separated by a colon.
The inplace = True argument means this change happen with our customer_df_copy.
= {'given name': 'Family Name'}, inplace = True) customer_df_copy.rename(columns
Rename multiple columns at once
= {'given name': 'Family Name', 'town': 'City'}, inplace = True) customer_df_copy.rename(columns
Create a dictionary of column mappings from an excel/csv file
Imagine a scenario where you have to rename about 15 columns every time you receive files from a given customers. Using the above approach is not only manual but prone to errors. Since this is a repetitive process;
I will create a file with customer - company column mappings,
convert it into a dictionary, and
use it to rename files as below.
Code
# reading in my column mapping file
= pd.read_excel(r"C:\Users\ADMIN\Downloads\col_mapping.xlsx")
mapping
# creating a data dictionary
= mapping.set_index('customer_details')['company_mapping'].to_dict() col_mapping_dict
Rename using provided column mappings
# I will make a copy of my customer_df again
= customer_df.copy()
renamed_df
#then rename it
= col_mapping_dict, inplace = True) renamed_df.rename(columns
Here is a view of our customer_df vs our renamed_df.
given name | family name | phone | sex | town | |
---|---|---|---|---|---|
Josiah | Maangi | 25471234 | josiah@gml.com | M | Nairobi |
Fred | Juma | 25472345 | fred@gml.com | M | Kisumu |
Julie | Musau | 25473456 | julie@gml.com | F | Mombasa |
First Name | Last Name | Phone No | Email address | Gender | City |
---|---|---|---|---|---|
Josiah | Maangi | 25471234 | josiah@gml.com | M | Nairobi |
Fred | Juma | 25472345 | fred@gml.com | M | Kisumu |
Julie | Musau | 25473456 | julie@gml.com | F | Mombasa |