Pandas – Filter with multiple criteria
Filter with more criteria
When the dataset is large we most often need more filter criteria to bring down the size of the subset. Pandas offers a number of ways to to filter down a large set to a smaller set.
We are using our previous dataset again: world_loans.head(). Loans issued by the IBRD.
Dominica_loans = df[(df['Country'] == 'Dominica') & (df['Original Amount'] < 1000000)]
-Display all loans issued to Dominica that have an original amount of higher thatn USD1m. The output should be in the format of an overview. First we creat a new variable and then set the criteria.not_repaid_by_Belgium = world_loans[(world_loans['Repaid to IBRD'] > 0) & (world_loans['Country'] == 'Belgium')]
– Which project loans have not been repaid back to IBRD by Belgium.letter_F_Transport= world_loans[world_loans['Country'].str.startswith('F') & (world_loans['Project Name'] == 'TRANSPORT')]
– Filter all the countries whose name start with the letter ‘F’ and borrowed funds to fransport projects.date_guarantor = world_loans[(world_loans['First Repayment Date']< '2000-12-31') & (world_loans['Guarantor'] == 'BE')]
– Show all the loans with a first repayment data before 2001 where the guarantor is Belgium (BE).
import pandas as pd
world_loans = pd.read_csv("https://raw.githubusercontent.com/naveen1973/data-analysis-and-visualization-using-python/master/IBRD06.csv")
world_loans = world_loans.set_index('Project ID')
world_loans['Original Amount'].astype(int)
world_loans['First Repayment Date'] = pd.to_datetime(world_loans['First Repayment Date'])
world_loans['Last Repayment Date'] = pd.to_datetime(world_loans['Last Repayment Date'])
world_loans['Agreement Signing Date'] = pd.to_datetime(world_loans['Agreement Signing Date'])
Dominica_loans = world_loans[(world_loans['Country'] == 'Dominica') & (world_loans['Original Amount'] < 1000000)]
print (Dominica_loans[['Country', 'Original Amount']])
# Which project loans have not been repaid back to IBRD by Belgium.
not_repaid_by_Belgium = world_loans[(world_loans['Repaid to IBRD'] > 0) & (world_loans['Country'] == 'Belgium')]
print (not_repaid_by_Belgium[['Country', 'Original Amount']])
print('Total number of unpaid loans by Belgium:')
print (len(not_repaid_by_Belgium[['Country', 'Original Amount']]))
#Filter all the countries whose name start with the letter 'F' and borrowed funds to fransport projects.
letter_F_Transport= world_loans[world_loans['Country'].str.startswith('F') & (world_loans['Project Name'] == 'RECONSTRUCTION')]
print('Countries whose name start with letter_F')
print(letter_F_Transport[['Country', 'Original Amount']])
# - Show all the loans with a first repayment data before 2001 where the guarantor is Belgium (BE).
date_guarantor = world_loans[(world_loans['First Repayment Date']< '2000-12-31') & (world_loans['Guarantor'] == 'BE')]
print('Loans with first repayment date after 31st December 2000 guaranteed by Belgium')
print(date_guarantor[['Country', 'Original Amount','First Repayment Date']])
Try Yourself
Try out some code yourself.
import pandas as pd
world_loans = pd.read_csv("https://raw.githubusercontent.com/naveen1973/data-analysis-and-visualization-using-python/master/IBRD06.csv")
world_loans = world_loans.set_index('Project ID')
world_loans['Original Amount'].astype(int)
world_loans['First Repayment Date'] = pd.to_datetime(world_loans['First Repayment Date'])
# Display all loans issued to Dominica that have an original amount of higher than USD1m.
Dominica_loans = world_loans[(world_loans['Country'] == 'Dominica') & (world_loans['Original Amount'] < 1000000)]
print (Dominica_loans[['Country', 'Original Amount']])
# Which project loans have not been repaid back to IBRD by Belgium.
not_repaid_by_Belgium = world_loans[(world_loans['Repaid to IBRD'] > 0) & (world_loans['Country'] == 'Belgium')]
print (not_repaid_by_Belgium[['Country', 'Original Amount']])
print('Total number of unpaid loans by Belgium:')
print (len(not_repaid_by_Belgium[['Country', 'Original Amount']]))
#Filter all the countries whose name start with the letter 'F' and borrowed funds to fransport projects.
letter_F_Transport= world_loans[world_loans['Country'].str.startswith('F') & (world_loans['Project Name'] == 'RECONSTRUCTION')]
print('Countries whose name start with letter_F')
print(letter_F_Transport[['Country', 'Original Amount']])
# Show all the loans with a first repayment data before 2001 where the guarantor is Belgium (BE).
date_guarantor = world_loans[(world_loans['First Repayment Date']< '2000-12-31') & (world_loans['Guarantor'] == 'BE')]
print('Loans with first repayment date after 31st December 2000 guaranteed by Belgium')
print(date_guarantor[['Country', 'Original Amount','First Repayment Date']])