Pandas – Filter with one criteria
Filter with one criteria
In this how-to we will discuss the filter capabilities in Pandas. Filtering in Pandas can be performed in a number of ways depending on several factors such as the size of the data, the granularity of the data the number of criteria specified and many more.
Let’s check out the dataset first: world_loans.head(). These are loans issued by the IBRD.
Dominica_loans = world_loans[world_loans['Country'] == 'Dominica']
-Display all loans issued to Dominica. The output should be in the format of an overview. First we creat a new variable and then set the criteria.
This can then be used to get the total loan amount.
Dominica_loans['Original Amount'].sum()
not_repaid = world_loans[world_loans['Repaid to IBRD'] > 0]
– Which project loans have not been Repaid back to IBRD.letter_B = world_loans[world_loans['Country'].str.startswith('B')]
– Filter all the countries whose name start with the letter ‘B’.date = world_loans[world_loans['First Repayment Date']> '1952-12-31']
– Show all the loans with a first repayment data after the 31 December 1952.
import pandas as pd
world_loans = pd.read_csv("https://raw.githubusercontent.com/naveen1973/data-analysis-and-visualization-using-python/master/IBRD06.csv")
world_loans = world_loans.set_index('Project ID')
world_loans['Original Amount'].astype(int)
world_loans['First Repayment Date'] = pd.to_datetime(world_loans['First Repayment Date'])
# What are the loans that the country of Dominica has oustanding?
Dominica_loans = world_loans[world_loans['Country'] == 'Dominica']
print (Dominica_loans[['Country', 'Original Amount']])
print ('Total loans add up to USD' + str(Dominica_loans['Original Amount'].sum()))
# Which countries have not repaid their loans to IBRD?
not_repaid = world_loans[world_loans['Repaid to IBRD'] == 0]
print (not_repaid[['Country', 'Original Amount']])
print('Total number of unpaid loans:')
print (len(not_repaid[['Country', 'Original Amount']]))
# Filter all countries whose name start with the letter 'B'.
letter_B = world_loans[world_loans['Country'].str.startswith('B')]
print('Countries whose name start with letter_B')
print(letter_B[['Country', 'Original Amount']])
# Show all loans with a repayment date after the 31st of December 2015
date = world_loans[world_loans['First Repayment Date']> '2005-12-31']
print('Loans with first repayment date after 31st December 2015')
print(date[['Country', 'Original Amount','First Repayment Date']])
Try yourself
Try out some code yourself.
import pandas as pd
world_loans = pd.read_csv("https://raw.githubusercontent.com/naveen1973/data-analysis-and-visualization-using-python/master/IBRD06.csv")
world_loans = world_loans.set_index('Project ID')
world_loans['Original Amount'].astype(int)
world_loans['First Repayment Date'] = pd.to_datetime(world_loans['First Repayment Date'])
# What are the loans that the country of Belgium has oustanding?
Belgium_loans =
# Which countries have a loan higher than USD10 million to IBRD?
10_million =
# Filter all countries whose name start with the letter 'M'.
letter_M
# Show all loans with the first repayment date before the 31st of December 1990
repayment_1990 =