Pandas – sorting

team-member-1

Rishi Sapra

Technical Community leader, speaker, trainer and evangelist specialising in Power BI and Azure. Formally recognised by Microsoft as a Most Valuable Professional (MVP), Fast Track Recognised Solution Architect (FTRSA) and Microsoft Certified Trainer (MCT).

Tags:
Warning: Invalid argument supplied for foreach() in /home/customer/www/learndatainsights.com/public_html/wp-content/themes/twentytwentyone-child/single.php on line 152

Pandas - sorting

Sorting by rows

Pandas has good sorting capabilities. When datasets are very large you want to sort it even by multiple columns.
To do this Pandas, there are the following ways:

  1. sort_values(): sort by one or more columns.
  2. sort_index(): to sort by row index.

Both of these functions have the following options

  • sorting the data frame in specific order (ascending or descending)
  • sorting in place, sorting with missing values
  • what to do with missing values.

We will be working with the dataset ‘world_loans’ to give some examples.

  • world_loans.sort_values(['Country']) – sorts the data frame alphabetically by Country ascending, which is the default.
  • world_loans.sort_values(['Country'], ascending=False)– sorts the data frame alphabetically by Country, descending.

In our dataset we have multiple columns. Check the columns in the dataset: world_loans.columns. Suppose we want to sort by the columns ‘Country’ (descending) and ‘Original Amount’ (ascending).

world_loans.sort_values(['Country', 'Original Amount'], ascending=[False, True])

    – the dataset is now first ordered by country and within each country the original loan amount is ordered.

If we want to make a change to the dataset permanently we use ‘inplace = True’. Like so:
world_loans.sort_values(['Country'], ascending=False, inplace = True) – this changes the dataset.

Missing values
Sometimes your dataset has values that are not available. What do you want to do with that? Pandas give you the option to choose whether you want the missing values, NaN, at the beginning or in the end of the column.
na_position : [{‘first’, ‘last’}, default ‘last’]: ‘first’ puts ‘NaNs’ at the beginning while ‘last’ puts NaNs at the end.
An example
world_loans.sort_values(['Country', na_position : ‘first’])– This will sort the dataset by the ‘Country’ column and will show all the rows that do not have country name defined first.


import pandas as pd
world_loans = pd.read_csv("https://raw.githubusercontent.com/naveen1973/data-analysis-and-visualization-using-python/master/IBRD06.csv")
world_loans = world_loans[['Project ID','Country', 'Guarantor','Original Amount','Type']]
world_loans = world_loans.set_index('Project ID')
world_loans['Original Amount'].astype(int)


sort_by_country =world_loans.sort_values(['Country'])
print ('Sort the dataset by Country and display the first 10 rows.')
print (sort_by_country[['Country', 'Guarantor', 'Type']].head(10))
print('')
print('')
sort_by_country_asc= world_loans.sort_values(['Country'], ascending=False)
print ('Sort the dataset in descending order by Country and display the first 10 rows.')
print (sort_by_country_asc.head(10))
print('')
print('')
sort_by_two_column_criteria = world_loans.sort_values(['Country', 'Original Amount'], ascending=[False, True])
print ('Sort the dataset in descending order by Country and ascending order by original amount. Then, display the first 10 rows.')
print(sort_by_two_column_criteria[['Country', 'Guarantor', 'Original Amount']].head(10))


 

Try Yourself

 

Try out some code yourself.

 


import pandas as pd
world_loans = pd.read_csv("https://raw.githubusercontent.com/naveen1973/data-analysis-and-visualization-using-python/master/IBRD06.csv")
world_loans = world_loans[['Project ID','Status', 'Borrower','Repaid to IBRD','Type']]
world_loans = world_loans.set_index('Project ID')


#check out the new dataframe
world_loans.head()
#Sort the world_loans dataset by 'Borrower' AND 'Status'. Give this dataframe a new name called 'status'




Want to provide feedback on this blog post or enquire about having training for your company/team? Fill in the form below and let us know!

Leave a comment