PadhaiTime Logo
Padhai Time

Get Highest Record Per Category

In some of the situations, the data that we get contains multiple entries for an object.

 

For e.g.

  • There are multiple salary information available for a user
  • There are multiple phone numbers available for a user
  • There are redundant/duplicate records available for a user

 

In all of these situations, we may have to remove some of the rows and we have to make sure that we have a single entry per user. Hence in these cases, it becomes mandatory to be aware about some Python trick to achieve this.

Although there can be ‘n’ no. of ways for solving this problem, however, we are going to show two approaches.

 

Consider the data is available in below form where we have Population information for different states of Countries:

undefined

Expected Output:

Get the single row for each country where the population is highest.

undefined

   

Approach 1:

  • Sort the data based on country name and population in ascending order
  • Drop duplicate records for each Country while keeping the last record
  • All the rows will get deleted except the last one for each category

 

Code:

data.sort_values(['CountryName', 'Population'], ascending=True, inplace = True)
data.drop_duplicates(['CountryName'], keep="last", inplace = True)

 

Alternate Approach:

You can sort the data in descending order and can keep the first row as well. 

 

Code:

data.sort_values(['CountryName', 'Population'], ascending=False, inplace = True)
data.drop_duplicates(['CountryName'], keep="first", inplace = True)

 

Approach 2:

  • Group the data based on Country Name
  • Find the rows which have the highest population
  • Take the index numbers of those rows
  • Finally filter out the rows for those selected indexes

 

Code:

data.iloc[data.groupby('CountryName')['Population'].nlargest(1).index.get_level_values(1)]

 

Python has just lot of these one liner tricks to solve big big problems.

Bengaluru, India
contact.padhaitime@gmail.com
  • We collect cookies and may share with 3rd party vendors for analytics, advertising and to enhance your experience. You can read more about our cookie policy by clicking on the 'Learn More' Button. By Clicking 'Accept', you agree to use our cookie technology.
    Our Privacy policy can be found by clicking here