What is the easiest way to calculate a percent rank “by” an index key?
Foe example, I have a dataset with 3 fields:
Year, State, Income ,
I wish to calculate the rank, by year, by state.
I also wish to calculate the “percent rank”, where I define percent rank as rank/n.
(n is the number of numeric data points within each date-state grouping.)
This is what I am currently doing:
- I create a “group by” field by using the paste function to combine date and state into a field called date_state. I then use the rank function to calculate the rank by date, by state.
- I then add a field called “one” that I set to 1 if the value in income is numeric and to 0 if it is not.
- I then take an aggregate sum of “one”. This gives me a count (n) for each date-state grouping.
- I next use merge to add this count to the table.
- Finally, I calculate the percent rank.
Pr<-rank/n
The merge takes quite a bit of time to process.