Re: [R] Labeling a range of bars in barplot?

From: Dan Bolser <dmb_at_mrc-dunn.cam.ac.uk>
Date: Thu 15 Dec 2005 - 04:16:12 EST

Marc Schwartz (via MN) wrote:

> On Tue, 2005-12-13 at 10:53 +0000, Dan Bolser wrote:
>

>>Hi, I am plotting a distribution of (ordered) values as a barplot. I
>>would like to label groups of bars together to highlight aspects of the
>>distribution. The label for the group should be the range of values in
>>those bars.
>>
>>As this is hard to describe, here is an example;
>>
>>
>>x <- rlnorm(50)*2
>>
>>barplot(sort(x,decreasing=T))
>>
>>y <- quantile(x, seq(0, 1, 0.2))
>>
>>y
>>
>>plot(diff(y))
>>
>>
>>
>>That last plot is to highlight that I want to label lots of the small
>>columns together, and have a few more labels for the bigger columns
>>(more densely labeled). I guess I will have to turn out my own labels
>>using low level plotting functions, but I am stumped as to how to
>>perform the calculation for label placement.
>>
>>I imagine drawing several line segments, one for each group of bars to
>>be labeled together, and putting the range under each line segment as
>>the label. Each line segment will sit under the group of bars that it
>>covers.
>>
>>Thanks for any help with the above!
>>
>>Cheers,
>>Dan.
>
> Dan,
>
> Here is a hint.
>
> barplot() returns the bar midpoints:
>
> mp <- barplot(sort(x, decreasing = TRUE))
>
>      [,1]
> [1,]  0.7
> [2,]  1.9
> [3,]  3.1
> [4,]  4.3
> [5,]  5.5
> [6,]  6.7
>
> There will be one value in 'mp' for each bar in your series.
>
> You can then use those values along the x axis to draw your line
> segments under the bars as you require, based upon the cut points you
> want to highlight.
>
> To get the center of a given group of bars, you can use:
>
>   mean(mp[start:end])
>
> where 'start' and 'end' are the extreme bars in each of your groups.
>
> Two other things that might be helpful. See ?cut and ?hist, noting the
> output in the latter when 'plot = FALSE'.
>
> HTH,
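
A minimal sketch of the hint above, assuming made-up data and an arbitrary group of bars (3 to 6). The names `grp_center` and `y0`, the label text and the vertical offset are illustrative choices only, not part of Marc's reply.

```r
# Illustrative data only; any numeric vector would do
set.seed(1)
x <- sort(rlnorm(10) * 2, decreasing = TRUE)

# barplot() returns the x coordinate of each bar's midpoint
mp <- barplot(x)

# Hypothetical group: bars 3 through 6
start <- 3
end <- 6
grp_center <- mean(mp[start:end])

# Allow drawing below the plot region, then mark the group
op <- par(xpd = TRUE)
y0 <- -0.05 * max(x)   # arbitrary offset just below the axis
segments(mp[start], y0, mp[end], y0, lwd = 2, col = 2)
text(grp_center, y0, labels = "bars 3-6", pos = 1, cex = 0.8)
par(op)
```

Because the midpoints are evenly spaced, the mean over the group is the same point as halfway between the first and last bar of the group.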

Thanks all for help on this question, including those who emailed me off list.

I went with Marc's suggestion above, because I could see how to turn it into code (other, more complete solutions were hard for me to 'reverse engineer').

Here is my solution in full, which I feel gives rather nice output :)

```
## Approximate my data for you to try
x <- sort((runif(70) * 100)^3, decreasing = TRUE)

## Plot the barplot, removing the default label names
mp <- barplot(x, names.arg = rep('', 70))

## Break data range, and count bars per break
my.hist <- hist(x, plot = FALSE,
                ## Pick the (approximate) number of labels
                ## NB: using quantiles is incorrect here
                breaks = 4)

## Check for sanity
## points(mp[length(mp)], x[length(mp)], col = 2)

## Counts become new 'breaks'
my.new.breaks <- my.hist$counts

## Some formatting stuff
my.names <- sprintf("%.1d", my.hist$breaks)

## Allow drawing outside the plot region
op <- par(xpd = TRUE)

i <- length(mp)   # Note we label from right to left
q <- 1            # Index of the lower break of the current group

for (j in my.new.breaks) {
  if (j > 0) {        # Skip empty bins, which have no bars to label
    st <- i           # Rightmost bar in this group
    en <- i - j + 1   # Leftmost bar in this group
    ## Line segment under the group
    segments(mp[st], -50000, mp[en], -50000, lwd = 2, col = 2)
    ## Range label under the segment
    text(mean(mp[st:en]), -100000, pos = 1,
         paste(paste(my.names[q], "-", sep = " "),
               my.names[q + 1], sep = "\n"), cex = 0.6)
  }
  i <- i - j
  q <- q + 1
}

## Restore the previous graphics settings
par(op)
```
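
The right-to-left bookkeeping in the loop can also be worked out in one step with cumsum(). This is only a sketch of an alternative, assuming the same kind of data as above; `counts`, `starts`, `ends`, `lower`, `upper` and `groups` are hypothetical names introduced here.

```r
set.seed(42)
x <- sort((runif(70) * 100)^3, decreasing = TRUE)
h <- hist(x, plot = FALSE, breaks = 4)

# Bins run from the smallest values upward, while the sorted bars run
# from the largest downward, so reverse the counts to go left to right
counts <- rev(h$counts)

# Index of the last and first bar in each group, left to right
ends <- cumsum(counts)
starts <- ends - counts + 1

# Lower and upper break for each group, in the same left-to-right order
lower <- rev(head(h$breaks, -1))
upper <- rev(tail(h$breaks, -1))

# Drop empty bins, which have no bars to label
groups <- data.frame(starts, ends, lower, upper)[counts > 0, ]
print(groups)
```

Each row of `groups` then gives the bar indices to pass to segments() and mean(mp[start:end]), plus the two values for the range label.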

You should see that the density of labels follows the range of the data: regions of the plot whose bars span a bigger range of values get more labels (hopefully not too densely packed).
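
A small sketch of why quantile breaks would not give this effect, assuming the same kind of simulated data. The names `eq` and `qt` and the choice of quartiles are illustrative only.

```r
set.seed(7)
x <- sort((runif(70) * 100)^3, decreasing = TRUE)

# Equal-width breaks over the data range, as in the solution above
eq <- hist(x, plot = FALSE, breaks = 4)

# Quartile breaks, for comparison only
qt <- hist(x, plot = FALSE,
           breaks = unname(quantile(x, seq(0, 1, 0.25))))

eq$counts  # uneven number of bars per group
qt$counts  # close to 70 / 4 bars in every group
```

With quantile breaks every label would cover about the same number of bars, so the labels would be evenly spaced along the x axis regardless of how quickly the values change.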

> Marc Schwartz

Cheers,
Dan.

R-help@stat.math.ethz.ch mailing list