Re: [R] A comment about R:

From: Marwan Khawaja <mk36_at_aub.edu.lb>
Date: Wed 04 Jan 2006 - 20:47:55 EST


Dear Bob,
The reasons you mentioned are supposedly good features in R -- not giving lots of output you do not necessarily need. I guess the question is why do you want R to produce what you get from SPSS? SPSS is hardly a gold standard in statistical software.
But I agree that it is quite difficult for users of SPSS to unlearn SPSS (or SAS) while using R.

Best,
Marwan



Marwan Khawaja http://staff.aub.edu.lb/~mk36

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Bob Green
> Sent: Wednesday, January 04, 2006 3:37 AM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] A comment about R:
>
>
> >Hello,
>
>
> >Unlike most posts on the R mailing list, I feel qualified to
> >comment on this one. For about 3 months I have been trying to
> >learn to use R, after having used various versions of SPSS for
> >about 10 years.
>
>
> I think it is far too simplistic to ascribe non-use of R to
> laziness. This may well be the case for some; however, I
> have read 5-6 books on R, waded through on-line resources,
> read the documentation and asked multiple questions by
> e-mail - and still find even some of the basics very difficult.
>
> There are several reasons for this:
>
> 1. For some tasks R is extremely user-unfriendly. Some
> comparative examples:
>
> (a) In running a chi-square analysis in SPSS the following
> syntax is included
>
> /STATISTIC=CHISQ
> /CELLS= COUNT EXPECTED ROW COLUMN TOTAL RESID .
>
> this produces expected and observed counts, row & column
> percentages, residuals, chi-square & Fisher's exact test +
> other output.
>
> In R, it is a herculean task to produce similar output. It
> certainly can't be produced in 2 lines, as far as I can tell.
>
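For what it is worth, most of the pieces requested by /CELLS can be pulled from the object that chisq.test() returns, with prop.table() for the percentages. The following is only a minimal sketch: the table `tab` and its labels are made up for illustration, and mapping RESID to the raw (observed minus expected) residuals is my assumption about what SPSS prints.

```r
# Hypothetical 2 x 3 table of counts; labels and numbers are invented.
tab <- matrix(c(12, 5, 9, 7, 14, 10), nrow = 2,
              dimnames = list(group  = c("a", "b"),
                              answer = c("no", "maybe", "yes")))

ct <- chisq.test(tab)

out <- list(
  observed = ct$observed,                 # COUNT
  expected = ct$expected,                 # EXPECTED
  row.pct  = 100 * prop.table(tab, 1),    # ROW percentages
  col.pct  = 100 * prop.table(tab, 2),    # COLUMN percentages
  tot.pct  = 100 * prop.table(tab),       # TOTAL percentages
  resid    = ct$observed - ct$expected,   # RESID (raw residuals, assumed)
  chisq    = ct,                          # Pearson chi-square test
  fisher   = fisher.test(tab)             # Fisher's exact test
)
print(out)
```

It is more than two lines, but wrapping the list() call in a small function would make it a one-line call for any table.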
> (b) in SPSS if I want to compare multiple variables by a
> single dependent variable this is readily performed
>
> CROSSTABS
> /TABLES=baserdis baserenh basersoc baseradd socbest
> disbest entbest addbest worsdis worsphy by group
>
> I used the chi-square example again, but the same applies for
> a t-test. I started looking into how to do something similar
> in R, with the t-test command but gave up. R does force the
> user to take a more considered approach to analysis.
>
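One way to approximate the CROSSTABS call with several variables "by group" is to loop over the column names with lapply(). This is a sketch under stated assumptions: the data frame `dat` is simulated, and only three of the variable names from the SPSS syntax are reused, purely for illustration.

```r
# Simulated data: three yes/no items and a two-level grouping variable.
set.seed(1)
dat <- data.frame(
  group    = factor(rep(c("a", "b"), each = 40)),
  baserdis = sample(c("no", "yes"), 80, replace = TRUE),
  baserenh = sample(c("no", "yes"), 80, replace = TRUE),
  basersoc = sample(c("no", "yes"), 80, replace = TRUE)
)
vars <- c("baserdis", "baserenh", "basersoc")

# One cross-tabulation and one chi-square test per variable.
tabs  <- lapply(dat[vars], function(v) table(v, dat$group))
tests <- lapply(tabs, chisq.test)

# Collect the p-values into a named vector.
pvals <- sapply(tests, function(tt) tt$p.value)
print(tabs)
print(round(pvals, 3))
```

The same pattern should carry over to t-tests on numeric columns, e.g. lapply(dat[numvars], function(v) t.test(v ~ dat$group)), where `numvars` would be a hypothetical vector of numeric column names.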
> (c) Obtaining a correlation matrix in R with both the
> correlation & p-value is no simple task -
>
> In SPSS this is obtained via:
>
> GET
> FILE='D:\a study\data\dat\key data\master data.sav'.
> NONPAR CORR
> /VARIABLES= goodnum badnum good5 bad5 avfreq avdayamt
> /PRINT=KENDALL TWOTAIL
> /MISSING=PAIRWISE .
>
> In R something like this is required -
>
> > by(mydat, mydat$group, function(x) {
> +   nm <- names(x)
> +   rho <- matrix(, 6, 2)
> +   rho.nm <- matrix(, 6, 2)
> +   k <- 1
> +   for (i in 2:4) {
> +     for (j in (i + 1):5) {
> +       x.i <- x[, i]
> +       x.j <- x[, j]
> +       ct <- cor.test(x.i, x.j, method = "kendall",
> +                      alternative = "two.sided")
> +       rho[k, 1] <- ct$estimate
> +       rho[k, 2] <- round(ct$p.value, 3)
> +       rho.nm[k, ] <- c(nm[i], nm[j])
> +       k <- k + 1
> +     }
> +   }
> +   rho <- cbind(as.data.frame(rho.nm), as.data.frame(rho))
> +   names(rho) <- c("freq.i", "freq.j", "cor", "p-value")
> +   rho
> + })
>
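The double loop above can be shortened by letting combn() enumerate the pairs of columns. The sketch below uses simulated data; the column names are borrowed from the NONPAR CORR call only for illustration, and the result is a long table (one row per pair) rather than a square matrix.

```r
# Simulated numeric data; values are random, names are illustrative.
set.seed(2)
mydat <- data.frame(goodnum = rnorm(30), badnum = rnorm(30),
                    good5   = rnorm(30), bad5   = rnorm(30))

# All unordered pairs of column names, as a list of length-2 vectors.
pairs <- combn(names(mydat), 2, simplify = FALSE)

# One Kendall test per pair, collected into a data frame.
res <- do.call(rbind, lapply(pairs, function(p) {
  ct <- cor.test(mydat[[p[1]]], mydat[[p[2]]], method = "kendall")
  data.frame(var.i   = p[1],
             var.j   = p[2],
             tau     = unname(ct$estimate),
             p.value = round(ct$p.value, 3))
}))
print(res)
```

As far as I know, cor.test() drops incomplete pairs one test at a time, which should be close to /MISSING=PAIRWISE. To get one table per group, the lapply() body could be wrapped in a function and passed to by(), as in the quoted code.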
> 2) It is not always clear what the output produced by R is.
> The Mann-Whitney U-test is a good example. In R, it seems a
> standardised value is obtained. I was advised that it is easy
> enough to check this as R is open-source, but at least for
> me, I don't believe I would understand this code anyway. It
> is confusing when comparative programs such as R and SPSS
> produce dissimilar results. For the user it is important to
> be able to fairly easily reconcile such differences, to
> engender confidence in results.
>
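On reconciling the Mann-Whitney output: to my understanding, wilcox.test() reports W, which is the rank sum of the first sample minus n1(n1 + 1)/2, i.e. the U statistic for the first sample, and not a standardised value. Other packages may print the smaller of the two U values together with a z score; how SPSS labels its output is an assumption on my part. The sketch below, on made-up data without ties, computes each of these by hand so they can be compared.

```r
# Made-up samples without ties.
x <- c(1.1, 2.3, 3.8, 4.0, 5.6, 7.2)
y <- c(2.9, 4.7, 6.1, 8.4, 9.0, 9.5, 10.2)
n1 <- length(x)
n2 <- length(y)

wt <- wilcox.test(x, y)
W  <- unname(wt$statistic)

# Rank sum of the first sample within the pooled data.
R1 <- sum(rank(c(x, y))[seq_len(n1)])

U1 <- R1 - n1 * (n1 + 1) / 2   # Mann-Whitney U for x
U2 <- n1 * n2 - U1             # Mann-Whitney U for y

# Normal approximation, without continuity or tie correction.
z <- (U1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

print(c(W = W, U1 = U1, U2 = U2, U.min = min(U1, U2), z = z))
```

If another program's U differs from R's W, checking whether it equals n1 * n2 - W is a quick first step.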
> 3) I find the help files in R quite difficult to understand.
> For example, see help(t.test). It is almost assumed by the
> examples that you know what to do. Personally, I would find
> some form of simple decision tree easier - e.g. If you want to
> perform a t-test with the dependent variable in one column
> and the grouping variable in another, use t.test(AVFREQ ~ GROUP).
> If you want to perform a t-test with the dependent variable
> in separate columns (each column representing a different
> group), use t.test(AVFREQ1, AVFREQ2).
>
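The two layouts described above can be checked against each other directly. This is a small sketch on simulated data; AVFREQ, GROUP, AVFREQ1 and AVFREQ2 are taken from the text, while the group labels and values are invented.

```r
# Simulated data in "long" layout: outcome in one column, group in another.
set.seed(3)
dat <- data.frame(
  AVFREQ = c(rnorm(20, mean = 10), rnorm(20, mean = 12)),
  GROUP  = factor(rep(c("g1", "g2"), each = 20))
)

# Long layout: formula interface.
t.long <- t.test(AVFREQ ~ GROUP, data = dat)

# Wide layout: one vector per group.
AVFREQ1 <- dat$AVFREQ[dat$GROUP == "g1"]
AVFREQ2 <- dat$AVFREQ[dat$GROUP == "g2"]
t.wide <- t.test(AVFREQ1, AVFREQ2)

print(t.long)
# Both calls should give the same test statistic.
print(all.equal(unname(t.long$statistic), unname(t.wide$statistic)))
```

Both calls default to the Welch test (unequal variances); adding var.equal = TRUE gives the classical pooled-variance t-test, which may be closer to what other packages report first.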
> 4) My initial approach to using R was to run commands I had
> used commonly in SPSS and compare the results. I have only
> got as far as basic ANOVA.
> This has been time-consuming and at times it has been
> difficult to obtain advice. Some people on the R list have
> been extremely generous with their time and knowledge, and I
> have much appreciated this assistance. At other times I see
> responses met with something like arrogance. With the
> sophistication of R, there is also an elitism. This is a
> barrier to R being more widely accepted and used.
>
> 5) differences in terminology - this is just part of the
> learning process, but I still found it took quite some time
> to work out simple commands and what different analyses were called.
>
> 6) system administrators may be wary of freeware.
>
> No doubt for the sophisticated user, my comments may seem
> trite and easily resolved, however I believe my comments have
> some relevance as to why R is not more readily used or accepted.
>
>
> Bob Green
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



Received on Wed Jan 04 21:01:20 2006
