Re: [R] how to import such data to R?

From: John Fox <jfox_at_mcmaster.ca>
Date: Mon 17 Oct 2005 - 20:55:52 EST


Dear ronggui,

I didn't find any attachments, but using the data lines in your message, and assuming that . represents missing data, the following appears to do what you want:

as.data.frame(scan("c:/temp/ronggui.txt",
    list(year=1, apps=1, top25=1, ver500=1,
        mth500=1, stufac=1, bowl=1, btitle=1, finfour=1, lapps=1, d93=1,
        avg500=1, cfinfour=1, clapps=1, cstufac=1, cbowl=1, cavg500=1,
        cbtitle=1, lapps_1=1, school="", ctop25=1, bball=1, cbball=1),
    na.strings="."))

See ?scan for details.
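
As a minimal sketch of why this works, the example below reads the first two records from the message through scan()'s text argument instead of a file. The names records, what, and dat are illustrative only, and the text argument assumes a reasonably recent version of R. Because what is a list, scan() fills one field per component in order and carries on across line breaks, so it does not matter how many lines each case is spread over.

```r
# First two cases copied from the message, wrapped as they appear there.
records <- c(
  '1992 6245 49 . . 20',
  '1',
  '0 0 8.739536 0 . .',
  '.',
  '. . . . .',
  '"alabama"',
  '. 0 .',
  '1993 7677 58 . . 15',
  '1',
  '0 0 8.945984 1 . 0',
  '.2064476',
  '-5 0 . 0 8.739536',
  '"alabama"',
  '9 0 0'
)

# One component per variable: 1 marks a numeric field, "" a character field.
what <- list(year=1, apps=1, top25=1, ver500=1,
             mth500=1, stufac=1, bowl=1, btitle=1, finfour=1, lapps=1, d93=1,
             avg500=1, cfinfour=1, clapps=1, cstufac=1, cbowl=1, cavg500=1,
             cbtitle=1, lapps_1=1, school="", ctop25=1, bball=1, cbball=1)

# A lone "." is read as NA; a value such as .2064476 is still read as a number.
dat <- as.data.frame(scan(text = records, what = what,
                          na.strings = ".", quiet = TRUE))

str(dat)
```

Each record supplies 23 fields, so the result should have one row per case and one column per variable, with school as the only non-numeric column.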

I hope this helps,
 John

On Sat, 15 Oct 2005 15:57:42 +0800
 ronggui <042045003@fudan.edu.cn> wrote:
> The data file has the following structure:
>
> 1992 6245 49 . . 20
> 1
> 0 0 8.739536 0 . .
> .
> . . . . .
> "alabama"
> . 0 .
> 1993 7677 58 . . 15
> 1
> 0 0 8.945984 1 . 0
> .2064476
> -5 0 . 0 8.739536
> "alabama"
> 9 0 0
> 1992 13327 57 36 58 16
> 0
> 0 0 9.497547 0 47 .
> .
> . . . 0 .
> "arizona"
> . 0 .
> 1993 19860 57 36 58 16
> 1
> 1 0 9.896463 1 47 0
> .3989162
> 0 1 0 1 9.497547
> "arizona"
> 0 1 1
> 1992 10422 37 28 58 20
> 0
> 0 0 9.251675 0 43 .
> .
> . . . -1 . "arizona
> state"
> . 0 .
>
> ------snip-----
>
> The data description is:
>
> variable names:
>
> year apps top25 ver500 mth500 stufac bowl btitle
> finfour lapps d93 avg500 cfinfour clapps cstufac cbowl
> cavg500 cbtitle lapps_1 school ctop25 bball cbball
>
>
> Obs: 118
>
> 1. year 1992 or 1993
> 2. apps # applics for admission
> 3. top25 perc frosh class in 25th high sch percen
> 4. ver500 perc frosh >= 500 on verbal SAT
> 5. mth500 perc frosh >= 500 on math SAT
> 6. stufac student-faculty ratio
> 7. bowl = 1 if bowl game in prev year
> 8. btitle = 1 if men's cnf chmps prev year
> 9. finfour = 1 if men's final 4 prev year
> 10. lapps log(apps)
> 11. d93 =1 if year = 1993
> 12. avg500 (ver500+mth500)/2
> 13. cfinfour change in finfour
> 14. clapps change in lapps
> 15. cstufac change in stufac
> 16. cbowl change in bowl
> 17. cavg500 change in avg500
> 18. cbtitle change in btitle
> 19. lapps_1 lapps lagged
> 20. school university name
> 21. ctop25 change in top25
> 22. bball =1 if btitle or finfour
> 23. cbball change in bball
>
>
> So every four lines represent one case, and some variables are
> numeric and some are character.
> I thought scan could read it in, but it seems somewhat tricky because
> of the mixed variable types. Any suggestions?
>
> The attachment is the raw data and the description of the data.
>
>
> 2005-10-15
>
> ------
> Department of Sociology
> Fudan University
>
> My new mail address is ronggui.huang@gmail.com
> Blog:http://sociology.yculblog.com



John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Mon Oct 17 21:03:50 2005