Re: [R] Help with parsing a data file

From: Peter Alspach <PAlspach_at_hortresearch.co.nz>
Date: Fri, 07 Mar 2008 08:41:19 +1300

Sean

I'm sure there are many ways of doing this. I assume you have read the data into R as a data.frame with 24 columns and 2+1+13+(1+13)*n rows, where n is the number of years, and that you want a data.frame with 25 columns (one extra for the year) and 13*n rows, with the columns named appropriately (although I am not sure why there are 13 months).

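# First create your new data.frame: 13 rows per year, 25 columns.
# With 2+1+13 = 16 rows before the first year, n = (nrow(oldDF)-16)/14.
newDF <- as.data.frame(matrix(NA, 13*(nrow(oldDF)-16)/14, 25,
                       dimnames=list(NULL, c('year', as.character(unlist(oldDF[3,]))))))
# Now fill the year column (column 1); the year rows are 17, 31, 45, ...
newDF[,1] <- rep(oldDF[seq(17, nrow(oldDF), 14),1], each=13)
# And finally deal with the data: drop the 16 leading rows, then each year row
newDF[,-1] <- oldDF[-(1:16),][c(FALSE, rep(TRUE,13)),]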

The above is untested and not guaranteed to work, but hopefully is enough to get you going. If not, you could get back to me privately.
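
Another possibility, sketched here under assumptions rather than tested on the real file, is to skip the intermediate data.frame and work on whitespace-separated tokens, which also sidesteps records that wrap over several physical lines. The helper name parse_blocks and the toy input are made up for illustration. The layout assumed is the one described in the original message: two header lines, the column names, one block of records to skip, then (year, block of records) repeated. For the real file the defaults n_col = 24 and n_row = 13 would apply, with scan() pointed at the file instead of text=.

```r
# Hypothetical helper: split tokens into one data.frame per year.
# Assumes: n_col column names, one leading block of n_row records to skip,
# then (year, n_row records) repeated until the end.
parse_blocks <- function(tokens, n_col = 24, n_row = 13) {
  block_len <- n_row * n_col
  col_names <- make.unique(tokens[seq_len(n_col)])  # 'FL' occurs more than once
  rest <- tokens[-seq_len(n_col + block_len)]       # drop names and first block
  stopifnot(length(rest) %% (block_len + 1) == 0)   # layout sanity check
  m <- matrix(rest, nrow = block_len + 1)           # one column per year
  blocks <- lapply(seq_len(ncol(m)), function(j) {
    df <- as.data.frame(matrix(m[-1, j], ncol = n_col, byrow = TRUE),
                        stringsAsFactors = FALSE)
    names(df) <- col_names
    df[] <- lapply(df, type.convert, as.is = TRUE)  # numbers where possible
    df
  })
  names(blocks) <- m[1, ]                           # index the list by year
  blocks
}

# Toy input with 4 columns and 2 records per block (not the real data)
toy <- c("header line one", "header line two",
         "MO A FL FL",
         "1 10 K5 K7", "2 20 K5 K7",
         "1991",
         "1 11 I5 I6", "2 21 I5 I6",
         "1992",
         "1 12 I5 I6", "2 22 I5 I6")
tokens <- scan(text = toy, what = "", skip = 2, quiet = TRUE)
res <- parse_blocks(tokens, n_col = 4, n_row = 2)
res[["1992"]]
```

The result is a list indexed by year, so res[["1992"]] is the block for that year. Duplicate column names are made unique (FL, FL.1, ...) so that each column can be addressed by name.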

Peter Alspach

> -----Original Message-----
> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of sean
> Sent: Friday, 7 March 2008 8:06 a.m.
> To: r-help_at_r-project.org
> Subject: [R] Help with parsing a data file
>
> Hi All,
>
> I need to parse data from a file, example shown below. The
> first two lines can be skipped, the third line contains the
> column names. The next 13 lines can be skipped. The next
> line "1991" is a year value, with the following 13 values
> data for that year. The file then repeats this format with
> (year, 13 lines of data for that year). I would ideally like
> to end up with an array/list/vector of the block of 13
> values, indexed by year, each block using the column names
> given on the third line.
>
> If anyone has any good ideas on how to do this in R, pls. let me know.
>
> Thanks,
> Sean
>
> --------------------------------------------------------------
> --------------------------------------------------------------
> -----------------------------------------------------------
> 725280 BUFFALO NIAGARA INTL A NY -5 N42 56 W078 44 215 988
> 1991-2005
> MO AVGLO FL SDGLO AVDIR FL SDDIR AVDIF FL SDDIF AVETR AETRN TOT OPQ
> H2O TAU MAX_T MIN_T AVG_T AVGDT RH HTDD CLDD AVWS
> 1 1336 K5 222 1534 K7 676 837 K5 72 3806 13256
> 8.4 8.1 0.83
> 0.09 -0.52 -7.40 -3.86 -3.36 75 691 0 5.5
> 2 2261 K5 400 2691 K7 1026 1129 K5 74 5330 14714
> 7.6 7.1 0.79
> 0.10 0.97 -6.67 -2.76 -1.85 73 599 0 5.1
> 3 3249 K5 413 3207 K7 852 1578 K5 118 7428 16443
> 7.2 6.7 0.98
> 0.12 4.96 -3.21 0.93 2.04 71 541 0 4.9
> 4 4460 K5 570 4051 K6 1045 1951 K5 130 9509 18140
> 6.6 6.1 1.33
> 0.13 12.18 2.68 7.39 8.54 67 328 1 4.7
> 5 5484 K5 518 4529 K6 801 2408 K5 142 10999 19523
> 6.1 5.4 1.87
> 0.15 18.77 8.68 13.83 15.07 69 154 12 4.5
> 6 6046 K5 383 5011 K6 671 2567 K5 166 11616 20177
> 5.7 4.9 2.64
> 0.16 24.05 14.52 19.47 20.66 71 34 63 4.1
> 7 5793 K5 529 4734 K6 884 2537 K5 127 11250 19734
> 5.6 4.9 2.97
> 0.16 26.10 16.92 21.70 22.90 71 5 104 4.1
> 8 5057 K5 417 4390 K6 693 2245 K5 94 9974 18430
> 5.6 4.9 2.92
> 0.15 25.61 16.37 21.10 22.56 73 9 91 3.6
> 9 4001 K5 458 3864 K6 826 1797 K5 105 8078 16803
> 5.6 5.0 2.36
> 0.13 21.73 12.20 17.06 18.68 73 71 30 3.9
> 10 2502 K5 254 2564 K7 584 1306 K5 88 5948 15098
> 6.3 5.8 1.67
> 0.11 14.89 6.40 10.71 12.16 72 241 3 4.3
> 11 1395 K5 198 1394 K7 492 887 K5 47 4170 13545
> 7.9 7.5 1.30
> 0.10 8.37 1.42 4.91 5.82 73 403 0 5.0
> 12 1120 K5 173 1391 K7 475 701 K5 52 3351 12733
> 7.9 7.7 0.94
> 0.09 2.41 -3.92 -0.70 -0.03 75 592 0 5.0
> 13 3559 K5 201 3280 K7 383 1662 K5 50 7622 16550
> 6.7 6.2 1.72
> 0.12 13.29 4.83 9.15 10.27 72 3668 304 4.5
> 1991
> 1 1313 I5 637 1374 I6 1636 832 I5 169 3800 13249
> 8.2 7.8 0.75
> 0.07 -0.09 -6.67 -3.46 -2.94 73 673 0 5.9
> 2 1875 I5 887 1767 I6 2080 1137 I5 263 5310 14694
> 8.3 7.6 0.85
> 0.08 2.44 -3.84 -0.61 0.15 73 533 0 5.9
> 3 3205 I5 1520 3371 I6 3133 1458 I5 392 7395 16417
> 6.7 6.1 1.12
> 0.10 7.23 -1.17 2.75 3.75 70 474 0 5.3
> 4 3999 I5 1911 3451 I6 3501 1918 I5 521 9482 18116
> 6.9 5.9 1.60
> 0.12 14.46 5.65 9.91 11.04 68 250 2 5.4
> 5 5968 I5 1854 5369 I6 2936 2296 I5 437 10983 19506
> 6.1 4.5 2.46
> 0.14 23.15 12.56 17.85 19.12 68 81 66 4.8
> 6 6988 I5 1577 6761 I6 2983 2288 I5 604 11614 20176
> 4.8 3.0 2.42
> 0.15 26.09 14.95 20.80 22.28 64 14 80 4.3
> 7 6364 I5 1538 5779 I6 2799 2404 I5 568 11262 19749
> 5.0 3.7 2.89
> 0.16 27.17 16.96 22.43 23.77 66 1 116 4.4
> 8 5407 I5 1478 5114 I6 2693 2106 I5 527 9999 18451
> 4.8 4.0 2.91
> 0.18 26.64 16.49 21.44 23.08 73 2 102 4.2
> 9 4482 I5 1010 4126 I6 1953 2033 I5 415 8109 16830
> 5.8 4.6 2.24
> 0.19 22.05 10.98 16.67 18.53 66 97 42 4.3
> 10 2534 I5 864 2419 I6 1859 1396 I5 289 5978 15123
> 6.1 5.3 1.83
> 0.20 16.44 6.92 11.72 13.21 72 213 7 4.4
> 11 1264 I5 716 1059 I6 1733 851 I5 206 4190 13565
> 8.3 8.0 1.33
> 0.21 7.63 0.44 3.94 4.82 77 429 0 5.1
> 12 976 I5 423 826 I6 1172 714 I5 156 3354 12738
> 7.6 7.2 0.98
> 0.22 3.40 -4.20 -0.34 0.21 78 581 0 5.6
> 13 3698 I5 2146 3451 I6 2002 1619 I5 629 7623 16551
> 6.6 5.6 1.78
> 0.15 14.72 5.76 10.26 11.42 71 3347 415 5.0
> 1992
> 1 1149 I5 496 701 I6 919 896 I5 231 3791 13236
> 8.5 8.1 0.84
> 0.24 0.68 -6.20 -2.60 -1.69 79 654 0 5.4
> 2 1580 I5 708 898 I6 1469 1198 I5 255 5328 14708
> 8.2 7.7 0.86
> 0.27 1.26 -6.19 -2.40 -1.56 78 603 0 4.7
> 3 2968 I5 1429 2145 I6 2037 1760 I5 452 7449 16457
> 7.3 6.7 0.97
> 0.29 3.82 -4.35 -0.11 1.01 70 577 0 4.8
> 4 4050 I5 1812 2937 I6 2634 2146 I5 404 9527 18154
> 7.3 6.4 1.41
> 0.29 10.64 2.33 6.40 7.50 71 356 1 4.1
> 5 5654 I5 1935 4311 I6 2843 2695 I5 557 11009 19528
> 5.4 4.3 1.74
> 0.29 19.79 8.17 14.13 15.66 66 148 13 3.8
> 6 6170 I5 2120 4608 I6 3029 2877 I5 695 11617 20176
> 5.3 4.1 2.17
> 0.28 22.76 11.91 17.63 19.06 65 56 26 4.0
> 7 4879 I5 1816 2795 I6 1915 2835 I5 595 11242 19729
> 7.2 6.3 2.99
> 0.27 23.05 15.33 19.23 20.19 75 18 44 4.4
> 8 5168 I5 1720 4473 I6 2922 2256 I5 444 9959 18415
> 5.8 4.9 2.64
> 0.24 23.28 14.62 19.05 20.37 72 26 46 4.4
> 9 4094 I5 1361 3741 I6 2382 1893 I5 375 8058 16789
> 5.8 4.5 2.50
> 0.21 21.25 11.59 16.56 18.13 72 84 27 4.5
> 10 2499 I5 1177 2228 I6 1904 1393 I5 311 5928 15081
> 6.1 5.6 1.45
> 0.18 13.37 4.21 8.81 10.50 70 296 0 4.7
> 11 1134 I5 680 731 I6 1287 849 I5 249 4156 13533
> 8.5 8.1 1.39
> 0.15 7.58 1.22 4.38 5.30 77 418 0 4.8
> 12 1048 I5 508 1136 I6 1428 687 I5 146 3348 12729
> 7.8 7.3 0.96
> 0.14 3.30 -3.43 -0.06 0.83 69 570 0 5.1
> 13 3366 I5 1883 2559 I6 1489 1790 I5 790 7618 16545
> 6.9 6.1 1.66
> 0.24 12.56 4.10 8.42 9.61 72 3806 157 4.6
> ...
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




Received on Thu 06 Mar 2008 - 19:49:16 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 06 Mar 2008 - 20:30:19 GMT.
