Re: [R] Help to check data before putting it in a database

From: Jeff Newmiller <jdnewmil_at_dcn.davis.ca.us>
Date: Tue, 05 Apr 2011 08:36:28 -0700

I would recommend using R to check your input, identify bad records, and load only the data that passes validation. Then go back to some other tool to edit the rejected data, and save, reload, and re-verify the edited data in R. To find non-matching values, you can use merge() with the all.x argument followed by is.na(), or the ! and %in% operators.
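As a rough illustration of both suggestions, here is a minimal sketch. The example rows, the column names (player, position) and the helper objects (known, accepted, rejected, lookup, checked) are made up for illustration and do not come from the original post; only the data.frame1/data.frame2 names follow the question.

```r
# Hypothetical example data: column names and values are assumptions.
data.frame1 <- data.frame(
  player   = c("Silva", "Costa", "Silva"),
  position = c("forward", "midfield", "forward"),
  stringsAsFactors = FALSE
)
data.frame2 <- data.frame(
  player   = c("Costa", "Sliva", "Pereira"),  # "Sliva" is a typo, "Pereira" is new
  position = c("midfield", "forward", "defense"),
  stringsAsFactors = FALSE
)

# Approach 1: %in% flags rows whose player name already exists in data.frame1;
# ! negates the flag to select the rows that need manual checking.
known    <- data.frame2$player %in% data.frame1$player
accepted <- data.frame2[known, ]
rejected <- data.frame2[!known, ]

# Approach 2: merge() with all.x = TRUE keeps every row of data.frame2;
# rows without a match in the lookup table get NA in the 'known' column.
lookup <- data.frame(player = unique(data.frame1$player), known = TRUE,
                     stringsAsFactors = FALSE)
checked <- merge(data.frame2, lookup, by = "player", all.x = TRUE)
rejected_merge <- checked[is.na(checked$known), names(data.frame2)]

# Load only the rows that passed validation.
data.frame1 <- rbind(data.frame1, accepted)

# Rows to be checked and corrected outside R.
print(rejected)
```

The rejected rows could then be written out with write.csv(), corrected in another tool, read back with read.csv(), and run through the same check again. Note that a genuinely new player and a misspelled name both land in rejected, so that list still needs a human decision.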

If you are determined to modify the data in R, then you probably need the tcltk package (R's interface to Tk), the use of which is not really a topic for this forum.



Jeff Newmiller
DCN: <jdnewmil_at_dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)

Sent from my phone. Please excuse my brevity.

"Ulisses.Camargo" <moliterno.camargo_at_gmail.com> wrote:

The example scene: I have a database with stats about each goal scored by my soccer team. This database (a data frame in R) is organized in rows (goals), with a set of columns containing data about each goal (player name, tactical position, etc.). For now, this database will be called "data.frame1". What I need is to feed "data.frame1" with new information about my team's goals. I will call this new information "data.frame2". This set of new goals is organized in the same way as "data.frame1" (same number of columns).

Where help is needed: I need a way to check the player-name column in "data.frame2" before feeding it into "data.frame1". That is, I need to compare the player name on each row of "data.frame2" against the names of players that already exist in a column of "data.frame1". Moreover, I need R to do two main things:

First, the rows of "data.frame2" with player names that already exist in "data.frame1" must be added to "data.frame1".

Second, the rows of "data.frame2" with player names that do not exist in "data.frame1" must be listed in an output to be manually checked and corrected. After this verification, the corrected rows and the rows with new player names must be incorporated into "data.frame1".

What I want is to guarantee that there will be no rows with wrong player names in my database. At the same time, my script must allow new information (new player names) to be added.

Is there somebody who could help me with this?

Thanks for your attention
Best wishes
Ulisses

--
View this message in context: http://r.789695.n4.nabble.com/Help-to-check-data-before-putting-it-in-a-database-tp3428318p3428318.html
Sent from the R help mailing list archive at Nabble.com.
R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.




Received on Tue 05 Apr 2011 - 15:38:49 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 05 Apr 2011 - 15:40:27 GMT.
