# Re: [R] missing values imputation

From: Prof Brian Ripley (ripley@stats.ox.ac.uk)
Date: Thu 13 May 2004 - 03:15:21 EST

Message-id: <Pine.LNX.4.44.0405121812350.17465-100000@gannet.stats>

That's not an algorithm. It is a recipe for deriving an algorithm.

algorithm - A detailed sequence of actions to perform to accomplish some
task. Named after an Iranian mathematician, Al-Khawarizmi.

Technically, an algorithm must reach a result after a finite number of
steps, thus ruling out brute force search methods for certain problems,
though some might claim that brute force search was also a valid (generic)
algorithm. The term is also used loosely for any sequence of actions
(which may or may not terminate).

Paul E. Black's Dictionary of Algorithms, Data Structures, and Problems.

On Wed, 12 May 2004 Ted.Harding@nessie.mcc.ac.uk wrote:

> On 12-May-04 Rolf Turner wrote:
> > Anne Piotet wrote:
> >
> >> What R functionalities are there to do missing values imputation
> >> (substantial proportion of missing data)? I would prefer to use
> >> maximum likelihood methods; is the EM algorithm implemented? In
> >> which package?
> >
> > The so-called ``EM algorithm'' is ***NOT*** an
> > algorithm. It is a methodology or a unifying concept.
> > It would be impossible to ``implement'' it. (Except
> > possibly by means of some extremely advanced and
> > sophisticated Artificial Intelligence software.)
>
> Do we understand the same thing by "EM Algorithm"?
>
> The one I'm thinking of -- formulated under that name by Dempster,
> Laird and Rubin in 1977 ("Maximum likelihood estimation from incomplete
> data via the EM algorithm", JRSS(B) 39, 1-38) -- is indeed an algorithm
> in exactly the same sense as any iterative search for the maximum of a
> function.
>
> Essentially, in the context of data modelled by an underlying exponential
> family distribution where there is incomplete information about the
> values which have this distribution, it proceeds by
>
> Start: Choose starting estimates for the parameters of the distribution
> E: Using the current parameter values, compute the expected values
> of the sufficient statistics conditional on the observed information
> M: Solve the maximum-likelihood equations (which are functions of the
> sufficient statistics) using the expected values computed in (E)
> If sufficiently converged, stop. Otherwise, make the current parameter
> values equal to the values estimated in (M) and return to (E).
>
> Algorithm, this, or not????
>
> And where does "extremely advanced and sophisticated Artificial
> Intelligence software" come into it? You can, in some cases, perform
> the above EM algorithm by hand.
>
> Which "EM Algorithm" are you thinking of?
>
> Best wishes,
> Ted.
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 167 1972
> Date: 12-May-04 Time: 17:57:53
> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>
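
The Start / E / M / convergence-check loop quoted above can be made concrete for one simple case. What follows is only a minimal sketch, not code from the thread or from any R package: it assumes a bivariate normal sample in which x is always observed and some y values are missing (marked `None`), with the missingness ignorable. The function name `em_bivariate_normal`, the tolerance, and the iteration cap are all illustrative assumptions.

```python
def em_bivariate_normal(x, y, tol=1e-10, max_iter=1000):
    """EM sketch for a bivariate normal where some y values are None.

    x must be fully observed. Returns
    (mu_x, mu_y, s_xx, s_yy, s_xy, iterations_used).
    """
    n = len(x)
    observed_y = [yi for yi in y if yi is not None]

    # Start: crude estimates. Because x is fully observed, its ML
    # estimates never change, so they are computed once here.
    mu_x = sum(x) / n
    s_xx = sum((xi - mu_x) ** 2 for xi in x) / n
    mu_y = sum(observed_y) / len(observed_y)
    s_yy = sum((yi - mu_y) ** 2 for yi in observed_y) / len(observed_y)
    s_xy = 0.0

    for iteration in range(1, max_iter + 1):
        # E step: expected values of the sufficient statistics
        # sum(y), sum(y^2), sum(x*y), given x and the current parameters.
        beta = s_xy / s_xx                 # slope of E[y | x]
        resid_var = s_yy - beta * s_xy     # Var[y | x]
        t_y = t_yy = t_xy = 0.0
        for xi, yi in zip(x, y):
            if yi is None:
                ey = mu_y + beta * (xi - mu_x)
                eyy = ey * ey + resid_var  # E[y^2 | x] = E[y | x]^2 + Var[y | x]
            else:
                ey, eyy = yi, yi * yi
            t_y += ey
            t_yy += eyy
            t_xy += xi * ey

        # M step: solve the complete-data likelihood equations using
        # the expected sufficient statistics from the E step.
        new_mu_y = t_y / n
        new_s_yy = t_yy / n - new_mu_y ** 2
        new_s_xy = t_xy / n - mu_x * new_mu_y

        # Convergence check: largest change in any updated parameter.
        change = max(abs(new_mu_y - mu_y),
                     abs(new_s_yy - s_yy),
                     abs(new_s_xy - s_xy))
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:
            break

    return mu_x, mu_y, s_xx, s_yy, s_xy, iteration
```

A call such as `em_bivariate_normal([1, 2, 3, 4], [1.1, None, 2.9, 4.2])` returns the fitted means, variances and covariance together with the number of iterations used. Stopping on the parameter change is one possible choice; monitoring the observed-data log-likelihood would be another.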

```
--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
```

This archive was generated by hypermail 2.1.3 : Mon 31 May 2004 - 23:05:09 EST