# Re: [R] normalised/transformed regressions

From: Greg Snow <Greg.Snow_at_imail.org>
Date: Tue, 22 Jul 2008 09:42:56 -0600

It is possible to write a function to do what you describe, but the real question is why you would want to do that.

It looks like you are trying to force your data to fit a set of assumptions that are not needed. The normality assumption in regression models is that the residuals are normal, or equivalently that the y variable is conditionally normal given the x-values. There is no requirement that the raw y-values come from a normal distribution. Even then, the normality assumption only applies to specific tests and is not needed just to fit the model. The central limit theorem also applies to those tests, so with a reasonable sample size they are still close approximations even when the residuals are not normal.
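
As a small illustration of that distinction, here is a minimal sketch in which the raw y-values are clearly skewed while the residuals are normal. The simulated data, the seed, and the helper name `skew` are assumptions made up for this example; nothing here comes from the original poster's data.

```r
# Simulated data (an assumption for illustration): skewed x, normal errors.
set.seed(42)
n <- 500
x <- rexp(n, rate = 1)               # strongly right-skewed predictor
y <- 1 + 2 * x + rnorm(n, sd = 0.5)  # normal errors, so y given x is normal

fit <- lm(y ~ x)

# Simple moment-based sample skewness (hypothetical helper defined here).
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

skew(y)               # clearly positive: the raw y-values are right-skewed
skew(residuals(fit))  # near zero: the residuals are roughly symmetric
coef(fit)             # close to the true intercept 1 and slope 2
```

A normal Q-Q plot of `residuals(fit)` (for example with `qqnorm()` and `qqline()`) is the more usual check; the point is that it is the residuals, not the raw y, that get examined.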

There is a derivation of the regression equations that assumes that the y variable and all the x's are from a multivariate normal distribution, but transforming all the variables to have marginal normal distributions does not guarantee that they will be multivariate normal. And if you are using the fixed x formulation (both lead to the same set of equations), then there are no assumptions/requirements about the distribution of the x's (other than that the x-values are not all identical).
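
To make the marginal-versus-multivariate point concrete, here is a rough sketch of the ecdf-then-qnorm transform applied to a pair of variables with a non-linear relationship. The function name `normal_scores` and the simulated data are assumptions for illustration only. Ranks divided by n + 1 are used in place of `ecdf(v)(v)`, since the empirical distribution function equals 1 at the largest observation and `qnorm(1)` is `Inf`.

```r
# Normal-scores transform: ranks -> values in (0, 1) -> normal quantiles.
# (Hypothetical helper; rank/(n + 1) keeps the largest value away from 1.)
normal_scores <- function(v) qnorm(rank(v) / (length(v) + 1))

set.seed(1)
n <- 1000
x <- rnorm(n)
y <- x^2 + rnorm(n, sd = 0.1)  # y depends on x, but not linearly

zx <- normal_scores(x)
zy <- normal_scores(y)

# Each transformed variable is marginally normal by construction, yet the
# pair is not bivariate normal: the relationship is still V-shaped.
cor(zx, zy)       # near zero
cor(abs(zx), zy)  # strongly positive

# A linear regression on the transformed scale misses that structure.
summary(lm(zy ~ zx))$r.squared
summary(lm(zy ~ abs(zx)))$r.squared
```

If the transformed pair were bivariate normal, a correlation near zero would imply independence, which is plainly not the case here.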

If you tell us what you are trying to accomplish, we may be able to give better advice than just showing you the way down a potentially wrong path.

```
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow_at_imail.org
(801) 408-8111

> -----Original Message-----

> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of
> tolga.i.uzuner_at_jpmorgan.com
> Sent: Tuesday, July 22, 2008 7:50 AM
> To: r-help_at_r-project.org
> Subject: [R] normalised/transformed regressions
>
> Dear R Users,
>
> Are there any packages in R which carry out a normalisation
> of variables as follows:
> - find the empirical distribution function, using perhaps ecdf
> - use the empirical distribution function to transform the
> variables into a series between 0 and 1
> - use this series to map the variables into the normal
> distribution function, using qnorm
> - perform a regression on the transformed variables, which by
> construction will all be normally distributed
> - return some meaningful statistical test results and even
> better, a function which, given the independent variables,
> returns the dependent variable after inverting through
> the transformed coefficients back into the original space
>
> Thanks in advance,
> Tolga
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

```
Received on Tue 22 Jul 2008 - 15:54:33 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 22 Jul 2008 - 17:31:56 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.