Re: [R] Bootstrap 95% confidence intervals for splines

From: Tim Hesterberg <timhesterberg_at_gmail.com>
Date: Sun, 27 Mar 2011 08:06:48 -0700

You're mixing up two concepts here.

First, to do a bootstrap confidence interval for a difference in predictions in the linear regression case, do:

repeat 10^4 times
  draw a bootstrap sample of the observations (subjects, keeping x & y together)
  fit the linear regression to the bootstrap sample
  record the difference in predictions at the two x values
end loop

The bootstrap confidence interval is the range of the middle 95% of the recorded differences.
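
As a rough illustration, the loop above might look like this in R. This is only a sketch: the data are simulated, and the names dat, newx, B and diffs are made up for the example. It uses the HD ~ WL formula and the 80 cm and 100 cm values from the question quoted below.

```r
# Simulated data, purely for illustration (an assumption, not from the post)
set.seed(1)
n   <- 200
WL  <- rnorm(n, mean = 90, sd = 12)        # waistline in cm
HD  <- 2 * WL - 100 + rnorm(n, sd = 10)    # hot dogs per week
dat <- data.frame(WL = WL, HD = HD)

newx  <- data.frame(WL = c(80, 100))       # the two x values to compare
B     <- 2000   # the pseudocode uses 10^4; smaller here so the sketch runs quickly
diffs <- numeric(B)

for (b in seq_len(B)) {
  idx  <- sample(nrow(dat), replace = TRUE)   # resample whole rows: x and y stay together
  fit  <- lm(HD ~ WL, data = dat[idx, ])      # refit on the bootstrap sample
  pred <- predict(fit, newdata = newx)
  diffs[b] <- pred[2] - pred[1]               # difference in predictions, 100 cm vs 80 cm
}

# Middle 95% of the recorded differences
ci <- quantile(diffs, c(0.025, 0.975))
print(ci)
```

Resampling row indices, rather than x and y separately, is what keeps each subject's x and y together.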

For a spline, the procedure is the same except for fitting a spline regression:

repeat 10^4 times
  draw a bootstrap sample of the observations (subjects, keeping x & y together)
  fit the SPLINE regression to the bootstrap sample
  record the difference in predictions at the two x values
end loop

The bootstrap confidence interval is the range of the middle 95% of the recorded differences.
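
A possible R version of the spline case, again only a sketch on simulated data. The curved relationship, the sample size, and the names dat, newx, B and diffs are assumptions for the example; ns(WL, df = 4) is taken from the question quoted below.

```r
library(splines)   # provides ns(), the natural spline basis

# Simulated data with a curved relationship (an assumption, not from the post)
set.seed(2)
n   <- 200
WL  <- runif(n, min = 60, max = 120)            # waistline in cm
HD  <- 40 * log(WL) - 150 + rnorm(n, sd = 5)    # hot dogs per week
dat <- data.frame(WL = WL, HD = HD)

newx  <- data.frame(WL = c(80, 100))            # the two x values to compare
B     <- 2000   # the pseudocode uses 10^4; smaller here so the sketch runs quickly
diffs <- numeric(B)

for (b in seq_len(B)) {
  idx  <- sample(nrow(dat), replace = TRUE)          # resample whole rows
  fit  <- lm(HD ~ ns(WL, df = 4), data = dat[idx, ]) # spline regression on the bootstrap sample
  pred <- predict(fit, newdata = newx)               # uses the knots chosen for this fit
  diffs[b] <- pred[2] - pred[1]
}

# Point estimate from the original data, and the middle 95% of the bootstrap differences
fit0 <- lm(HD ~ ns(WL, df = 4), data = dat)
est  <- unname(diff(predict(fit0, newdata = newx)))
ci   <- quantile(diffs, c(0.025, 0.975))
print(est)
print(ci)
```

Note that ns() with df = 4 picks its knots from the quantiles of each bootstrap sample, so the knot placement is re-estimated in every replicate along with the coefficients.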

Tim Hesterberg

P.S. I think you're mixing up the response and explanatory variables. I'd think of eating hot dogs as the cause (explanatory variable), and waistline as the effect (response, or outcome).

P.P.S. I don't like the terms "independent" and "dependent" variables, as that conflicts with the concept of independence in probability. "Independent" variables are generally not independent, and the "dependent" variable may be independent of the others.

>There appear to be reports in the literature that transform continuous
>independent variables by the use of splines, e.g., assume the dependent
>variable is hot dogs eaten per week (HD) and the independent variable is
>waistline (WL), a normal linear regression model would be:
>
>nonconfusing_regression <- lm(HD ~ WL)
>
>One might use a spline,
>
>confusion_inducing_regression_with_spline <- lm(HD ~ ns(WL, df = 4) )
>
>Now is where the problem starts.
>
>From nonconfusing_regression, I get, say 2 added hot dogs per week for each
>centimeter of waistline along with a s.e. of 0.5 hot dogs per week, which I
>multiply by 1.96 to garner each side of the 95% c.i.
>If I want to show what the difference between the 75th percentile (say 100
>cm) and 25th percentile (say 80 cm) waistlines are, I multiply 2 by
>100-80=20 and get 40 hot dogs per week as the point estimate with a similar
>bumping of the s.e. to 10 hot dogs per week.
>
>What do I do to get the point estimate and 95% confidence interval for the
>difference between 100 cm persons and 80 cm persons with
>confusion_inducing_regression_with_spline ?
>
>Best regards.
>
>Mitchell S. Wachtel, MD



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 27 Mar 2011 - 15:11:51 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 28 Mar 2011 - 12:50:24 GMT.