[R] Need expert help with model.matrix

From: Axel Urbiz <axel.urbiz_at_gmail.com>
Date: Wed, 18 May 2011 07:15:33 -0400

Dear experts:

Is it possible to create a new function based on stats:::model.matrix.default so that an alternative factor coding is used when the function is called instead of the default factor coding?

Basically, I'd like to reproduce the results in 'mat' below, without having to explicitly specify my desired factor coding (identity matrices) in the 'contrasts.arg'.

dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
ca <- contrasts(dd$a, contrasts= FALSE)  # 3 x 3 identity matrix
cb <- contrasts(dd$b, contrasts= FALSE)  # 4 x 4 identity matrix
mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a=ca, b=cb))

My approach was to modify the code in model.matrix by explicitly setting the contrasts argument in the contr.identity and contrasts function to FALSE. This is shown at the bottom of the email in the function model.matrix2:

contr.identity <- contr.treatment
formals(contr.identity)$contrasts <- FALSE

contrasts <- contrasts
formals(contrasts)$contrasts <- FALSE

However, I believe this function is using contrasts = TRUE, as it doesn't return the identity contrasts
mat2 <- model.matrix2(~ a + b, dd)

Any help here is much appreciated.

model.matrix2 <-

function (object, data = environment(object), contrasts.arg = NULL,

    xlev = NULL, ...)

    t <- if (missing(data))

    else terms(object, data = data)
    if (is.null(attr(data, "terms")))

        data <- model.frame(object, data, xlev = xlev)     else {

        reorder <- match(sapply(attr(t, "variables"), deparse,
            width.cutoff = 500)[-1L], names(data))
        if (any(is.na(reorder)))
            stop("model frame and formula mismatch in model.matrix()")
        if (!identical(reorder, seq_len(ncol(data))))
            data <- data[, reorder, drop = FALSE]


    int <- attr(t, "response")

    contr.identity <- contr.treatment
    formals(contr.identity)$contrasts <- FALSE

    contrasts <- contrasts
    formals(contrasts)$contrasts <- FALSE

    if (length(data)) {

        contr.funs <-  c('contr.identity', 'contr.poly')
        namD <- names(data)
        for (i in namD) if (is.character(data[[i]])) {
            data[[i]] <- factor(data[[i]])
            warning(gettextf("variable '%s' converted to a factor",
                i), domain = NA)
        isF <- sapply(data, function(x) is.factor(x) || is.logical(x))
        isF[int] <- FALSE
        isOF <- sapply(data, is.ordered)
        for (nn in namD[isF]) if (is.null(attr(data[[nn]], "contrasts")))
            contrasts(data[[nn]]) <- contr.funs[1 + isOF[nn]]
        #    browser()
        if (!is.null(contrasts.arg) && is.list(contrasts.arg)) {
            if (is.null(namC <- names(contrasts.arg)))
                stop("invalid 'contrasts.arg' argument")
            for (nn in namC) {
                if (is.na(ni <- match(nn, namD)))
                  warning(gettextf("variable '%s' is absent, its contrast
will be ignored",
                    nn), domain = NA)
                else {
                  ca <- contrasts.arg[[nn]]
                  if (is.matrix(ca))
                    contrasts(data[[ni]], ncol(ca)) <- ca
                  else contrasts(data[[ni]]) <- contrasts.arg[[nn]]


    else {
        isF <- FALSE
        data <- list(x = rep(0, nrow(data)))


    ans <- .Internal(model.matrix(t, data))     cons <- if (any(isF))

        lapply(data[isF], function(x) attr(x, "contrasts"))     else NULL
    attr(ans, "contrasts") <- cons

        [[alternative HTML version deleted]]

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed 18 May 2011 - 11:18:55 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 18 May 2011 - 13:00:07 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive