[R] Data Mining Competition 2008

From: Xiaogang Su <xiaosu_at_mail.ucf.edu>
Date: Fri, 15 Feb 2008 16:03:50 -0500

Dear Colleagues, please help distribute the following announcement of a data mining competition. Thanks, -XG

Data Mining Competition 2008

Website: http://dms.stat.ucf.edu/competition08/home.htm

Department of Statistics & Actuarial Science
University of Central Florida

ANNOUNCEMENT

The Data Mining program at the University of Central Florida (UCF) is announcing a data mining competition on marketing response analysis in collaboration with BlueCross BlueShield of Florida (BCBSFL). The purpose of this project is to develop a predictive model that can generate a list of potential responders in a future promotion mailing campaign. The response/target variable is binary (0-1), with a value of 1 indicating a response in the previous mail campaign. Most of the explanatory variables or inputs used in this study are from census data, and the rest are from a list data vendor. We have renamed all input variables as X1, X2, ... for data security and privacy concerns.

DATASET DOWNLOADS

The datasets are available in two formats: SAS and comma-separated values (CSV). After registering, please select whichever format is more convenient for you.

Register to Download Dataset

Format   Training file                    Test file
SAS      training.sas7bdat (392.53 MB)    test.sas7bdat (43.89 MB)
CSV      training.csv (257.00 MB)         test.csv (28.55 MB)

PARTICIPATION AND AWARDS

This competition is open to anyone interested. Please review the following rules carefully and contact us with any questions at data.mining.2008_at_gmail.com.

Please build your model using the training data set and use it to obtain a predicted probability of response for each individual in the test sample. Two deliverables must be submitted by 5:00 pm (Eastern Time) on 3/31/2008 in order to participate in the contest.

— A data set with two columns: one containing the ID and the other your predicted probability of response (not a 0-1 predicted outcome).

— A one-page write-up that contains your contact information and a brief description of your modeling methods and approaches. The contact information should list the names, titles, academic degrees, affiliations, and locations (city, state, and country, if international) of all authors.
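As a rough illustration of the first deliverable, the sketch below writes a two-column file of IDs and predicted probabilities. The announcement specifies only the two columns, so the header names `ID` and `prob`, the CSV layout, and the helper name `write_submission` are assumptions made for illustration.

```python
import csv

def write_submission(path, ids, probs):
    """Write one row per test individual: ID and predicted probability."""
    if len(ids) != len(probs):
        raise ValueError("ids and probs must have the same length")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ID", "prob"])  # hypothetical header names
        for id_, p in zip(ids, probs):
            # Probabilities are required, not 0-1 predicted outcomes.
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range for ID {id_}: {p}")
            writer.writerow([id_, f"{p:.6f}"])
```

A call such as `write_submission("submission.csv", ids, probs)` would then produce the file; the file name is likewise only a placeholder.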

The top three winners will be selected according to predicted probabilities on the test sample data. All participants will be ranked using the following two specific model performance measures.

— Criterion 1: area under the receiver operating characteristic (ROC) curve.

— Criterion 2: percentage of responders caught among the 10,000 individuals with the highest predicted response probabilities.
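For participants who want to check a model on held-out training data, the two criteria can be sketched in plain Python as follows. This is not the organizers' scoring code. The function names are hypothetical, and Criterion 2 is read here as the share of all responders that fall within the top k, which is an assumption; reading it as the response rate within the top k would order participants identically, since both denominators are fixed. Ties at the top-k cutoff are broken by input order, also an assumption.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one responder and one non-responder")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def capture_rate(y_true, scores, k=10000):
    """Share of all responders found among the k highest-scored individuals."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    total = sum(y_true)
    if total == 0:
        raise ValueError("no responders in y_true")
    return sum(y_true[i] for i in top) / total
```

Both functions take the 0-1 response values and the predicted probabilities as equal-length lists.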

The final ranking will then be based on the sum of these two separate ranks. In the case of a tie (e.g., Tom ranks No. 1 on Criterion 1 and No. 3 on Criterion 2, while Jerry ranks No. 2 on both criteria), the participant with the higher rank on Criterion 1 (i.e., Tom) wins.
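The rank-sum rule and its tie-breaker can be sketched as a sort on a two-part key. The function name `final_ranking` and the dictionary inputs are illustrative assumptions, not part of the announcement.

```python
def final_ranking(crit1_rank, crit2_rank):
    """Order participants by rank sum; break ties by the Criterion 1 rank."""
    names = list(crit1_rank)
    return sorted(
        names,
        key=lambda n: (crit1_rank[n] + crit2_rank[n], crit1_rank[n]),
    )

# The example from the rules: both rank sums equal 4,
# so the better Criterion 1 rank decides.
crit1 = {"Tom": 1, "Jerry": 2}
crit2 = {"Tom": 3, "Jerry": 2}
print(final_ranking(crit1, crit2))  # Tom is listed ahead of Jerry
```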

Cash prizes sponsored by BCBSFL will be awarded: $1,000 to the best performer, $500 to the second, and $250 to the third. The three winning individuals or teams will also be invited to present their results at the Fourth Annual Business Intelligence Symposium in Orlando, FL on April 11, 2008. Award plates will be presented to the winners during the symposium. The work can be completed by an individual or a group, but for a winning team only one individual will be invited to present the work at the Symposium.


February 08, 2008       Competition Announced
March 31, 2008          Submissions for Competition by 5:00 pm (Eastern Time)
April 02, 2008          Announcement of Winners
April 11-12, 2008       Fourth Annual Business Intelligence Symposium in Orlando, FL

Xiaogang Su, Ph.D.
Associate Professor / Undergraduate Coordinator
Department of Statistics and Actuarial Science
University of Central Florida
Orlando, FL 32816
(407) 823-2940 [O]


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 15 Feb 2008 - 21:09:22 GMT

