August 8–10. Iowa State University, Ames, Iowa
Develop a package useful for the analysis of large data sets.
Your package could augment an existing package like biglm, for example, by providing graphical or numerical diagnostic tools, or by adding support methods for transparently handling data from a database. Alternatively, you could develop a package for fitting linear or generalised linear models to large data sets that takes an entirely different approach.
Deadline: June 30.
You will be expected to submit a complete R package, suitable for upload to CRAN (i.e. it should pass R CMD check). Your package should include, as a vignette, a paper describing your approach, illustrating its use, and explaining how it will scale to data sets larger than memory (or possibly larger than your disk). You should reference the relevant statistical literature.
Don't forget to include the names, affiliations and email addresses of everyone who contributed to the entry in the DESCRIPTION file.
Please submit your entry, via email, to [email protected]. In your email you should include a statement affirming that the submission is entirely your own work and has been completed specifically for the useR 2007 programming competition.
If you have any questions, please contact Hadley Wickham, [email protected].
The judging committee is made up of:
and we will be judging the entries based on the following criteria:
The cash prize will be shared equally among all participants in the winning entry. In the event of a tie for first place, the winnings will first be divided equally among the winning projects and then shared equally among the participants in each project.