Scientific Program
(Tentative) Program at a Glance
Monday 3 – Friday 7 December

Morning sessions
- Chris Auld: Scaling R in the Cloud with doAzureParallel
- Roger Payne: Genstat 19ed masterclass
- Invited talk – Chris Auld. Topic: Data analytics
- Invited talk – Alison Smith. Topic: Design Tableau
- Invited talk – Peter Baker. Topic: Reproducible research
- Contributed Talks

Morning tea

- Contributed Talks
- Invited Session – Linley Jesson, Ruth Butler, Gabriela Borgognone and Helene Thygesen. Topic: Statistical consultancy
- Invited Talk – Roger Payne. Topic: 50 years of Genstat ANOVA

Lunch

Afternoon sessions
- Peter Baker: Tools for efficient
- Salvador Gezan: Modelling correlations between observations in agricultural and environmental sciences with ASReml-R
- Invited talk – Robin Thompson. Topic: History of REML
- Contributed Talks

Afternoon tea

- Contributed Talks
- Invited Talk – Salvador Gezan. Topic: ASReml-R

Social Events
- Welcome Reception
- Poster Session
Conference Themes
- Big data analytics
- Reproducible research
- History of ANOVA and REML
- Experimental design
- Statistical consulting in the biosciences
- Applied statistics
Abstracts for Invited Talks
- Alison Smith
- Peter Baker
- Salvador Gezan
- Roger Payne
- Chris Auld
- Robin Thompson
- Linley Jesson, Ruth Butler, Gabriela Borgognone, Helene Thygesen
Topic: Design Tableau
Design Tableau: An aid to specifying the linear mixed model for a comparative experiment.
By Alison Smith and Brian Cullis. Centre for Bioinformatics and Biometrics, University of Wollongong.
The design and analysis of comparative experiments has changed dramatically in recent times. There has been a move away from textbook designs towards complex, non-orthogonal designs.
At the same time, analysis of variance (ANOVA) techniques have been superseded by the use of linear mixed models (LMM). The latter have the advantage of accommodating non-orthogonality and of allowing more complex variance models, which may be beneficial for improving the efficiency of treatment comparisons or for providing a more plausible structure. However, this flexibility has come at a cost, since in the transition from ANOVA to LMM, many practitioners overlook some of the fundamental principles in the analysis of comparative experiments. In order to address this we have developed “Design Tableau” (DT), which is a simple but general approach for specifying the LMM for a comparative experiment. The approach is based on the seminal work of John Nelder and Rosemary Bailey. It can accommodate a wide range of experiment types, including multi-environment trials, multi-phase experiments and experiments with longitudinal data. DT comprises a series of straightforward steps aimed at removing the subjectivity in model specification and ensuring that the key principles in the analysis of comparative experiments are taken into consideration.
Topic: Reproducible research
Computing tools for a Don’t Repeat Yourself data analysis workflow and reproducible research
Is there a crisis in reproducible research? Some studies, such as Ioannidis et al. (2009), have estimated that over fifty percent of published papers in some fields of research are not reproducible.
The data analysis cycle starts a lot earlier than many researchers appreciate: planning study design and organising workflow before the first data are collected is important. Complex data analysis projects often consist of many steps that may be repeated any number of times. Like many statistical consultants, I have often found myself repeating the same steps when analysing data for different projects. Standardising your approach, reusing statistical software syntax, and writing your own functions or procedures (and even incorporating them into R packages when using GNU R) improves efficiency and saves time. So does employing computing tools such as GNU Make to regenerate output when syntax or data files change, Git for version control, R functions or packages for repetitive tasks, and R Markdown for reproducible reporting. In addition to providing tools and strategies for data analysis projects, these Don’t Repeat Yourself (DRY) approaches also aid reproducible research and reporting.
Since the early 90s, I’ve employed version control systems and Make to manage data analysis projects using GENSTAT, BUGS, SAS, R and other statistical packages. Also, as an early adopter of Sweave and R Markdown for reporting, I have found these approaches invaluable because, unlike the usual cut-and-paste approach, reports are reproducible. My overall strategy will be briefly described and illustrated. For GNU Make pattern rules, preliminary R packages and examples, see https://github.com/petebaker
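As a minimal sketch of the Make-based approach described above (the file names `report.Rmd` and `data/survey.csv` are hypothetical, not taken from the author's repository), a pattern rule can regenerate a report whenever its R Markdown source or a data file it depends on changes:

```make
# Hypothetical pattern rule: build any HTML report from its .Rmd source.
# "$<" is Make's automatic variable for the first prerequisite.
%.html: %.Rmd
	Rscript -e 'rmarkdown::render("$<")'

# Declare extra prerequisites so the report is also rebuilt
# when the (hypothetical) data file changes.
report.html: report.Rmd data/survey.csv
```

Running `make report.html` then re-renders the report only when `report.Rmd` or `data/survey.csv` is newer than the existing output, which is the DRY behaviour the abstract describes.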
Biography:
Peter has worked as a statistical consultant and researcher in areas such as agricultural research, Bayesian methods for genetics, health, medical and epidemiological studies for thirty years. He is a Senior Lecturer in Biostatistics at the School of Public Health, University of Queensland, where he also acts as a senior statistical collaborator and adviser to several research projects in the Faculty of Medicine.
Topic: ASReml-R
Getting the most out of my mixed model (and especially ASReml): applications in quantitative genetics and breeding
Topic: 50 years of Genstat ANOVA
50 Years of Genstat ANOVA
Graham Wilkinson’s ANOVA algorithm has been a key part of Genstat ever since its early days. It was one of the motivations for the creation of Genstat. It was also the reason why I myself originally became involved with Genstat – initially to take up the responsibility for ANOVA ready for Graham’s departure from Rothamsted in 1974. The algorithm provides a very efficient method of analysis that, even after more than 50 years, is unmatched elsewhere.
Analysis of variance and the associated design of experiments had been a Rothamsted speciality since the establishment of the Statistics Department under Sir Ronald Fisher in 1919. This was not a merely theoretical interest, but was motivated by the many experiments that needed to be designed, and then analysed, for the Rothamsted biologists. Fisher retired to Adelaide, and had a strong influence on Graham’s statistical views. The Rothamsted connection was strengthened in 1965, when John Nelder visited the Waite Institute in Adelaide, and began his collaboration with Graham. This laid the foundations for Genstat. Work on Genstat began in earnest when John was appointed as head of the Rothamsted Statistics Department in 1968, and Graham joined the Department in 1971.
The original ANOVA algorithm was described by Wilkinson (1970), and its theoretical underpinnings by James & Wilkinson (1971). Payne & Wilkinson (1977) described the more efficient method for determining the structure of the design that was my first task to get working when I took over. The relationship between the first-order balanced designs that ANOVA analyses and Nelder’s (1965) general balance was explained by Payne & Tobias (1992), together with their algorithm that extended ANOVA to estimate variance components and calculate estimates of treatment effects that combine information from all the strata in the design. Payne (2004) described how to obtain degrees of freedom for these combined effects.
The algorithm involves a sequence of sweeps in which effects are estimated, and then removed, from a working variate. There is also a special sweep, known as a pivot, that projects the working variate into a specific stratum of the design. Matrix inversion is thus required only for the estimation of covariate regression coefficients and, as a result, the algorithm is very efficient in its use of workspace and computing time. Even 50 years on, this remains an important consideration.
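The sweep operation can be illustrated with a small sketch, a simplified Python illustration for an orthogonal one-way design rather than the Genstat implementation: each sweep estimates the effects of a factor as means of the working variate at that factor's levels, then subtracts them, so no matrix inversion is needed.

```python
import numpy as np

def sweep(work, factor):
    """One illustrative ANOVA sweep: estimate effects as level means
    of the working variate, then remove them from it."""
    effects = {}
    out = work.copy()
    for lev in np.unique(factor):
        mask = factor == lev
        effects[lev] = out[mask].mean()   # estimate effect for this level
        out[mask] -= effects[lev]         # remove it from the working variate
    return out, effects

# Hypothetical orthogonal one-way design: two treatments, three replicates.
y = np.array([10.0, 12.0, 11.0, 15.0, 17.0, 16.0])
treat = np.array([0, 0, 0, 1, 1, 1])

work, mu = sweep(y, np.zeros(6, dtype=int))   # sweep out the grand mean
resid, effects = sweep(work, treat)           # sweep out treatment effects
```

After the two sweeps, `resid` holds the residuals, from which the stratum sums of squares of the analysis of variance can be accumulated.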
References
James, A.T. & Wilkinson, G.N. (1971). Factorisation of the residual operator and canonical decomposition of non-orthogonal factors in analysis of variance. Biometrika, 58, 279-294.
Nelder, J.A. (1965). The analysis of randomized experiments with orthogonal block structure. I. Block structure and the null analysis of variance. II. Treatment structure and the general analysis of variance. Proceedings of the Royal Society, Series A, 283, 147-178.
Payne, R.W. & Wilkinson, G.N. (1977). A general algorithm for analysis of variance. Applied Statistics, 26, 251-260.
Payne, R.W. & Tobias, R.D. (1992). General balance, combination of information and the analysis of covariance. Scandinavian Journal of Statistics, 19, 3-23.
Payne, R.W. (2004). Confidence intervals and tests for contrasts between combined effects in generally balanced designs. COMPSTAT 2004 Proceedings in Computational Statistics, 1629-1636. Physica-Verlag, Heidelberg.
Wilkinson, G.N. (1970). A general recursive algorithm for analysis of variance. Biometrika, 57, 19-46.
Topic: Data analytics
Topic: History of REML over the last 50 years
Topic: Statistical consultancy
Consulting in the real world: Communicating statistics to scientists
The view from the other side: a biologist’s view of communicating statistics
Linley Jesson
Biological research has changed immensely in the last twenty years, and for most researchers handling and interpreting data have become mainstay activities. Asking good questions in biology requires a fundamental understanding of the philosophy of science, of which statistics is the key underpinning. I will outline some fundamental statistical concepts that I argue all biologists should have a clear understanding of, and discuss ways in which we can increase the integration of statistics and biology.
“From cradle to grave”: making an impact from conception to publishing
Ruth Butler
Biological research has statistical aspects from initial conception through all stages to final publication and sometimes beyond. Good foundations – trial planning and design – are the bedrock for quality science results, thus requiring good statistical input from the very beginning. Over my time as a biometrician, I have increasingly been involved at all stages of a project, developing approaches that facilitate that, leading to increases in efficiency and efficacy of the research through good statistical practice. I will present some approaches that I use, from trial design, to data management, through to presentation of results.
From heaven to hell… and how to find a way back!
Gabriela Borgognone
In many cases, the statistician is considered an integral part of the research team, who helps define clear objectives and ensures that correct statistical procedures are used and sound inferences are made. In many other cases, the statistician is seen as an outsider, a service provider, or someone who simply crunches the numbers at the end. Then, the challenge for the statistician is to slowly educate the researchers on the importance of statistics in science, earn their trust, and skillfully become an integral part of their projects.
Statistical inference and management decisions
Helene Thygesen
When working for a bookmaker, the role of statistical inference in the management decision process is fairly straightforward: we are asked to identify the strategy that maximizes expected profits, perhaps taking risk aversion into account. Identifying similar objectives when working for government or health care organizations can feel like opening a can of worms – different stakeholders will have different objectives, and there will be legal or cultural barriers to non-standard statistical approaches. Nevertheless, I will argue that one should generally make an effort to operationalize management objectives. In this talk I will give examples from biosecurity, health care, criminal justice and wildlife protection to illustrate how the statistician can adapt the type of statistical inference to the needs of different sectors.