Scientific Program

(Tentative) Program at a Glance

Monday 3 December (Workshops)

  • Morning: Chris Auld (Microsoft), Scaling R in the Cloud with doAzureParallel
  • Afternoon: Peter Baker (University of Queensland), Tools for efficient data analysis workflow and reproducible research
  • Evening: Welcome Reception

Tuesday 4 December (Workshops)

  • Morning: Roger Payne (VSNi), Genstat 19ed masterclass
  • Afternoon: Salvador Gezan (University of Florida), Modelling correlations between observations in agricultural and environmental sciences with ASReml-R

Wednesday 5 December (Conference)

  • Morning: Invited talk by Chris Auld (Microsoft) on data analytics; contributed talks
  • Afternoon: Invited talk by Robin Thompson (Rothamsted Research) on the history of REML over the last 50 years; contributed talks
  • Evening: Poster Session

Thursday 6 December (Conference)

  • Morning: Invited talk by Alison Smith (University of Wollongong) on Design Tableau; contributed talks; invited session on statistical consultancy with Linley Jesson, Ruth Butler, Gabriela Borgognone and Helene Thygesen
  • Afternoon: Excursions
  • Evening: Conference Dinner

Friday 7 December (Conference)

  • Morning: Invited talk by Peter Baker (University of Queensland) on reproducible research; invited talk by Roger Payne (VSNi) on 50 years of Genstat ANOVA
  • Afternoon: Contributed talks; invited talk by Salvador Gezan (University of Florida) on ASReml-R

Morning tea, lunch and afternoon tea are served each day.

Conference Themes

  • Big data analytics
  • Reproducible research
  • History of ANOVA and REML
  • Experimental design
  • Statistical consulting in the biosciences
  • Applied statistics

Abstracts for Invited Talks

Topic: Design Tableau

Design Tableau: An aid to specifying the linear mixed model for a comparative experiment.

By Alison Smith and Brian Cullis. Centre for Bioinformatics and Biometrics, University of Wollongong.

The design and analysis of comparative experiments has changed dramatically in recent times. There has been a move away from textbook designs towards complex, non-orthogonal designs.

At the same time, analysis of variance (ANOVA) techniques have been superseded by the use of linear mixed models (LMM). The latter have the advantage of accommodating non-orthogonality and of allowing more complex variance models, which may be beneficial for improving the efficiency of treatment comparisons or for providing a more plausible structure. However, this flexibility has come at a cost: in the transition from ANOVA to LMM, many practitioners overlook some of the fundamental principles in the analysis of comparative experiments. To address this we have developed “Design Tableau” (DT), a simple but general approach for specifying the LMM for a comparative experiment. The approach is based on the seminal work of John Nelder and Rosemary Bailey. It can accommodate a wide range of experiment types, including multi-environment trials, multi-phase experiments and experiments with longitudinal data. DT comprises a series of straightforward steps aimed at removing the subjectivity in model specification and ensuring that the key principles in the analysis of comparative experiments are taken into consideration.

Topic: Reproducible research

Computing tools for a Don’t Repeat Yourself data analysis workflow and reproducible research

Is there a crisis in reproducible research? Some studies, such as Ioannidis et al. (2009), have estimated that over fifty percent of published papers in some fields of research are not reproducible.

The data analysis cycle starts much earlier than many researchers appreciate: planning the study design and organising the workflow before the first data are collected is important. Complex data analysis projects often consist of many steps that may be repeated any number of times. Like many statistical consultants, I often find myself repeating the same steps when analysing data for different projects. Standardising your approach, reusing statistical software syntax, and writing your own functions or procedures (and even incorporating them into packages when using GNU R) improves efficiency and saves time. So does employing computing tools such as GNU Make to regenerate output when syntax or data files change, Git for version control, R functions or packages for repetitive tasks, and R Markdown for reproducible reporting. In addition to providing tools and strategies for data analysis projects, these Don’t Repeat Yourself (DRY) approaches also aid reproducible research and reporting.

Since the early 90s, I have employed version control systems and Make to manage data analysis projects using GENSTAT, BUGS, SAS, R and other statistical packages. As an early adopter of Sweave and R Markdown for reporting, I have found these approaches invaluable because, unlike the usual cut-and-paste approach, reports are reproducible. My overall strategy will be briefly described and illustrated. For GNU Make pattern rules, preliminary R packages and examples, see https://github.com/petebaker
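As a minimal illustration of the Make-driven workflow described above (the file and script names here are hypothetical, not taken from the speaker's materials), a Makefile can declare which outputs depend on which scripts and data, so that only the affected steps are re-run when something changes:

```make
# Hypothetical sketch of a Make-based analysis workflow.
# Recipe lines must be indented with a literal tab character.

# Rebuild the rendered report whenever the R Markdown source,
# or the cleaned data it reads, changes.
report.html: report.Rmd data/clean.rds
	Rscript -e 'rmarkdown::render("report.Rmd")'

# Re-run the data-cleaning step only when the cleaning script
# or the raw data file changes.
data/clean.rds: clean.R data/raw.csv
	Rscript clean.R

# 'make clean' removes generated files so everything rebuilds from scratch.
.PHONY: clean
clean:
	rm -f report.html data/clean.rds
```

Running `make` after editing only `report.Rmd` re-renders the report without repeating the cleaning step, which is the DRY payoff for long-running analyses.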

Biography:

Peter has worked as a statistical consultant and researcher for thirty years in areas such as agricultural research, Bayesian methods for genetics, and health, medical and epidemiological studies. He is a Senior Lecturer in Biostatistics at the School of Public Health, University of Queensland, where he also acts as a senior statistical collaborator and adviser to several research projects in the Faculty of Medicine.

Topic: ASREML-R

Getting the most out of my mixed model (and especially ASReml): applications in quantitative genetics and breeding


Topic: 50 years of Genstat ANOVA

50 Years of Genstat ANOVA

Graham Wilkinson’s ANOVA algorithm has been a key part of Genstat ever since its early days. It was one of the motivations for the creation of Genstat. It was also why I originally became involved with Genstat, initially to take up responsibility for ANOVA ready for Graham’s departure from Rothamsted in 1974. The algorithm provides a very efficient method of analysis that, even after more than 50 years, is unmatched elsewhere.

Analysis of variance and the associated design of experiments had been a Rothamsted speciality since the establishment of the Statistics Department under Sir Ronald Fisher in 1919. This was not a merely theoretical interest, but was motivated by the many experiments that needed to be designed, and then analysed, for the Rothamsted biologists. Fisher retired to Adelaide and had a strong influence on Graham’s statistical views. The Rothamsted connection was strengthened in 1965, when John Nelder visited the Waite Institute in Adelaide and began his collaboration with Graham. This laid the foundations for Genstat. Work on Genstat began in earnest when John was appointed as head of the Rothamsted Statistics Department in 1968, and Graham joined the Department in 1971.

The original ANOVA algorithm was described by Wilkinson (1970), and its theoretical underpinnings by James & Wilkinson (1971). Payne & Wilkinson (1977) described the more efficient method for determining the structure of the design, which was my first task to get working when I took over. The relationship between the first-order balanced designs that ANOVA analyses and Nelder’s (1965) general balance was explained by Payne & Tobias (1992), together with their algorithm that extended ANOVA to estimate variance components and calculate estimates of treatment effects that combine information from all the strata in the design. Payne (2004) described how to obtain degrees of freedom for these combined effects.

The algorithm involves a sequence of sweeps in which effects are estimated, and then removed, from a working variate. There is also a special sweep, known as a pivot, that projects the working variate into a specific stratum of the design. Matrix inversion is thus required only for the estimation of covariate regression coefficients and, as a result, the algorithm is very efficient in its use of workspace and computing time. Even 50 years on, this remains an important consideration.
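The core sweep idea can be shown in a deliberately simplified sketch (this toy version ignores the efficiency factors, pivots and non-orthogonal structures that Wilkinson's algorithm actually handles): a sweep estimates the effect of each factor level as the mean of the working variate within that level, then subtracts it, so effects are removed one term at a time without any matrix inversion.

```python
import numpy as np

def sweep(y, factor):
    """One simplified ANOVA-style sweep: estimate the effect of each
    level of `factor` as the within-level mean of the working variate
    `y`, then remove (subtract) it, returning the swept variate and
    the estimated effects."""
    y = np.asarray(y, dtype=float).copy()
    effects = {}
    for level in np.unique(factor):
        mask = factor == level
        effects[level] = y[mask].mean()   # estimate the level effect
        y[mask] -= effects[level]         # remove it from the variate
    return y, effects

# Toy working variate: four plots in two blocks
y = [10.0, 12.0, 14.0, 16.0]
block = np.array([1, 1, 2, 2])
swept, block_effects = sweep(y, block)
# block_effects == {1: 11.0, 2: 15.0}; swept == [-1, 1, -1, 1]
```

Successive sweeps over the block and treatment factors of an orthogonal design decompose the data in exactly this elementwise fashion, which is why the method is so economical in workspace and time.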

References

James, A.T. & Wilkinson, G.N. (1971). Factorisation of the residual operator and canonical decomposition of non-orthogonal factors in analysis of variance. Biometrika, 58, 279-294.

Nelder, J.A. (1965). The analysis of randomized experiments with orthogonal block structure. I Block structure and the null analysis of variance. II Treatment structure and the general analysis of variance. Proceedings of the Royal Society, Series A, 283, 147-178.

Payne, R.W. & Wilkinson, G.N. (1977). A general algorithm for analysis of variance. Applied Statistics, 26, 251-260.

Payne, R.W. & Tobias, R.D. (1992). General balance, combination of information and the analysis of covariance. Scandinavian Journal of Statistics, 19, 3-23.

Payne, R.W. (2004). Confidence intervals and tests for contrasts between combined effects in generally balanced designs. COMPSTAT 2004 Proceedings in Computational Statistics, 1629-1636. Physica-Verlag, Heidelberg.

Wilkinson, G.N. (1970). A general recursive algorithm for analysis of variance. Biometrika, 57, 19-46.

Topic: Data analytics

Topic: History of REML over the last 50 years

Topic: Statistical consultancy

Consulting in the real world: Communicating statistics to scientists

The view from the other side: a biologist’s view of communicating statistics

Linley Jesson

Biological research has changed immensely in the last twenty years, and for most researchers handling and interpreting data have become mainstay activities. Asking good questions in biology requires a fundamental understanding of the philosophy of science, of which statistics is a key underpinning. I will outline some fundamental statistical concepts that I argue all biologists should clearly understand, and discuss ways in which we can increase the integration of statistics and biology.

“From cradle to grave”: making an impact from conception to publishing

Ruth Butler

Biological research has statistical aspects from initial conception through all stages to final publication, and sometimes beyond. Good foundations – trial planning and design – are the bedrock for quality science results, and thus require good statistical input from the very beginning. Over my time as a biometrician, I have increasingly been involved at all stages of a project, developing approaches that facilitate this and lead to increases in the efficiency and efficacy of the research through good statistical practice. I will present some approaches that I use, from trial design, to data management, through to presentation of results.

From heaven to hell… and how to find a way back!

Gabriela Borgognone

In many cases, the statistician is considered an integral part of the research team, who helps define clear objectives and ensures that correct statistical procedures are used and sound inferences are made. In many other cases, the statistician is seen as an outsider, a service provider, or someone that simply crunches the numbers at the end. Then, the challenge for the statistician is to slowly educate the researchers on the importance of statistics in science, earn their trust, and skillfully become an integral part of their projects.

Statistical inference and management decisions

Helene Thygesen

When working for a bookmaker, the role of statistical inference in the management decision process is fairly straightforward: we are asked to identify the strategy that maximizes expected profits, perhaps taking risk aversion into account. Identifying similar objectives when working for government or health care organizations can feel like opening a can of worms: different stakeholders will have different objectives, and there will be legal or cultural barriers to non-standard statistical approaches. Nevertheless, I will argue that one should generally make an effort to operationalize management objectives. In this talk I will give examples from biosecurity, health care, criminal justice and wildlife protection to illustrate how the statistician can adapt the type of statistical inference to the needs of different sectors.