Scientific Program

(Tentative) Program at a Glance

Monday 3 December (Workshops)

  • Morning: Chris Auld (Microsoft), Scaling R in the Cloud with doAzureParallel
  • Afternoon: Peter Baker (University of Queensland), Tools for efficient data analysis workflow and reproducible research
  • Evening: Welcome Reception

Tuesday 4 December (Workshops)

  • Morning: Roger Payne (VSNi), Genstat 19ed masterclass
  • Afternoon: Salvador Gezan (University of Florida), Modelling correlations between observations in agricultural and environmental sciences with ASReml-R

Wednesday 5 December (Conference)

  • Morning: Invited talk by Chris Auld (Microsoft) on data analytics; contributed talks
  • Afternoon: Invited talk by Robin Thompson (Rothamsted Research) on the history of REML over the last 50 years; contributed talks
  • Evening: Poster Session

Thursday 6 December (Conference)

  • Morning: Invited talk by Alison Smith (University of Wollongong) on Design Tableau; contributed talks; invited session on statistical consultancy with Linley Jesson, Ruth Butler, Gabriela Borgognone and Helene Thygesen
  • Afternoon: Excursions
  • Evening: Conference Dinner

Friday 7 December (Conference)

  • Morning: Invited talk by Peter Baker (University of Queensland) on reproducible research; invited talk by Roger Payne (VSNi) on 50 years of Genstat ANOVA
  • Afternoon: Contributed talks; invited talk by Salvador Gezan (University of Florida) on ASReml-R

Morning tea, lunch and afternoon tea are served each day.

Conference Themes

  • Big data analytics
  • Reproducible research
  • History of ANOVA and REML
  • Experimental design
  • Statistical consulting in the biosciences
  • Applied statistics

Abstracts for Invited Talks

Topic: Design Tableau

Design Tableau: An aid to specifying the linear mixed model for a comparative experiment.

By Alison Smith and Brian Cullis. Centre for Bioinformatics and Biometrics, University of Wollongong.

The design and analysis of comparative experiments has changed dramatically in recent times. There has been a move away from textbook designs towards complex, non-orthogonal designs.

At the same time, analysis of variance (ANOVA) techniques have been superseded by the use of linear mixed models (LMM). The latter have the advantage of accommodating non-orthogonality and of allowing more complex variance models, which may be beneficial for improving the efficiency of treatment comparisons or for providing a more plausible structure. However, this flexibility has come at a cost: in the transition from ANOVA to LMM, many practitioners overlook some of the fundamental principles in the analysis of comparative experiments. To address this we have developed “Design Tableau” (DT), a simple but general approach for specifying the LMM for a comparative experiment. The approach is based on the seminal work of John Nelder and Rosemary Bailey. It can accommodate a wide range of experiment types, including multi-environment trials, multi-phase experiments and experiments with longitudinal data. DT comprises a series of straightforward steps aimed at removing the subjectivity in model specification and ensuring that the key principles in the analysis of comparative experiments are taken into consideration.

Topic: Reproducible research

Computing tools for a Don’t Repeat Yourself data analysis workflow and reproducible research

Is there a crisis in reproducible research? Some studies, such as Ioannidis et al. (2009), have estimated that over fifty percent of published papers in some fields of research are not reproducible.

The data analysis cycle starts much earlier than many researchers appreciate: planning the study design and organising the workflow before the first data are collected is important. Complex data analysis projects often consist of many steps that may be repeated any number of times. Like many statistical consultants, I often find myself repeating the same steps when analysing data for different projects. Standardising your approach, reusing statistical software syntax, and writing your own functions or procedures (and even incorporating them into packages when using GNU R) improves efficiency and saves time. So does employing computing tools such as GNU Make to regenerate output when syntax or data files change, Git for version control, R functions or packages for repetitive tasks, and R Markdown for reproducible reporting. In addition to providing tools and strategies for data analysis projects, these Don’t Repeat Yourself (DRY) approaches also aid reproducible research and reporting.

Since the early 90s, I have employed version control systems and Make to manage data analysis projects using GENSTAT, BUGS, SAS, R and other statistical packages. As an early adopter of Sweave and R Markdown for reporting, I have found these approaches invaluable because, unlike the usual cut-and-paste approach, reports are reproducible. My overall strategy will be briefly described and illustrated. For GNU Make pattern rules, preliminary R packages and examples, see https://github.com/petebaker
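As a minimal illustration of the Make-driven workflow described above (the file and script names here are hypothetical, not taken from the speaker's materials), a Makefile can declare which outputs depend on which scripts and data, so that only the affected steps are re-run when something changes:

```make
# Hypothetical sketch of a Make-based analysis workflow.
# Recipe lines must be indented with a literal tab character.

# Rebuild the rendered report whenever the R Markdown source,
# or the cleaned data it reads, changes.
report.html: report.Rmd data/clean.rds
	Rscript -e 'rmarkdown::render("report.Rmd")'

# Re-run the data-cleaning step only when the cleaning script
# or the raw data file changes.
data/clean.rds: clean.R data/raw.csv
	Rscript clean.R

# 'make clean' removes generated files so everything rebuilds from scratch.
.PHONY: clean
clean:
	rm -f report.html data/clean.rds
```

Running `make` after editing only `report.Rmd` re-renders the report without repeating the cleaning step, which is the DRY payoff for long-running analyses.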

Biography:

Peter has worked as a statistical consultant and researcher for thirty years in areas such as agricultural research, Bayesian methods for genetics, and health, medical and epidemiological studies. He is a Senior Lecturer in Biostatistics at the School of Public Health, University of Queensland, where he also acts as a senior statistical collaborator and adviser to several research projects in the Faculty of Medicine.

Topic: ASREML-R

Getting the most out of my mixed model (and especially ASReml): applications in quantitative genetics and breeding


Topic: 50 years of Genstat ANOVA

50 Years of Genstat ANOVA

Graham Wilkinson’s ANOVA algorithm has been a key part of Genstat ever since its early days. It was one of the motivations for the creation of Genstat. It was also why I originally became involved with Genstat, initially to take up responsibility for ANOVA ready for Graham’s departure from Rothamsted in 1974. The algorithm provides a very efficient method of analysis that, even after more than 50 years, is unmatched elsewhere.

Analysis of variance and the associated design of experiments had been a Rothamsted speciality since the establishment of the Statistics Department under Sir Ronald Fisher in 1919. This was not a merely theoretical interest, but was motivated by the many experiments that needed to be designed, and then analysed, for the Rothamsted biologists. Fisher retired to Adelaide and had a strong influence on Graham’s statistical views. The Rothamsted connection was strengthened in 1965, when John Nelder visited the Waite Institute in Adelaide and began his collaboration with Graham. This laid the foundations for Genstat. Work on Genstat began in earnest when John was appointed as head of the Rothamsted Statistics Department in 1968, and Graham joined the Department in 1971.

The original ANOVA algorithm was described by Wilkinson (1970), and its theoretical underpinnings by James & Wilkinson (1971). Payne & Wilkinson (1977) described the more efficient method for determining the structure of the design, which was my first task to get working when I took over. The relationship between the first-order balanced designs that ANOVA analyses and Nelder’s (1965) general balance was explained by Payne & Tobias (1992), together with their algorithm that extended ANOVA to estimate variance components and calculate estimates of treatment effects that combine information from all the strata in the design. Payne (2004) described how to obtain degrees of freedom for these combined effects.

The algorithm involves a sequence of sweeps in which effects are estimated, and then removed, from a working variate. There is also a special sweep, known as a pivot, that projects the working variate into a specific stratum of the design. Matrix inversion is thus required only for the estimation of covariate regression coefficients and, as a result, the algorithm is very efficient in its use of workspace and computing time. Even 50 years on, this remains an important consideration.
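The core sweep idea can be shown in a deliberately simplified sketch (this toy version ignores the efficiency factors, pivots and non-orthogonal structures that Wilkinson's algorithm actually handles): a sweep estimates the effect of each factor level as the mean of the working variate within that level, then subtracts it, so effects are removed one term at a time without any matrix inversion.

```python
import numpy as np

def sweep(y, factor):
    """One simplified ANOVA-style sweep: estimate the effect of each
    level of `factor` as the within-level mean of the working variate
    `y`, then remove (subtract) it, returning the swept variate and
    the estimated effects."""
    y = np.asarray(y, dtype=float).copy()
    effects = {}
    for level in np.unique(factor):
        mask = factor == level
        effects[level] = y[mask].mean()   # estimate the level effect
        y[mask] -= effects[level]         # remove it from the variate
    return y, effects

# Toy working variate: four plots in two blocks
y = [10.0, 12.0, 14.0, 16.0]
block = np.array([1, 1, 2, 2])
swept, block_effects = sweep(y, block)
# block_effects == {1: 11.0, 2: 15.0}; swept == [-1, 1, -1, 1]
```

Successive sweeps over the block and treatment factors of an orthogonal design decompose the data in exactly this elementwise fashion, which is why the method is so economical in workspace and time.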

References

James, A.T. & Wilkinson, G.N. (1971). Factorisation of the residual operator and canonical decomposition of non-orthogonal factors in analysis of variance. Biometrika, 58, 279-294.

Nelder, J.A. (1965). The analysis of randomized experiments with orthogonal block structure. I Block structure and the null analysis of variance. II Treatment structure and the general analysis of variance. Proceedings of the Royal Society, Series A, 283, 147-178.

Payne, R.W. & Wilkinson, G.N. (1977). A general algorithm for analysis of variance. Applied Statistics, 26, 251-260.

Payne, R.W. & Tobias, R.D. (1992). General balance, combination of information and the analysis of covariance. Scandinavian Journal of Statistics, 19, 3-23.

Payne, R.W. (2004). Confidence intervals and tests for contrasts between combined effects in generally balanced designs. COMPSTAT 2004 Proceedings in Computational Statistics, 1629-1636. Physica-Verlag, Heidelberg.

Wilkinson, G.N. (1970). A general recursive algorithm for analysis of variance. Biometrika, 57, 19-46.

Topic: Data analytics

Topic: History of REML over the last 50 years

Topic: Statistical consultancy

Consulting in the real world: Communicating statistics to scientists

The view from the other side: a biologist’s view of communicating statistics

Linley Jesson

Biological research has changed immensely in the last twenty years, and for most researchers handling and interpreting data have become mainstay activities. Asking good questions in biology requires a fundamental understanding of the philosophy of science, of which statistics is a key underpinning. I will outline some fundamental statistical concepts that I argue all biologists should clearly understand, and discuss ways in which we can increase the integration of statistics and biology.

“From cradle to grave”: making an impact from conception to publishing

Ruth Butler

Biological research has statistical aspects from initial conception through all stages to final publication, and sometimes beyond. Good foundations – trial planning and design – are the bedrock for quality science results, and thus require good statistical input from the very beginning. Over my time as a biometrician, I have increasingly been involved at all stages of a project, developing approaches that facilitate this and lead to increases in the efficiency and efficacy of the research through good statistical practice. I will present some approaches that I use, from trial design, to data management, through to presentation of results.

From heaven to hell… and how to find a way back!

Gabriela Borgognone

In many cases, the statistician is considered an integral part of the research team, who helps define clear objectives and ensures that correct statistical procedures are used and sound inferences are made. In many other cases, the statistician is seen as an outsider, a service provider, or someone that simply crunches the numbers at the end. Then, the challenge for the statistician is to slowly educate the researchers on the importance of statistics in science, earn their trust, and skillfully become an integral part of their projects.

Statistical inference and management decisions

Helene Thygesen

When working for a bookmaker, the role of statistical inference in the management decision process is fairly straightforward: we are asked to identify the strategy that maximizes expected profits, perhaps taking risk aversion into account. Identifying similar objectives when working for government or health care organizations can feel like opening a can of worms: different stakeholders will have different objectives, and there will be legal or cultural barriers to non-standard statistical approaches. Nevertheless, I will argue that one should generally make an effort to operationalize management objectives. In this talk I will give examples from biosecurity, health care, criminal justice and wildlife protection to illustrate how the statistician can adapt the type of statistical inference to the needs of different sectors.