Rouxléne van der Merwe
Senior lecturer at Plant Breeding, University of the Free State

Most of my final-year Plant Breeding and postgraduate Plant Sciences students are so frightened of statistics that they would rather avoid enrolling for this course. However, over the last few years it has become compulsory for students from all divisions in our Department of Plant Sciences. This is because they need not only to analyse and interpret their own experimental data statistically, but also to plan their experiments correctly before the project commences. Many researchers across all fields of agriculture and the natural sciences do not know that a research trial or experiment needs to be statistically designed so that it can be analysed in a specific manner. People often come to me with their own data and request my assistance with the data analysis, but only after the experiment has been completed. Often they cannot answer a few basic questions from my side. For example: “What was your hypothesis or research aim? What trial design did you use to arrange your pots in the glasshouse or your plant rows in the field trial? Where is your replications column?” If any one of these questions goes unanswered, my reply is: “I am unable to assist you in statistically analysing your data.” No research experiment or trial should be conducted before a specific aim has been set and the trial design has been scientifically planned. The aim cannot be realised if the correct statistical analyses are not done, nor can the correct statistical analyses be done unless the trial was designed for the data to be analysed in that manner. This is why statistics is important throughout, from trial design during planning to the end of the experiment, so that valid conclusions can be derived from the data collected. Knowing why it is important in research, we need to understand what statistics involves. Statistics is concerned with scientific methods for collecting, organising, presenting and analysing data.
Only through data analysis can valid conclusions be derived and reasonable decisions be made. Plant breeding is known as a “numbers game”, and thus statistics should be central to the systematic collection of numerical data and its interpretation. During trial planning and design, the “where”, “what”, “how”, “when” and “why” questions need to be answered. Where will the trial be planted? What treatments, and at what levels, will be included? How and when will the data be collected? Why is this experiment being conducted? How much space do I need to plant a replicated trial or experiment, given that at least two replications are required for trial planning and statistical analysis? Many trial designs are available, depending on the number of treatments, their combinations and whether they are random or fixed. In addition, the homogeneity of the trial area (e.g. soil), as determined by the size of the field plan, should be considered. The data collected on your variables (plant characteristics) should mainly be quantitative so that they can be subjected to parametric tests such as analysis of variance. Finally, taking all decisions made in trial planning and execution into account, I quote one of our professors: “Your statistical analyses are only as good as your measurements, and no amount of statistics can correct the absence of an aim, poor trial design, poor data, incorrect measurements and incorrect variables. Poor statistical procedures and incorrect assumptions when applying these procedures can make a mockery of good data.”
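
To make the planning step concrete, here is a minimal Python sketch of how a replicated layout could be drawn up before planting. It assumes a randomised complete block design, which is only one of the many designs mentioned above and is chosen here purely for illustration. The function name rcbd_layout, the treatment labels and the seed are hypothetical. The only rule taken from the text is that at least two replications are required.

```python
import random


def rcbd_layout(treatments, n_blocks, seed=None):
    """Return one independently shuffled treatment order per block.

    Assumes a randomised complete block design: every treatment
    appears exactly once in every block (replication).
    """
    if n_blocks < 2:
        raise ValueError("at least two replications (blocks) are needed")
    rng = random.Random(seed)  # a seed makes the field plan reproducible
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)  # copy, so the input list is not changed
        rng.shuffle(order)        # fresh randomisation within each block
        layout.append(order)
    return layout


treatments = ["T1", "T2", "T3", "T4"]  # hypothetical treatment labels
layout = rcbd_layout(treatments, n_blocks=3, seed=1)

for block_no, order in enumerate(layout, start=1):
    print(f"Block {block_no}: {order}")

# Answers the "how much space" question in terms of plots or pots.
print("Plots needed:", len(treatments) * len(layout))
```

Keeping the block number next to each observation when the data are recorded gives the replications column that the later analysis of variance depends on.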