Detail

Publication date: 1 June 2021

An evidence-based approach to the validation of Software Engineering claims

Evidence-Based Software Engineering is a paradigm that supports arguments concerning the suitability, limits, costs, and risks inherent to Software Engineering tools and techniques with experimental evidence. Our goal should be to measure the extent to which new proposals improve on existing ones. To produce the required evidence, we need to use Experimental Software Engineering (ESE) techniques to design and perform experiments in an independently comparable and replicable way. The experimental validation of claims is a useful tool to help decision makers scrutinize alternative solutions to Software Engineering problems.
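To make "measuring the extent of an improvement" concrete, here is a minimal sketch (not part of the talk) of one standard way to express such a comparison: a standardized effect size (Cohen's d) computed from the scores of a treatment group using a new technique and a control group using an existing one. The data values are invented for illustration.

```python
from statistics import mean, stdev

def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Standardized mean difference (Cohen's d) using a pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = ((n_t - 1) * stdev(treatment) ** 2 +
                  (n_c - 1) * stdev(control) ** 2) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / pooled_var ** 0.5

# Hypothetical defect-detection scores from two groups of subjects.
new_technique = [12.0, 15.0, 14.0, 16.0, 13.0]
baseline      = [10.0, 11.0, 12.0, 9.0, 11.0]
print(f"Cohen's d = {cohens_d(new_technique, baseline):.2f}")
```

Because the effect size is standardized, it can be compared across experiments that used different raw scales, which is one prerequisite for the comparability discussed below.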

Evidence collected from the main scientific publications in Software Engineering shows that although the community clearly acknowledges the need for the experimental validation of claims, it consistently fails to produce such validation, particularly with respect to comparability and replicability. In fact, achieving comparability and replicability of experimental results is among the main challenges in ESE. They are key attributes for enabling a much-needed meta-analysis of results obtained from several independent experiments. In other, more mature sciences, such meta-analysis is standard practice (consider, for example, the clinical trials process for introducing new drugs into the market).
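As an illustration of why comparable results matter, the sketch below shows the classic fixed-effect (inverse-variance) pooling step of a meta-analysis, which is only meaningful when independent experiments report effect sizes measured in a comparable way. The numbers are hypothetical, and the talk does not prescribe this particular pooling method.

```python
def pooled_effect(effects: list[float], variances: list[float]) -> tuple[float, float]:
    """Fixed-effect (inverse-variance) pooling of per-study effect sizes.

    Returns the pooled effect and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return pooled, 1.0 / sum(weights)

# Hypothetical effect sizes (e.g., Cohen's d) and variances from three replications.
effects = [0.40, 0.55, 0.30]
variances = [0.04, 0.09, 0.06]
effect, var = pooled_effect(effects, variances)
print(f"pooled effect = {effect:.2f}, 95% CI half-width = {1.96 * var ** 0.5:.2f}")
```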

In this talk, we will present and discuss a process model for conducting experimental work in Software Engineering, aimed at solving this problem. We use UML diagrams to describe the process model: activity diagrams specify its dynamic part, while class diagrams capture its most relevant concepts. The process model conforms to current proposals for standard experimental reporting guidelines. It is useful both when conducting experimental work and as a framework for assessing the work of other authors, and it has been successfully followed by both seasoned and novice experimenters.
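Since the talk presents the model in UML, the following Python dataclass sketch is only a rough, hypothetical rendering of the kind of concepts such a class diagram might capture; the names and attributes here are ours for illustration, not the model's actual vocabulary.

```python
from dataclasses import dataclass, field

# Hypothetical concept names; the model's own class diagrams define its vocabulary.
@dataclass
class Measurement:
    name: str       # e.g., "defect density"
    unit: str       # e.g., "defects/KLOC"
    procedure: str  # how the value is collected, to support replication

@dataclass
class Hypothesis:
    statement: str  # the claim under test
    null_form: str  # its statistically testable null form

@dataclass
class Experiment:
    goal: str
    hypotheses: list[Hypothesis] = field(default_factory=list)
    measurements: list[Measurement] = field(default_factory=list)
    threats_to_validity: list[str] = field(default_factory=list)
```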

Throughout the presentation, we will illustrate the whole process by instantiating it with an observational study conducted on the reuse of Eclipse plug-ins. This will allow us to discuss in some depth two frequent problems with experimental reports: the rigorous definition of collected measurements, and the systematic identification of threats to the validity of the described experiments. Both are key to achieving the desired comparability and replicability of experimental work.
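To give a flavour of what a rigorous measurement definition looks like, here is a hypothetical sketch, assuming plug-in reuse is counted through the OSGi Require-Bundle header in each plug-in's META-INF/MANIFEST.MF: the "afferent reuse" of a plug-in is defined as the number of other plug-ins whose manifests require it. The actual study's metrics may be defined differently.

```python
from pathlib import Path
from collections import Counter

def required_bundles(manifest_text: str) -> list[str]:
    """Extract bundle symbolic names from an OSGi MANIFEST.MF Require-Bundle header.

    Manifest headers wrap onto continuation lines that start with a single space."""
    unfolded = manifest_text.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        if line.startswith("Require-Bundle:"):
            # Naive split on ','; attribute values containing commas
            # (e.g. bundle-version="[1.0,2.0)") would need a real parser.
            entries = line[len("Require-Bundle:"):].split(",")
            # Keep the symbolic name, dropping attributes such as bundle-version.
            return [e.split(";")[0].strip() for e in entries if e.strip()]
    return []

def afferent_reuse(plugins_dir: Path) -> Counter:
    """Afferent reuse of plug-in p: number of plug-ins whose manifests require p."""
    counts: Counter = Counter()
    for manifest in plugins_dir.glob("*/META-INF/MANIFEST.MF"):
        for dep in required_bundles(manifest.read_text(errors="ignore")):
            counts[dep] += 1
    return counts
```

Spelling out the collection procedure at this level (which file, which header, how entries are delimited) is what makes a measurement independently replicable.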

Presenter