In February 2016 science hit the news again: the merger of a binary black hole system was detected by the Advanced LIGO twin instruments, one in Hanford, Washington, and the other 3,000 km away in Livingstone, Louisiana, USA. The signal, detected in September 2015, was famously predicted by Einstein’s general theory of relativity. This phenomenon was also numerically modelled by super-computers since at least 2005 – a typical example of computational experiment.
The term ‘computational experiment’ is used to refer to a computer simulation of a real scientific experiment. An easier example: to test some macroscopic property of a liquid which is hard to obtain, or where equipment is too expensive to purchase e.g. in an educational setting, a simulation is a more feasible solution than the real experiment. Computational experiments are largely used in several disciplines like chemistry, biology and the social sciences. As experiments are the essence of scientific methodology, indirectly, computer simulations raise interesting questions: how do computational experiments affect results in the other sciences? And what kind of scientific method do computational experiments support?
These questions highlight the much older problem of the status and methodology of computer science (CS) itself. Today we are acquainted with CS as a well-established discipline. Given the pervasiveness of computational artefacts in everyday life, we can even consider computing a major actor in academic, scientific and social contexts. But the status enjoyed today by CS has not always been granted. CS, since its early days, has been a minor god. At the beginning computers were instruments for the ‘real sciences’: physics, mathematics, astronomy needed to perform calculations that had reached levels of complexity unfeasible for human agents.
Computers were also instruments for social and political aims: the US army used them to compute ballistic tables and, notoriously, mechanical and semi-computational methods were at work in solving cryptographic codes during the Second World War.
The UK and the US were pioneers in the transformation that brought CS into the higher education system: the first degree in CS was established at the University of Cambridge Computer Laboratory in 1953 by the mathematics faculty to meet the request of competencies in mechanical computation applied to scientific research. It was followed by Purdue University in 1962. The academic birth of CS is thus the result of creating technical support for other sciences, rather than the acknowledgement of a new science. Subsequent decades brought forth a quest for the scientific status of this discipline. The role of computer experiments as they are used to support results in other sciences, a topic which has been largely investigated, seems to perpetrate this ancillary role of computing.
But what is then the scientific value of computational experiments? Can they be used to assert that computing is a scientific discipline on its own? The natural sciences have a codified investigation method: a problem is identified; a predictable and testable hypothesis is formulated; a study to test the hypothesis is devised; analyses are performed and results of the test are evaluated; on their basis, the hypothesis and the tests are modified and repeated; finally, a theory that answers positively or negatively to the hypothesis is formulated. One important consideration is therefore the applicability of the so-called hypothetical-deductive method to CS. This, in turn, hides several smaller issues.
The first concerns the qualification of which ‘computational problems’ would fit such method. Intuitively, when one refers to the use of computational techniques to address some scientific problem, the latter can come from a variety of backgrounds. We might be interested in computing the value of some equations to test the stability of a bridge. Or we might be interested in knowing the best-fit curve for the increase of some disease, economic behaviour or demographic factor in a given social group. Or we might be interested in investigating a biological entity. These cases highlight the old role of computing as a technique to facilitate and speed-up the process of extracting data and possibly suggest correlations within a well-specified scientific context: computational physics, chemistry, econometrics, biology.
An essential characteristic of scientific experiments is their repeatability.
But besides the understanding of ‘computational experiment’ as the computational study of a non-computational phenomenon, the computational sciences themselves offer problems that can be addressed computationally: how stable is your internet connection? How safe is your installation process when external libraries are required? How consistent are the data extracted from some sample? Just to outline some. These problems (or their formal models) are investigated through computational experiments, but they seem to be less easily identified with scientific problems.
The second: how to formulate a good hypothesis for a computational experiment? Scientific hypotheses depend on the system of reference and, in the case of their translation to a computational setting, we have to be careful that the relevant properties of the system under observation are preserved. An additional complication is presented when the observation itself concerns a computational system, which might include a formal system, a piece of software, or implemented artefacts. Each of the levels of abstraction pertaining to computing reveals a specific understanding of the system, and they can all be taken as essential in the definition of a computing system. Is then a hypothesis on such systems admissible if formulated at only one such level of abstraction e.g. considering a piece of code but not its running instances? And is such an hypothesis still well-formulated enough if it tries instead to account for all the different aspects that a computational system present?
Finally, an essential characteristic of scientific experiments is their repeatability. In computing, this criterion can be understood and interpreted differently: should an experiment be repeatable under exactly the same circumstances for exactly the same computational system? Should it be repeatable for a whole class of systems of the same type? How do we characterize typability in the case of software? and how in the case of hardware?
All the above questions underpin our understanding of what a computational experiment is. Although we are used to expecting some scientific uniformity in the notion of experiment, the case of CS evades such strict criteria. First of all, several sub-disciplines categorise experiments in very specific ways, each not easily applicable by the research group next-door: testing a piece of software for requirements satisfaction is essentially very different from testing a robotic arm for identifying its own positioning.
Experiments in the computational domain do not offer the same regularities that can be observed in the physical, biological and even social sciences. The notion of experiments is often confounded with the more basic and domain-related activity of performing tests. For example, model-based testing is a well-defined formal and theoretical method that differs from computer-simulations in both admissible techniques, recognised methodology, assumptions and verifiability of results. Accordingly, the process of checking an hypothesis that characterises the scientific method described above is often intended simply as testing or checking some functionality of the system at hand, while in other cases it implies a much stronger theoretical meaning. Here the notion of repeatability (of an experiment) merges with the replicability (of an artefact) – a distinction that has already appeared in the literature (Drummond).
Finally, benchmarking is understood as an objective performance evaluation of computer systems under controlled conditions: is it in some sense characterising the quality of computational experiments, or simply identifying the computational artefacts that can be validly subject to experimental practices?
A philosophical analysis
The philosophical analysis on the methodological aspects of CS, of which the above is an example, is a growing research area. The set of research questions that need to be approached is large and diversified. Among these, the analysis on the role of computational experiments in the sciences is not a new one, though less understood is the methodological role of computer simulations in CS, rather than as a support method for testing hypotheses in other sciences.
The Department of Computer Science at Middlesex University is leading both research and teaching activities in this area, in collaboration with several European partners, including the Dipartimento di Elettronica, Informazione e Bioingeneria at Politecnico di Milano in Italy, which offers similar activities and has a partnership with Middlesex through the Erasmus+ network.
In an intense one-week visit, we drafted initial research questions and planned future activities. The following questions represent a starting point for our analysis:
These questions require considering the different types of computer simulations, as well as other types of computational experiments, along with the specificities of the problems treated. For example, an agent-based simulation of a messaging system underlies problems and offers results that are inherently different from the testing with real users of a monitoring systems for privacy on social networks. The philosophical analysis on the methodological aspects of CS impacts not only the discussion about the discipline, but also on how its disciplinary status is acknowledged by a larger audience.
Nowadays we are getting used to reading about the role of computational experiments in scientific research and how computer-based results affect the progress of science. It is about time that we become clear about their underlying methodology, so that we might say with some degree of confidence what their real meaning is.