Motivation: Big Software on the Run

Software forms an integral part of the most complex artifacts built by humans. Software systems may comprise hundreds of millions of program statements, written by thousands of different programmers over several decades. Their complexity exceeds what any single human being can comprehend. At the same time, we have become totally dependent on complex software artifacts. Communication, production, distribution, healthcare, transportation, education, entertainment, government, and trade all increasingly rely on “Big Software”. Unfortunately, we only recognize our dependency on software when it fails. Malfunctioning information systems of the Dutch police force and the Dutch tax authority, outages of electronic payment and banking systems, increasing downtime of high-tech systems, unusable phones after updates, failing railway systems, and tunnel closures due to software errors illustrate the importance of good software.

The following developments suggest that, without a radical change of paradigm, these problems will only increase:

  • Growing complexity: software systems do not operate in a stand-alone manner, but are increasingly interconnected, resulting in complex distributed systems.
  • Growing scale: an increasing number of organizations use shared infrastructures (e.g., cloud computing), the number of devices connected to the internet is increasing, and an increasing amount of data is recorded (e.g., sensor data and RFID data).
  • Increasing diversity: there is a growing diversity in platforms (covering traditional CPUs, multi-core architectures, cloud-based data centers, mobile devices, and the internet of things), versions (different releases having different capabilities), and configurations.
  • Continuous evolution of software: late composition (components are assembled and connected while they are running), remote software updates (removing old errors but introducing new ones), and functional extensions (by merely changing running software) lead to unpredictable and unforeseen behavior.
  • Continuously changing environment: the software must run in an ever-changing context of (virtualized) hardware, operating systems, network protocols and standards, and must cope with wide variations in available resources, such as processor cores, bandwidth, memory, and energy. Moreover, the software may be applied in ways not anticipated at design time.
  • Increasing demands related to security and trust: as our reliance on software grows, concerns about security and privacy increase, while software becomes an ever more tempting target for attacks.

Taming the complexity of software has been an ongoing concern of computer science since its very inception. Traditionally, scientists attempt to ensure that software is built to satisfy stringent requirements. This a priori approach has demonstrated its value in stable environments, but it assumes that one has total control over the production process of all software. It is therefore unable to deal with the growing complexity and diversity of software systems that operate in continuously evolving environments and demand on-the-fly changes. Software systems are arguably among the most complex artifacts humanity has ever produced, yet we expect a software system to run on different platforms, cooperate with an array of unknown systems, be used in ways not envisioned at design time, and adapt to changing requirements. To meet these high expectations, we need to truly understand complex, evolving software systems. We consider this one of the grand challenges of our time.

To deal with Big Software on the Run (BSR), we propose to shift the main focus from a priori software design to a posteriori software analytics, thereby exploiting the large amounts of event data generated by today's systems. The core idea is to study software systems in vivo, i.e., at runtime and in their natural habitat. We would like to understand the actual (desired or undesired) behavior of software. Running software needs to adapt to evolving and diverging environments and requirements. This forces us to consider software artifacts as “living organisms operating in a changing ecosystem”. This paradigm shift requires new forms of empirical investigation that go far beyond the common practice of collecting error messages and providing software updates.

Hence, there is an urgent need to develop innovative techniques to discover how systems really function, check where systems deviate from the desired and expected behavior, predict reliability, performance, and security over time, and recommend changes to address current or future problems. These techniques need to be able to deal with torrents of event data (“Big Data”) and extremely complex software (“Big Software”). The urgency of software-related problems and the opportunities provided by such a new focus justify a dedicated BSR research program.