Scope

The relentless pressure to keep up with "Internet Time" leaves most organizations using ad hoc approaches to survive day to day, with no time or energy left for long-term investments in surviving the coming months and years. While such an approach can be made to work in the short term, it is inherently inadequate for addressing trends that span years or decades. Instead, it is vital that a concerted effort be made to prepare for downstream problems in a number of key areas.

The long-term scope will evolve as appropriate to address the hard, long-term problems facing us. Current areas include:

  1. Use of off-the-shelf components:
    Most systems now rely heavily on the use of commercial off-the-shelf (COTS) technology for hardware and/or software for reasons of cost and time to market. Many current approaches to creating dependable systems assume complete control and understanding of system components---an assumption that is simply not representative of the majority of systems that must be built. And, even if complete understanding of components were possible, the marketplace is such that components become obsolete and are replaced many times over during the production and deployment life of many critical systems. New techniques are urgently needed to create highly dependable systems from "black-box" components that continually change. Previously useful approaches and simpler forms of analysis (e.g., old notions of creating components based on separation of concerns and creating systems based on synthesis rather than component composition) no longer work for every situation.
  2. Use of complex, non-dependable components:
    Achieving high confidence is becoming more difficult as systems become more complex. Today's trends of large-scale use of component technology, increased integration, continuous evolution, and larger scale are yielding more complex systems. Furthermore, such systems are often built from complex components that are not inherently dependable. Not only is it difficult to get such systems to work in the first place, but they also frequently exhibit unpredictable emergent behaviors at inopportune moments. New ways to create dependable systems from complex components are urgently needed.
  3. Hostile operating environments:
    Lacking adequate protection, today's information and communications systems are being subjected to numerous malicious attacks. New and advanced techniques are needed to achieve the required levels of system integrity and availability. Protection against both active and insider threats must be developed. Methods are needed for system monitoring, detection, response, and recovery.
  4. Embedded systems:
    Embedded computer systems are arguably both more difficult to make dependable and more in need of complete dependability. Because they often do not have a human operator acting as a safety net, embedded systems must achieve absolutely bulletproof operation over years or decades. But because the actual amount of computational power used is small, such systems are often perceived as easy to build and are often created by engineers or technicians with no formal training in software engineering or critical system design. Whereas desktop computers are built in the tens of millions per year, embedded microcontrollers are produced in the billions---soon to be tens of billions per year. The challenge is how to scale high assurance methods down to the budgets, timelines, and skill sets prevalent in the embedded system world.
  5. Ubiquitous critical systems:
    The days of critical systems being a niche market are over. Many everyday safety-critical systems already have, or will soon have, software in them. Consider, for example, a domestic hot water heating system, which can cause scalding burns if it drifts even a few degrees higher than its set point. Or, consider an Internet-based stock trading system that can bankrupt a user who (foolishly) depends on typical response times being available during a stock market meltdown. As we entrust our lives and livelihoods to computers, many systems will effectively become critical. A challenge here is how to proliferate good practice in highly dependable system design to everyday practitioners rather than a few select critical system designers in niche fields such as nuclear power and aerospace applications.
  6. Indirectly critical systems:
    As computer systems are becoming highly complex, so is our society. While the number of critical systems is growing, the number of indirectly critical systems also grows. For example, the software that routes messages for a personal pager system becomes indirectly critical when it transmits the page for an emergency room physician to respond to a crisis. Similarly, database software becomes indirectly critical when it identifies owners of vehicles subject to an urgent recall notice or is used to look up emergency contact information. Even a simple word processor can become mission critical if it crashes a few minutes before the courier pickup deadline for a proposal submission. It is vital that even everyday, seemingly non-critical, applications be raised to a higher level of dependability to reduce the enormous hidden costs their unreliability levies on businesses and individuals.
  7. International markets:
    The U.S. is not alone in its growing dependence on computing throughout industries having safety-critical aspects. This is especially true in transportation, health care, energy, and manufacturing sectors. However, many areas do not have the technical and labor infrastructures to support critical system operation. It will be imperative to create dependable systems that can operate properly even with shortages of repair parts, scarce availability of skilled operators/maintainers, and erratically available infrastructure support.

Activities

Six research and education activities will contribute to the HDCC strategic goals:

1. Provide a sound theoretical, scientific and technological basis for assured construction of safe, secure systems.

To meet this goal, the research agenda must:

  • achieve the capability to specify, compose, analyze, and assess system behavioral properties,
  • furnish the capability to enforce specific behavioral properties,
  • and furnish the capability to be more predictably tolerant of specified behavioral failures including malicious attack.

These are still hot topics in universities despite the general acceptance of C (and perhaps, someday, Java) as do-everything programming languages. Ultimately, the proper and reliable functioning of a system depends upon people describing their designs in a formal specification, namely a language. When the language is shaky, the entire edifice will be built on a soft foundation. Special areas of interest include applications of logic, techniques for designing and implementing programming languages, and formal specification and verification of hardware and software systems. It is important to apply these techniques to problems of realistic scale and complexity, for example: implementation of high speed network communication software and application of type theoretic principles in the construction of compilers for proof carrying code. For Carnegie Mellon activities in principles of programming see http://www.cs.cmu.edu/Groups/pop/pop.html

2. Develop hardware, software, and system engineering tools that incorporate ubiquitous, application-based, domain-based, and risk-based assurance.

To meet this goal, the HDCC research agenda must:

  • furnish the methods, tools, and environments necessary for the design, construction, and evaluation of behavioral enforcement mechanisms;
  • and establish indicators and characteristics of overall system confidence in the achieved behavioral properties gained through the application of such methods, tools and environments.

Software Engineering has grown into a field of Computer Science in its own right. Its aim is that systems constructed from software can attain the same reliability and predictability as bridges and other symbols of engineering excellence. At Carnegie Mellon much of the research and education in this field is conducted by the Institute for Software Research (http://www.isri.cs.cmu.edu/) and the Software Engineering Institute (http://www.sei.cmu.edu/).

3. Reduce the effort, time, and cost of assurance and quality certification processes.

To meet this goal, the HDCC research agenda must:

  • furnish the means to improve the productivity of information system design, development, and analysis,
  • while simultaneously improving the levels of confidence that can be achieved through such productivity enhancements.

The industrial use of system analysis and verification tools has been limited, but university researchers have made considerable progress in producing tools that find bugs in real hardware and software. So far, most of the success has been in hardware where complexity is lower and specifications cleaner; but there have been promising successes in software as well. For Carnegie Mellon activities in formal systems see
http://www.cs.cmu.edu/Groups/formal-methods/formal-methods.html

4. Understand the human problems in creating, maintaining, and using computer systems.

This has become a vital area of research as computers have become ubiquitous. Seat-of-the-pants design might have been sufficient when the users of computers were engineers, scientists, and programmers; but now a deep understanding of human capabilities must be built into design because the users are often very different from the designers. "Pilot error" is the most frequently cited cause of airline mishaps, and "programmer error" is similarly often the purported cause of software defects, except in the frequent case in which problems are blamed on "user error". We need to understand and account for the capabilities of both the designers and end users of systems. For Carnegie Mellon activities in human-computer interaction see http://www.hcii.cmu.edu/.

5. Provide measures of results.

To meet this goal, the HDCC research agenda must:

  • develop measures of performance and measures of effectiveness for use in quantifying and qualifying the progress of improvements in system-level confidence that can be achieved through the application of HDCC technologies.
  • Further, the agenda must show through such measures that the benefits achieved are cost effective.

One reason to do system fault discovery is to obtain a metric. Fault discovery is only somewhat helpful as a debugging technique---it is much more powerful as a quality assurance technique in support of building dependable systems. For some Carnegie Mellon research in this area see
http://www.ices.cmu.edu/ballista
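
To make the idea of fault discovery as a metric concrete, the following is a minimal Python sketch, not the method used by the research linked above. It assumes a simple definition of a robustness failure: a call that raises anything other than a documented "graceful" exception. The functions safe_divide and naive_divide, the value pools, and the name robustness_failure_rate are all hypothetical and exist only for illustration.

```python
from itertools import product


def safe_divide(numerator, denominator):
    """Hypothetical function under test: rejects bad input with ValueError."""
    if not isinstance(numerator, (int, float)):
        raise ValueError("numerator must be a number")
    if not isinstance(denominator, (int, float)):
        raise ValueError("denominator must be a number")
    if denominator == 0:
        raise ValueError("denominator must be nonzero")
    return numerator / denominator


def naive_divide(numerator, denominator):
    """Hypothetical function under test: performs no input checking."""
    return numerator / denominator


# Assumed pools of ordinary and exceptional values, one pool per parameter.
NUMERATORS = [1, 0, None, "7", float("inf")]
DENOMINATORS = [2, 0, None, float("nan")]


def robustness_failure_rate(func, value_pools, graceful=(ValueError,)):
    """Call func on every combination of pooled values.

    Returns (failures, total, rate). A call counts as a failure when it
    raises an exception that is not listed in `graceful`.
    """
    failures = 0
    total = 0
    for args in product(*value_pools):
        total += 1
        try:
            func(*args)
        except graceful:
            pass  # documented rejection of bad input: not a failure
        except Exception:
            failures += 1  # undocumented exception: robustness failure
    rate = failures / total if total else 0.0
    return failures, total, rate


if __name__ == "__main__":
    pools = [NUMERATORS, DENOMINATORS]
    for func in (safe_divide, naive_divide):
        failures, total, rate = robustness_failure_rate(func, pools)
        print(f"{func.__name__}: {failures}/{total} failures ({rate:.0%})")
```

The resulting rate is a single number that can be compared across implementations or releases, which is what makes it more useful for quality assurance than for debugging any one fault.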

6. Promote software engineering education.

Currently, the de facto software engineers coming out of universities emerge from departments of computer science and engineering. Unfortunately, the computer scientists are often too theoretical while the engineers are often too hardware-oriented. What is needed is professional education akin to what medical doctors receive, but nobody is doing it. Both software engineering research and education must have strong connections to practice: education needs a practical setting to develop skill, and research needs access to real problems that expose the deep issues involved in real-world development.

We should create an institution that serves software engineering as a teaching hospital serves medicine. Students would learn in the context of real cases. Clinical faculty would both practice and teach. Research would exploit access to real cases and data. We would provide a development laboratory in which real software developers produce real software for real clients. Developers would interact with researchers to infuse the research agenda with visibility into real problems, and developers could take advantage of research results. Students would learn through direct experience in a real---not just "realistic"---setting. Clinical faculty would be skilled professional software developers and have significant responsibilities for both teaching and software production.

Reprinted with kind permission from Jim Morris, Dean, Carnegie Mellon School of Computer Science, from his essay on a High Dependability Computing Consortium in which he suggests that universities, government, and industry should initiate a long-term research and education program to make computing and communication systems dependable enough for people to trust with their everyday lives and livelihoods.

 
