Chapter 11 SOFTWARE RELIABILITY AND QUALITY MANAGEMENT
Reliability of a software product is an important concern for most users. Users not only want the products they purchase to be highly reliable, but for certain categories of products they may even require a quantitative guarantee on the reliability of the product before making their buying decision. This may especially be true for safety-critical and embedded software products. However, as we discuss in this chapter, it is very difficult to accurately measure the reliability of any software product. One of the main problems encountered while quantitatively measuring the reliability of a software product is the fact that reliability is observer-dependent; that is, different groups of users may arrive at different reliability estimates for the same product. Besides this, several other problems (such as frequently changing reliability values due to bug corrections) make accurate measurement of the reliability of a software product difficult. We investigate these issues in this chapter. Even though no entirely satisfactory metric to measure the reliability of a software product exists, we shall discuss some metrics that are currently used to quantify the reliability of a software product. We shall also address the problem of reliability growth modelling and examine how to predict when (and if at all) a given level of reliability will be achieved. We shall also examine the statistical testing approach to reliability estimation.
In addition to software reliability issues, we shall also discuss various issues associated with software quality assurance (SQA). SQA has emerged as one of the most talked about topics in recent years in software industry circles. The major aim of SQA is to help an organisation develop high quality software products in a repeatable manner. A software development organisation can be called repeatable when its software development process is person-independent, that is, when the success of a project does not depend on who exactly the team members of the project are. Besides repeatability, the quality of the developed software and the cost of development are important issues addressed by SQA. In this chapter, we first discuss a few important issues concerning software reliability measurement and prediction before starting our discussion on software quality assurance.
11.1 SOFTWARE RELIABILITY
The reliability of a software product essentially denotes its trustworthiness or dependability. Alternatively, the reliability of a software product can also be defined as the probability of the product working "correctly" over a given period of time. Intuitively, it is obvious that a software product having a large number of defects is unreliable. It is also very reasonable to assume that the reliability of a system improves as the number of defects in it is reduced. It would have been convenient if reliability were a simple function of the number of latent defects in a product; however, no such straightforward relationship exists, because reliability also depends on exactly where in the code the defects are located, as the following example shows.
If an error is removed from an instruction that is frequently executed (i.e., one belonging to the core of the program), then this shows up as a large improvement in the reliability figure. On the other hand, removing errors from parts of the program that are rarely used may not cause any appreciable change to the reliability of the product. Based on the above discussion, we can say that the reliability of a product depends not only on the number of latent errors but also on the exact location of those errors. Apart from this, reliability also depends upon how the product is used, that is, on its execution profile. If the users execute only those features of a program that are "correctly" implemented, none of the errors will be exposed and the perceived reliability of the product will be high. On the other hand, if only those functions of the software which contain errors are invoked, then a large number of failures will be observed and the perceived reliability of the system will be very low. Different categories of users typically execute different functions of a software product. For example, for a Library Automation Software the library members would use functionalities such as issue book, search book, etc., whereas the librarian would normally execute features such as create member, create book record, delete member record, etc. So defects which show up for the librarian may not show up for the members. Suppose the functions of a Library Automation Software which the library members use are error-free, and the functions used by the librarian have many bugs. Then, these two categories of users would have very different opinions about the reliability of the software.
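The observer-dependence described above can be made concrete with a small calculation. The following sketch is purely illustrative: the function names, the usage probabilities, and the assumption that only the librarian-side functions contain defects are all made up for the example.

    # Illustrative sketch: perceived reliability depends on which functions a
    # user group actually executes. All numbers below are assumed for the example.

    # Probability with which each user group invokes each function (execution profile).
    member_profile = {"issue_book": 0.6, "search_book": 0.4}
    librarian_profile = {"create_member": 0.5, "create_book_record": 0.3,
                         "delete_member_record": 0.2}

    # Assume (for illustration) that only the librarian-side functions contain defects.
    buggy_functions = {"create_member", "delete_member_record"}

    def perceived_failure_probability(profile, buggy):
        """Probability that a randomly chosen invocation hits a buggy function."""
        return sum(p for f, p in profile.items() if f in buggy)

    print("Members  :", perceived_failure_probability(member_profile, buggy_functions))    # -> 0
    print("Librarian:", perceived_failure_probability(librarian_profile, buggy_functions)) # -> 0.7

With these assumed numbers, a member essentially never encounters a failure while the librarian hits one in roughly 70 per cent of invocations, which is exactly why the two groups would report very different reliability figures for the same product.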
Based on the above discussions, we can summarise the main reasons that make software reliability more difficult to measure than hardware reliability:

- The reliability improvement due to fixing a single bug depends on where the bug is located in the code.
- The perceived reliability of a software product is observer-dependent.
- The reliability of a product keeps changing as errors are detected and fixed.

In the following subsection, we shall discuss why software reliability measurement is a harder problem than hardware reliability measurement.
11.1.1 Hardware versus Software Reliability
An important characteristic feature that sets hardware and software reliability issues apart is the difference between their failure patterns.
Hardware components fail due to very different reasons as compared to software components. Hardware components fail mostly due to wear and tear, whereas software components fail due to bugs. A logic gate may be stuck at 1 or 0, or a resistor might short circuit. To fix a hardware fault, one has to either replace or repair the failed part. In contrast, a software product would continue to fail until the error is tracked down and either the design or the code is changed to fix it. The failure rate of a hardware product typically follows the well-known bathtub curve: it is high initially while manufacturing defects surface, low and roughly constant during the useful life of the product, and high again as components wear out towards the end of its life (see Figure 11.1(a)). For this reason, a hardware product is usually not worth buying (even at a good discount to its face value) towards the end of its lifetime. That is, one need not feel happy to buy a ten-year-old car at one tenth of the price of a new car, since it would be near the rising edge of the bathtub curve, and one would have to spend an unduly large amount of time, effort, and money on repairs and end up the loser. In contrast to hardware products, a software product shows the highest failure rate just after purchase and installation (see the initial portion of the plot in Figure 11.1(b)). As the system is used, more and more errors are identified and removed, resulting in a reduced failure rate. This error removal continues at a slower pace during the useful life of the product. As the software becomes obsolete, no more error correction occurs and the failure rate remains unchanged.
Figure 11.1: Change in failure rate of a product.
11.1.2 Reliability Metrics of Software Products
The reliability requirements for different categories of software products may be different. For this reason, it is necessary that the level of reliability required for a software product should be specified in the software requirements specification (SRS) document. In order to be able to do this, we need some metrics to quantitatively express the reliability of a software product. A good reliability measure should be observer-independent, so that different people can agree on the degree of reliability a system has. However, in practice, it is very difficult to formulate a metric with which precise reliability measurement would be possible. In the absence of such measures, we discuss six metrics
that correlate with reliability, as follows:

Rate of occurrence of failure (ROCOF): ROCOF measures the frequency of occurrence of failures. The ROCOF measure of a software product can be obtained by observing the behaviour of the product in operation over a specified time interval and then calculating the ratio of the total number of failures observed to the duration of observation. However, many software products do not run continuously (unlike a car or a mixer), but deliver a certain service when a demand is placed on them. For example, a library software is idle until a book issue request is made. Therefore, for a typical software product such as a payroll software, the applicability of ROCOF is very limited.

Mean time to failure (MTTF): MTTF is the time between two successive failures, averaged over a large number of failures. To measure MTTF, we can record the failure data for n failures. Let the failures occur at the time instants t1, t2, ..., tn. Then, MTTF can be calculated as the average of the inter-failure times, i.e., MTTF = (sum over i = 1 to n-1 of (t_{i+1} - t_i)) / (n - 1). It is important to note that only run time is considered in the time measurements; the time for which the system is down to fix the error, the boot time, etc. are not taken into account, and the clock is stopped at these times.

Mean time to repair (MTTR): Once a failure occurs, some time is required to fix the error. MTTR measures the average time it takes to track down the errors causing the failure and to fix them.

Mean time between failures (MTBF): The MTTF and MTTR metrics can be combined to get the MTBF metric: MTBF = MTTF + MTTR. Thus, MTBF indicates the expected time from one failure to the next, including the repair time.

Availability: Availability measures how likely the system is to be available for use over a given period of time. This metric is particularly important for systems that are supposed to be never down and where repair and restart time are significant and loss of service during that time cannot be overlooked.
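Once failure data has been logged, the metrics above can be computed mechanically. The following sketch assumes a hypothetical failure log in which each entry records the run time (in hours) at which a failure occurred and the repair time it needed; all numbers are made up for illustration.

    # Sketch: computing ROCOF, MTTF, MTTR and MTBF from a hypothetical failure log.
    # Times are in hours of run time; down time is excluded, as described in the text.

    failure_times = [120.0, 310.0, 450.0, 700.0, 980.0]   # instants t1..tn (assumed data)
    repair_times  = [4.0, 2.5, 6.0, 3.0, 5.5]             # hours spent fixing each failure
    observation_period = 1000.0                           # total observed run time (hours)

    # ROCOF: failures observed per unit of observed run time.
    rocof = len(failure_times) / observation_period

    # MTTF: average of the inter-failure times (t_{i+1} - t_i).
    inter_failure = [t2 - t1 for t1, t2 in zip(failure_times, failure_times[1:])]
    mttf = sum(inter_failure) / len(inter_failure)

    # MTTR: average time taken to track down and fix the error behind a failure.
    mttr = sum(repair_times) / len(repair_times)

    # MTBF combines the two: once a failure occurs, the next one is expected
    # after MTBF hours (including the repair time).
    mtbf = mttf + mttr

    print(f"ROCOF = {rocof:.4f} failures/hour")
    print(f"MTTF  = {mttf:.1f} h, MTTR = {mttr:.1f} h, MTBF = {mtbf:.1f} h")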
Shortcomings of reliability metrics of software products
All the above reliability metrics suffer from several shortcomings as far as their use in software reliability measurement is concerned. One of the reasons is that these metrics are centered around the probability of occurrence of system failures but take no account of the consequences of failures. That is, these reliability models do not distinguish the relative severity of different failures. Failures which are transient and whose consequences are not serious are of little concern in the operational use of a software product; such failures can at best be minor irritants. On the other hand, more severe types of failures may render the system totally unusable. In order to estimate the reliability of a software product more accurately, it is necessary to classify the various types of failures. Please note that the different classes of failures are not necessarily mutually exclusive: the classification is based on widely different criteria, so a failure can belong to more than one class at the same time. A scheme of classification of failures is as follows:

- Transient: Transient failures occur only for certain input values while invoking a function of the system.
- Permanent: Permanent failures occur for all input values while invoking a function of the system.
- Recoverable: When a recoverable failure occurs, the system can recover without having to shut down and restart (with or without operator intervention).
- Unrecoverable: In unrecoverable failures, the system may need to be restarted.
- Cosmetic: This class of failures causes only minor irritations and does not lead to incorrect results. An example of a cosmetic failure is the situation where the mouse button has to be clicked twice instead of once to invoke a given function through the graphical user interface.
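One simple use of such a classification is to filter the failure log before computing reliability figures, so that, for instance, cosmetic failures do not distort the estimate. The sketch below is only illustrative: the log entries are assumed, and for simplicity each failure is tagged with a single class, even though the text notes that the classes are not mutually exclusive.

    # Sketch: excluding cosmetic failures before estimating a failure rate.
    # The class names follow the text; the log entries are assumed data.
    from enum import Enum

    class FailureClass(Enum):
        TRANSIENT = "transient"
        PERMANENT = "permanent"
        RECOVERABLE = "recoverable"
        UNRECOVERABLE = "unrecoverable"
        COSMETIC = "cosmetic"

    failure_log = [
        ("report layout glitch", FailureClass.COSMETIC),
        ("crash on empty input", FailureClass.UNRECOVERABLE),
        ("wrong total for leap years", FailureClass.TRANSIENT),
    ]

    # Keep only failures whose consequences matter for the reliability estimate.
    significant = [f for f in failure_log if f[1] is not FailureClass.COSMETIC]
    print(f"{len(significant)} of {len(failure_log)} logged failures are significant")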
11.1.3 Reliability Growth Modelling
A reliability growth model is a mathematical model of how software reliability improves as errors are detected and repaired. A reliability growth model can be used to predict when (or if at all) a particular level of reliability is likely to be attained. Thus, reliability growth modelling can be used to determine when to stop testing to attain a given reliability level. Although several different reliability growth models have been proposed, in this text we will discuss only two very simple reliability growth models.
Jelinski and Moranda model
The simplest reliability growth model is a step function model, where it is assumed that the reliability increases by a constant increment each time an error is detected and repaired. Such a model is shown in Figure 11.2. However, this simple model, which implicitly assumes that all errors contribute equally to reliability growth, is highly unrealistic, since we already know that the correction of different errors contributes differently to reliability growth.
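The step function model described above can be expressed directly in a few lines: each repaired error raises the reliability estimate by a fixed increment, so the number of repairs needed to reach a target level follows immediately. The sketch below implements only this simplified step model with assumed values for the initial reliability, the per-repair increment, and the target; it is not a full treatment of the Jelinski-Moranda formulation.

    # Sketch of the simple step-function growth model: reliability is assumed to
    # rise by a constant increment each time an error is detected and repaired.
    # All numbers are assumed for illustration.
    import math

    initial_reliability = 0.90   # estimated reliability before error correction starts
    increment = 0.005            # constant gain per repaired error (model assumption)
    target = 0.99                # reliability level we would like to attain

    # Number of repairs the step model predicts before the target is reached.
    repairs_needed = math.ceil((target - initial_reliability) / increment)

    def predicted_reliability(repairs):
        """Reliability after a given number of repairs under the step model."""
        return min(1.0, initial_reliability + repairs * increment)

    print(f"Predicted repairs needed: {repairs_needed}")            # 18 with these numbers
    print(f"Reliability after 10 repairs: {predicted_reliability(10):.3f}")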
11.2 STATISTICAL TESTING
Statistical testing is a testing process whose objective is to estimate the reliability of the product rather than to discover errors. It makes use of the operation profile of the product, that is, the probability distribution of the use of its different functionalities. If we denote the set of various functionalities offered by the software by {fi}, the operation profile associates with each function fi the probability with which an average user would select fi as his next function to use. Thus, we can think of the operation profile as assigning a probability value pi to each functionality fi of the software.
How to define the operation profile for a product?
We need to divide the input data into a number of input classes. For example, for a graphical editor software, we might divide the input into data associated with the edit, print, and file operations. We then need to assign a probability value to each input class, to signify the probability of an input value from that class being selected. The operation profile of a software product can be determined by observing and analysing the usage pattern of the software by a number of users.
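A minimal way to represent an operation profile in code is as a mapping from input classes to probabilities, from which test inputs can then be drawn. The sketch below assumes input classes and probabilities for the graphical editor example; both are illustrative, not measured from real usage.

    # Sketch: representing an operation profile and drawing input classes from it.
    # The classes and probabilities below are assumed for the graphical-editor example.
    import random

    operation_profile = {
        "edit":  0.70,   # p_i: probability that an average user's next request is an edit
        "print": 0.20,
        "file":  0.10,
    }

    assert abs(sum(operation_profile.values()) - 1.0) < 1e-9  # probabilities must sum to 1

    def next_input_class(profile):
        """Select the next input class according to the operation profile."""
        classes = list(profile)
        weights = [profile[c] for c in classes]
        return random.choices(classes, weights=weights, k=1)[0]

    # Draw a small sample of test-input classes that mirrors the usage pattern.
    sample = [next_input_class(operation_profile) for _ in range(10)]
    print(sample)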
11.2.1 Steps in Statistical Testing
The first step is to determine the operation profile of the software. The next step is to generate a set of test data corresponding to the determined operation profile. The third step is to apply the test cases to the software and record the time between each failure. After a statistically significant number of failures have been observed, the reliability can be computed. For accurate results, statistical testing requires some fundamental assumptions to be satisfied: it requires a statistically significant number of test cases to be used, and it further requires that a small percentage of test inputs that are likely to cause system failure be included. Now let us discuss the implications of these assumptions. It is straightforward to generate test cases for the common types of inputs, since one can easily write a test case generator program which can automatically generate these test cases. However, it is also required that a statistically significant percentage of the unlikely inputs be included in the test suite. Creating these unlikely inputs using a test case generator is very difficult.
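Putting the steps together, a statistical testing harness would draw test cases according to the operation profile, run them, log the run time of each observed failure, and finally compute a reliability figure such as MTTF from the recorded inter-failure times. The sketch below shows only the skeleton of such a harness; the operation profile and the functions generate_test_case and run_test are hypothetical stand-ins for a real system.

    # Skeleton of a statistical-testing harness (illustrative only).
    # generate_test_case() and run_test() are hypothetical stand-ins for a real system.
    import random

    operation_profile = {"issue_book": 0.6, "search_book": 0.3, "create_member": 0.1}

    def generate_test_case(profile):
        """Draw an input class according to the operation profile (assumed data)."""
        classes, weights = zip(*profile.items())
        return random.choices(classes, weights=weights, k=1)[0]

    def run_test(test_case):
        """Placeholder: execute the software on the test case and report failure."""
        return random.random() < 0.01   # assume a 1% failure chance per execution

    failure_times = []          # test-case counts (a proxy for run time) at which failures occurred
    for t in range(1, 10_001):  # apply a statistically significant number of test cases
        if run_test(generate_test_case(operation_profile)):
            failure_times.append(t)

    if len(failure_times) > 1:
        gaps = [b - a for a, b in zip(failure_times, failure_times[1:])]
        mttf = sum(gaps) / len(gaps)
        print(f"{len(failure_times)} failures observed; estimated MTTF is about {mttf:.0f} test cases")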
Pros and cons of statistical testing
Statistical testing allows one to concentrate on testing those parts of the system that are most likely to be used. Therefore, it results in a system that the users find to be more reliable (than it actually is!). Also, the reliability estimate arrived at by using statistical testing is more accurate than those obtained by the other methods discussed. However, it is not easy to perform statistical testing satisfactorily, for the following two reasons: there is no simple and repeatable way of defining operation profiles, and the number of test cases with which the system is to be tested must be statistically significant.
11.3 SOFTWARE QUALITY
Traditionally, the quality of a product is defined in terms of its fitness of purpose. That is, a good quality product does exactly what the users want it to do. For software products, fitness of purpose is usually interpreted in terms of satisfaction of the requirements laid down in the SRS document. Although "fitness of purpose" is a satisfactory definition of quality for many products such as a car, a table fan, a grinding machine, etc., "fitness of purpose" is not a wholly satisfactory definition of quality for software products.
For a software product, quality encompasses more than functional correctness; for example, it also matters whether the functionalities of the product can be easily modified, etc.
McCall’s quality factors
McCall distinguishes two levels of quality attributes [McCall]. The higher-level attributes, known as quality factors or external attributes, can only be measured indirectly. The second-level quality attributes are called quality criteria. Quality criteria can be measured directly, either objectively or subjectively. By combining the ratings of several criteria, we can either obtain a rating for the quality factors or the extent to which they are satisfied. For example, reliability cannot be measured directly, but it can be estimated by measuring the number of defects encountered over a period of time. Thus, reliability is a higher-level quality attribute (a quality factor), while the number of defects is a directly measurable, second-level attribute (a quality criterion).
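The idea of obtaining a factor rating by combining criteria ratings can be made concrete with a simple weighted combination. The criteria, scores, and weights below are purely illustrative assumptions; McCall's framework does not prescribe these particular numbers.

    # Sketch: deriving a quality-factor rating from directly measurable criteria.
    # Criteria, scores (0-10) and weights are assumed purely for illustration.

    reliability_criteria = {
        # criterion: (measured score, weight)
        "defects found per KLOC (inverted scale)": (7.0, 0.5),
        "mean time to failure (scaled)":           (6.0, 0.3),
        "recovery from failure (scaled)":          (8.0, 0.2),
    }

    factor_rating = sum(score * weight for score, weight in reliability_criteria.values())
    print(f"Reliability factor rating: {factor_rating:.1f} / 10")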
ISO 9126
ISO 9126 defines a set of hierarchical quality characteristics. Each subcharacteristic in this hierarchy is related to exactly one quality characteristic. This is in contrast to McCall's quality attributes, which are heavily interrelated. Another difference is that the ISO characteristics strictly refer to a software product, whereas McCall's attributes capture process quality issues as well. Users as well as managers tend to be interested in the higher-level quality attributes (quality factors).
11.4 SOFTWARE QUALITY MANAGEMENT SYSTEM
A quality management system (often referred to as a quality system) is
the principal methodology used by organisations to ensure that the products they develop have the desired quality. In the following subsections, we briefly discuss some of the important issues associated with a quality system:
Managerial structure and individual responsibilities
A quality system is the responsibility of the organisation as a whole. However, every organisation typically has a separate quality department to perform several quality system activities. The quality system of an organisation should have the full support of the top management. Without support for the quality system at a high level in a company, few members of staff will take the quality system seriously.
Quality system activities
The quality system activities encompass the following:

- Auditing of projects to check whether the prescribed processes are being followed.
- Collection of process and product metrics, and analysis of these to check whether quality goals are being met.
- Review of the quality system to make it more effective.
- Development of standards, procedures, and guidelines.
- Production of reports for the top management summarising the effectiveness of the quality system in the organisation.

A good quality system must be well documented. Without a properly documented quality system, the application of quality controls and procedures becomes ad hoc, resulting in large variations in the quality of the products delivered. Also, an undocumented quality system sends clear messages to the staff about the attitude of the organisation towards quality assurance. International standards such as ISO 9000 provide guidance on how to organise a quality system.
11.4.1 Evolution of Quality Systems
Quality systems have rapidly evolved over the last six decades. Prior to World War II, the usual method to produce quality products was to inspect the finished products to eliminate defective ones. For example, a company manufacturing nuts and bolts would inspect its finished goods and reject the defective pieces.
Business process re-engineering (BPR) aims at re-engineering the way business is carried out in an organisation, whereas our focus in this text is the re-engineering of the software development process. From the above discussion, we can say that over the last six decades or so, the quality paradigm has shifted from product assurance to process assurance (see Figure 11.3).
11.4.2 Product Metrics versus Process Metrics
All modern quality systems lay emphasis on the collection of certain product and process metrics during product development. Let us first understand the basic difference between product and process metrics. Product metrics help measure the characteristics of the product being developed, whereas process metrics help measure how a process is performing. Examples of product metrics are LOC and function point to measure size, PM (person-months) to measure the effort required to develop the product, months to measure the time required to develop the product, time complexity of the algorithms, etc. Examples of process metrics are review effectiveness, average number of defects found per hour of inspection, average defect correction time, productivity, average number of failures detected during testing per LOC, and number of latent defects per line of code in the developed product.
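The distinction can be seen in how the two kinds of metrics are computed: product metrics are derived from the artefact itself, process metrics from records of how the work was done. The figures below are assumed for illustration, and review effectiveness is computed here in one common way (defects found in reviews divided by total defects found), which the text does not prescribe.

    # Sketch: one product metric and one process metric side by side (assumed data).

    # Product metric: defect density of the delivered code.
    lines_of_code = 40_000
    latent_defects = 40
    defect_density = latent_defects / (lines_of_code / 1000)      # defects per KLOC

    # Process metric: review effectiveness of the development process.
    defects_found_in_reviews = 120
    defects_found_in_total = 160          # reviews + testing + field reports
    review_effectiveness = defects_found_in_reviews / defects_found_in_total

    print(f"Defect density:       {defect_density:.1f} defects/KLOC (product metric)")
    print(f"Review effectiveness: {review_effectiveness:.0%} (process metric)")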
11.5 ISO 9000
The International Standards Organisation (ISO) is a consortium of 63 countries established to formulate and foster standardisation. ISO published its 9000 series of standards in 1987.
11.5.1 What is ISO 9000 Certification?
ISO 9000 certification serves as a reference for a contract between independent parties. In particular, a company awarding a development contract can form its opinion about the likely vendor performance based on whether the vendor has obtained ISO 9000 certification or not. In this context, the ISO 9000 standard specifies the guidelines for maintaining a quality system. We have already seen that the quality system of an organisation applies to all its activities related to its products or services. The ISO standard addresses both operational aspects (that is, the process) and organisational aspects such as responsibilities, reporting, etc. In a nutshell, ISO 9000 specifies a set of recommendations for repeatable and high quality product development. It is important to realise that the ISO 9000 standard is a set of guidelines for the production process and is not directly concerned with the product itself. ISO 9000 is a series of three standards: ISO 9001, ISO 9002, and ISO 9003. The ISO 9000 series of standards is based on the premise that if a proper process is followed for production, then good quality products are bound to follow automatically. The types of software companies to which the different ISO standards apply are as follows:

ISO 9001: This standard applies to organisations engaged in design, development, production, and servicing of goods. This is the standard that is applicable to most software development organisations.

ISO 9002: This standard applies to those organisations which do not design products but are only involved in their production.