1) The extension of the concepts of reliability
When we consider a system which is not only in one of the two
states: up-state (the system is failure-free and thus capable of full
performance) or down-state (the system is totally inoperable and
under repair), but may also perform its function at one or more
levels of reduced efficiency, the conventional concepts of reliability
are found to be unsuitable and inadequate: the reliability of the
system remains unresolved (there exist situations when the system
is neither fully operable nor fully inoperable, so that reliability
cannot be determined at all), or it gets a value which contradicts
empirical observation (if operation with reduced efficiency is
regarded as normal operation, too high reliability is obtained; if a
reduction in efficiency is regarded as total inoperability, too low
reliability is obtained).
Our first objective is to extend the concepts of reliability in order to
make it possible to determine also the reliability of systems with
states of reduced efficiency. In making the extension, care must be
taken that it is done in a theoretically well-founded and empirically
adequate way. Furthermore, there must be no violation of the
traditional concepts. These conditions are fulfilled when we impose
the following requirements on the new concepts:
1. Failures having a limiting effect on the efficiency of the system are
regarded as factors which decrease the reliability of the system, but
to a lesser extent than a failure resulting in total system
inoperability. Further, the degree of
reliability decrease is dependent on the degree of reduction in
efficiency: the more serious the consequences of the failure, the
greater the decrease in reliability of the system.
2. When the new, more comprehensive concepts of reliability are
applied to general systems with many levels of performance, we get
empirical interpretations analogous to those which result when the
conventional concepts of reliability are applied to ordinary two-stage,
operable or inoperable systems.
3. When a two-stage, operable or inoperable system is under
consideration the new concepts are in agreement with the
traditional concepts of reliability.
4. The mathematical definition of the new concepts remains within
the limits of the general mathematical definition of reliability
(Gnedenko et al. 1969).
5. The numerical value of reliability can be determined directly from
the behaviour of the system, i.e. from the state probabilities of the
system.
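
As a rough illustration of requirements 1, 3 and 5, the sketch below computes an efficiency-weighted availability directly from state probabilities. The weighting of each state by a relative efficiency between 0 and 1, the function name generalized_availability and all numbers are assumptions made here for illustration only; the actual characteristics are the ones derived in Chapter 3.

```python
def generalized_availability(state_probs, efficiencies):
    """Efficiency-weighted availability (illustrative assumption only).

    state_probs  -- probability of each system state; must sum to 1
    efficiencies -- relative efficiency of each state,
                    1.0 = full performance, 0.0 = totally inoperable
    """
    if len(state_probs) != len(efficiencies):
        raise ValueError("one efficiency value per state is required")
    if abs(sum(state_probs) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    if any(e < 0.0 or e > 1.0 for e in efficiencies):
        raise ValueError("efficiencies must lie between 0 and 1")
    return sum(p * e for p, e in zip(state_probs, efficiencies))


# Hypothetical three-state system: up, reduced (60 % efficiency), down.
probs = [0.90, 0.07, 0.03]
weighted = generalized_availability(probs, [1.0, 0.6, 0.0])
optimistic = generalized_availability(probs, [1.0, 1.0, 0.0])   # reduced counted as up
pessimistic = generalized_availability(probs, [1.0, 0.0, 0.0])  # reduced counted as down

# Two-state special case: the weighted value is the ordinary availability.
two_state = generalized_availability([0.95, 0.05], [1.0, 0.0])
```

With these made-up numbers the weighted value lies between the two conventional treatments of the reduced state, and for a two-state system it coincides with the conventional availability, which is the behaviour requirements 1 and 3 ask for.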
This conceptual analysis of reliability will be carried out in Chapter 3.
Explicitly we carry out the extension of the concepts only for the
quantitative characteristics of reliability. We give the general
principles of the extension procedure and derive in detail new, more
comprehensive reliability characteristics corresponding to the
characteristics 'availability', 'reliability' and 'mean time to system
failure' of traditional reliability. In the course of the derivation we
show that the new characteristics are theoretically well-founded and
empirically adequate; the five requirements, and more generally, the
objectives laid down for the extension procedure are thereby met.
2) Formulation and use of the reliability model for the reliability
analysis of a stochastic system with states of reduced efficiency
The second main objective in the study is to determine and analyze
the reliability of a stochastic system, which besides the modes of
normal operation and total failure also possesses the property of
operation at several different levels of performance (i.e. with
reduced efficiency). The system has three operation modes: "normal
operation", "operation with reduced efficiency", and "non-
operation". We have chosen it as a general representative of systems
with states of reduced efficiency. We have tried especially to include
in our system the typical main features of a processing factory. In our
system these main features have been described by means of the
following four types of components:
(i) the ordinary two-stage operable or inoperable components; the
failure of any one of the components renders the whole system
inoperable (subsystem S1)
(ii) the functionally multi-stage components, the failure of any one
of which makes the component (and the whole system) operate with
reduced efficiency; the degree of reduction in efficiency depends on
which component has failed (subsystem S2)
(iii) the ordinary two-stage operable or inoperable components in
parallel redundancy; the system fails only when all the redundant
components have failed (subsystem S3)
(iv) the ordinary two-stage operable or inoperable components in the
subsystem formed by independent, parallel branches; the failure of
one or more of the components (branches) makes the system operate
with reduced efficiency, the degree of reduction in efficiency
depending on the number of simultaneous failures among the
components (subsystem S4)
The total inoperability of the subsystem or its operability at a level of
reduced performance is a consequence of one or more failures
among the components of the subsystem. At any time, the
subsystem functioning at the lowest level of performance
determines the performance level of the whole system, the
subsystems being connected in series. Due to the combinations of the
performance levels of the subsystems, the system has a great number
of possible levels of performance, ranging from normal operability
through different degrees of reduced efficiency to total inoperability.
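
The series rule described above can be sketched as follows. This is a minimal illustration, assuming that each subsystem's performance level is expressed as a number between 0 (inoperable) and 1 (normal operation) and that the level of an S4-type subsystem is proportional to the number of working branches; both the scale and the proportionality, as well as the function names, are assumptions of this sketch and are not taken from the study.

```python
def parallel_branch_level(n_branches, n_failed):
    """Level of an S4-type subsystem, assumed proportional to working branches."""
    if not 0 <= n_failed <= n_branches:
        raise ValueError("n_failed must lie between 0 and n_branches")
    return (n_branches - n_failed) / n_branches


def system_level(subsystem_levels):
    """Series connection: the worst subsystem sets the level of the whole system."""
    return min(subsystem_levels)


# Hypothetical snapshot of the four subsystems.
s1 = 1.0                           # all series components operable
s2 = 0.8                           # one multi-stage component failed (made-up level)
s3 = 1.0                           # redundancy not yet exhausted
s4 = parallel_branch_level(4, 1)   # one of four branches down
level = system_level([s1, s2, s3, s4])
```

Because the minimum is taken over all subsystems, a single failure in S1 (level 0) brings the whole system down regardless of the state of the others.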
The system is assumed to be maintained by a single repair facility so
that only one failure can be repaired at a time. Because there may be
several failures among the components at the same time and there
is only one repair facility, the failed components must sometimes
queue for repair. In the handling of this queue we assume that the
preemptive repeat repair discipline is followed. Under this repair
policy, different repair priorities are assigned to different
components and different types of failures, and the repairs are
carried out according to these priorities.
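
A minimal sketch of how a single repair facility might choose its next job under a preemptive repeat rule is given below. The class FailedComponent, the function select_for_repair and the convention that a smaller number means a more urgent repair are hypothetical and serve only to illustrate the discipline; "repeat" is read here as meaning that an interrupted repair has to start again from the beginning.

```python
from dataclasses import dataclass


@dataclass
class FailedComponent:
    name: str
    priority: int                  # smaller number = more urgent (assumed convention)
    elapsed_repair: float = 0.0    # time already spent on the current repair attempt


def select_for_repair(queue, in_service=None):
    """Return the component the single repair facility should work on.

    If a strictly more urgent failure is waiting, the job in service is
    preempted and, under the "repeat" rule, its accumulated repair time
    is discarded.
    """
    # The job in service is listed first so that ties do not cause preemption.
    candidates = ([in_service] if in_service is not None else []) + list(queue)
    if not candidates:
        return None
    chosen = min(candidates, key=lambda c: c.priority)
    if in_service is not None and chosen is not in_service:
        in_service.elapsed_repair = 0.0   # interrupted work is lost
    return chosen


# Hypothetical situation: a low-priority repair is interrupted by an urgent failure.
pump = FailedComponent("S2 pump", priority=2, elapsed_repair=3.5)
motor = FailedComponent("S1 motor", priority=1)
chosen = select_for_repair([motor], in_service=pump)
```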
The components of the system are assumed to fail with constant
failure rates, i.e. the failure times are governed by exponential
distributions. The repair times of the components have general
distributions, i.e. the repair rates of the components are allowed to be
wholly arbitrary functions of time (some regularity conditions must
be met, however). Both failure and repair time distributions are
peculiar to individual components.
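
In symbols, and with notation that is assumed here rather than taken from the study, the two assumptions can be written as follows: component i has a constant failure rate, and its repair rate is the hazard function of an arbitrary repair time distribution.

```latex
% Assumed notation: \lambda_i = constant failure rate of component i,
% G_i and g_i = distribution function and density of its repair time,
% \mu_i(x) = repair rate after an elapsed repair time x.
\begin{align}
  F_i(t)   &= 1 - e^{-\lambda_i t}, \qquad t \ge 0, \\
  \mu_i(x) &= \frac{g_i(x)}{1 - G_i(x)}, \\
  G_i(x)   &= 1 - \exp\!\Bigl(-\int_0^x \mu_i(u)\,du\Bigr).
\end{align}
```

The last relation shows that specifying the repair rate as a function of elapsed repair time is equivalent to specifying the repair time distribution itself.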
We can now point to the following contributions concerning the
structure and properties of the system under study:
1. The system contains a unit (the subsystem with parallel branches)
of a type not considered earlier in the mathematical reliability literature.
2. The system, consisting of four different types of subsystems with a
general number of components in each subsystem, is the largest and
most general theoretical system, the reliability of which has been
analyzed in the dynamic form.
3. The inclusion in the system of a new type of subsystem and the
complexity of the system itself are not only theoretically interesting
but also empirically relevant. For all the subsystems there exist clear
counterparts in reality among production systems, for example and
especially in processing factories.
For the reliability analysis of the system we construct a mathematical
model. The formulation of the model starts with the definition of
the states for the system. Because the state (at time t) is an exact
description of the circumstances prevailing in the system at that
time, the behaviour of the system with the passage of time may be
found by determining the state probabilities of the system. Due to the
general repair time distributions the system is not Markovian.
However, by the inclusion of the supplementary variables we
provide a complete Markovian characterization of the system. After
the inclusion of the supplementary variables we can set up the
model. It takes the form of partial differential-difference equations
with variable coefficients. The solution of the model is derived by
the application of Laplace transforms and discrete transforms. Both
the time-dependent (transient state) and steady state solution are
considered. With general repair time distributions, the transient
state solution of the model stops at the Laplace transforms of the
state probabilities (which, with given repair distributions, we may
invert to give the state probabilities). In the steady state, on the
other hand, the use of the limit properties of Laplace transforms
leads us straight to the state probabilities proper.
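
The limit property presumably meant here is the final value theorem for Laplace transforms. With assumed notation, where P_j(t) is the probability of state j and the bar denotes its Laplace transform, it reads:

```latex
% Assumed notation: P_j(t) = probability of state j at time t,
% \bar{P}_j(s) = \int_0^\infty e^{-st} P_j(t)\,dt.
\begin{equation}
  P_j = \lim_{t \to \infty} P_j(t) = \lim_{s \to 0^{+}} s\,\bar{P}_j(s),
\end{equation}
% valid provided the limit on the left-hand side exists.
```

The steady state probabilities are thus obtained from the transforms without any inversion.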
In the reliability analysis of the system we link the two main
objectives of the study together. The reliability analysis of the system
is carried out within the framework of the new extended reliability
concepts. The characteristics of this extended reliability are now
derived on the basis of the solution of the model, that is, on the
basis of the state probabilities, either directly (the generalized
availability characteristics) or after some modifications in the
original model (the generalized reliability and mean time to system
failure characteristics).
3) Development and wider application of the methods
Multi-component repairable systems with general failure and/or
repair time distributions are always difficult to handle
mathematically. Renewal theory and the Markov process approach
with the inclusion of supplementary variables are examples of the
probability tools, with the help of which the reliability analysis of
this type of complex system has turned out to be successful. We use
the latter approach in this study.
Due to the general repair time distributions in all of the
components, the system is not Markovian. But we can characterize
the system as a Markov system by employing a set of variables, the
supplementary variables, with the help of which a part of the
system's history (the elapsed repair time of the component currently
under repair) is included in the state definition of the system.
The supplementary variable technique proves to be very efficient
also in the complex system under study, in the system of four
subsystems with a general number of components in each. The
dynamic model for the behaviour of the system can be set up. It takes
the form of a set of differential-difference equations with the
corresponding boundary and initial conditions. The equations have variable
coefficients.
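
To show the kind of equations the technique produces, the following worked example treats the simplest possible case, a single repairable unit, and not the four-subsystem model of the study. The notation is assumed: the unit fails at constant rate lambda, P_0(t) is the probability that it is up, and p_1(t,x) is the probability density that it is under repair with elapsed repair time x, where mu(x) is the repair rate.

```latex
% Single repairable unit with the elapsed repair time x as supplementary variable.
\begin{align}
  \frac{dP_0(t)}{dt} &= -\lambda P_0(t) + \int_0^\infty \mu(x)\,p_1(t,x)\,dx, \\
  \frac{\partial p_1(t,x)}{\partial t} + \frac{\partial p_1(t,x)}{\partial x}
      &= -\mu(x)\,p_1(t,x), \\
  p_1(t,0) &= \lambda P_0(t)
      \quad \text{(boundary condition)}, \\
  P_0(0) &= 1, \qquad p_1(0,x) = 0
      \quad \text{(initial conditions)}.
\end{align}
% Steady state: p_1(x) = \lambda P_0 \exp\bigl(-\int_0^x \mu(u)\,du\bigr)
%             = \lambda P_0 \bigl(1 - G(x)\bigr),
% so with m = \int_0^\infty (1 - G(x))\,dx the mean repair time,
\begin{equation}
  P_0 = \frac{1}{1 + \lambda m}.
\end{equation}
```

Even in this small case the repair rate mu(x) enters as a variable coefficient, and the steady state availability depends on the repair time distribution only through its mean.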
As a consequence of the use of supplementary variables, the
equations become partial differential equations in the two time
variables. After the Laplace transforms have been applied, the
equations become algebraic in one variable and remain differential
equations in the other variable. The equations thus become easier to
solve in the Laplace transform domain than in the original time domain. At the same
time the equations are, however, difference equations in two (state
index) variables. The usual technique for solving difference
equations is to employ generating functions (z-transforms). But
because of the variable coefficients in the equations, the transformed
equations would now become partial differential equations also in
the transform variables.
These twice-transformed equations (Laplace transforms and
generating functions) with variable coefficients would then not be
much easier to solve than the original ones.
Because the use of generating functions in order to solve the model
turned out to be troublesome or even impossible, we had to find some
other way. The method of discrete transforms was the tool that led to
the desired result. By using discrete transforms we can transform a
discrete set of numbers (or functions) to another discrete set of
numbers (or functions). Because in the transforms only multiplying
by binomial coefficients and summation are used, the inverse
transforms for the discrete transforms are easy to find (whereas the
derivation of inverse transforms for Laplace transforms and
generating functions may become very problematic). In our model
the Laplace-transformed equations, which are differential equations
in one variable and difference equations in two variables, become
after application of transforms and integration (to get rid of the
derivatives) algebraic equations, even linear in the unknown
functions. There are, of course, no difficulties of principle in
solving such linear equations. This result, that the discrete
transforms lead to ordinary linear equations, has not previously
appeared in the reliability literature. In the earlier applications
of discrete transforms, various ad hoc methods have been used for
solving the transformed equations.
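
The exact form of the discrete transforms is not given in this summary. As an illustration of why transforms built only from binomial coefficients and summation are easy to invert, the sketch below uses the classical binomial transform pair; choosing this particular pair, and the function names, are assumptions of the sketch.

```python
from math import comb


def binomial_transform(a):
    """b[n] = sum over k of C(n, k) * a[k], for k = 0..n."""
    return [sum(comb(n, k) * a[k] for k in range(n + 1))
            for n in range(len(a))]


def inverse_binomial_transform(b):
    """a[n] = sum over k of (-1)**(n - k) * C(n, k) * b[k], for k = 0..n."""
    return [sum((-1) ** (n - k) * comb(n, k) * b[k] for k in range(n + 1))
            for n in range(len(b))]


# Arbitrary test sequence: the inverse recovers it exactly.
a = [3, -1, 4, 1, 5]
b = binomial_transform(a)
recovered = inverse_binomial_transform(b)
```

The inverse has the same structure as the forward transform apart from alternating signs, so no contour integration or series expansion is needed, in contrast to the inversion of Laplace transforms or generating functions.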
(Doctoral thesis. Publications of the Institute for Applied
Mathematics, University of Turku, No. 10, 1977, 113 p.)