Modeling contaminant intrusion in water distribution networks based on D numbers
Abstract
Efficient modeling of uncertain information plays an important role in estimating the risk of contaminant intrusion in water distribution networks. Dempster-Shafer evidence theory is one of the most commonly used methods. However, Dempster-Shafer evidence theory rests on some hypotheses, including the mutual exclusiveness of the elements in the frame of discernment, which may not be consistent with the real world. In this paper, based on a more effective representation of uncertainty called D numbers, a new method that allows the elements in the frame of discernment to be nonexclusive is proposed. To demonstrate the efficiency of the proposed method, we apply it to water distribution networks to estimate the risk of contaminant intrusion.
keywords:
contaminant intrusion, water distribution networks, Dempster-Shafer evidence theory, D numbers, fuzzy numbers, belief function
1 Introduction
Water supply systems are among the most important foundations of human life and development sargaonkar2013model (); nyende2013application (); vairavamoorthy2007modelling (). The performance of water supply systems under varied conditions has received considerable attention preis2008multiobjective (); tamminen2008water (); perelman2009extreme (); islam2013evaluating (); el2004fuzzy (). Water supply systems are usually designed, constructed, operated, and managed in an open environment. As a result, they are inevitably exposed to a variety of uncertain threats and hazards walski2003advanced (); islam2013evaluation (); xin2012hazard (); khanal2006distribution (). Contaminant intrusion in a water distribution network is a complex phenomenon that depends on three elements: a pathway, a driving force and a contamination source lindley2001framework (); lindley2002assessing (); rasekh2014drinking (); shen2011false (); preis2007contamination (). However, the data on these elements are generally incomplete, nonspecific and uncertain sadiq2008predicting ().
Quantitative aggregation of incomplete, uncertain and imprecise information warrants the use of soft computing methods zadeh1984review (); sadiq2006estimating (); jenelius2010critical (). Soft computing methods such as fuzzy set theory zadeh1965fuzzy (); deng2012fuzzy (); zhang2013ifsjsp (); aghaarabi2014compara (); setola2009critical (); oliva2011fuzzy (), rough set theory pawlak2007rudiments (); pawlak2007rough (), and Dempster-Shafer evidence theory dempster1967upper (); shafer1976mathematical (); wei2013identifying () can provide rational solutions for complex real-world problems. The traditional Bayesian (subjectivist) probability approach cannot differentiate between aleatory and epistemic uncertainties and is unable to handle nonspecific, ambiguous and conflicting information without making strong assumptions. These limitations can be addressed by applying Dempster-Shafer evidence theory, which has been found flexible enough to combine the rigor of probability theory with the flexibility of rule-based systems sadiq2006estimating (); deng2014environmental (); huang2013new ().
Because water supply systems must be safe and reliable, risk assessment has been recognised as a useful tool to identify threats, analyse vulnerabilities and risks, and select mitigation measures for water supply systems li2007hierarchical (); weickgenannt2010risk (). Accordingly, an object-oriented approach for water supply systems was proposed in li2007hierarchical (); it is based on aggregative risk assessment (similar to sadiq2004aggregative ()) and fuzzy fault tree analysis, and uses a fuzzy evidential reasoning method to determine the risk levels associated with components, subsystems, and the overall water supply system. An evidential reasoning model based on Dempster-Shafer evidence theory was then applied to estimate the risk of contaminant intrusion in water distribution networks sadiq2006estimating (); deng2011modeling ().
However, previous methods have some shortcomings. In classical Dempster-Shafer evidence theory, a problem domain is described by a frame of discernment, and a basic probability assignment (BPA) is used to represent the uncertain information. Both come with hard hypotheses and constraints. For example, the elements in the frame of discernment must be mutually exclusive and the BPA values must sum to 1, requirements that are often not met in the applications above. These shortcomings have greatly limited the practical application of the theory deng2014environmental (); deng2012d ().
Recently, a new methodology for representing uncertain information, called D numbers deng2014environmental (); deng2012d (); Deng2014DAHPSupplier (); Deng2014BridgeDNs (), has been proposed as an extension of Dempster-Shafer evidence theory. D numbers can effectively represent uncertain information: the elements in the frame of discernment are not required to be mutually exclusive, and the completeness constraint is relaxed. Because the propositions in real-world applications are often not strictly mutually exclusive, these two improvements are greatly beneficial. To fuse uncertain data more accurately, D numbers need to be discounted according to the exclusive degree. In this paper, a new exclusive model based on D numbers is proposed to combine uncertain data at a nonexclusive level.
The rest of the paper is organized as follows. Section 2 introduces some definitions of D numbers. Section 3 illustrates a step-by-step application of the proposed model to a numerical example. In Section 4, the proposed exclusive model based on D numbers is applied to water distribution networks to assess the risk of contaminant intrusion. Conclusions are given in Section 5.
2 D numbers
Situations in the real world are affected by many sources of uncertainty. Many theories have been developed to model various types of uncertainty with some desirable properties. However, these theories still contain deficiencies that cannot be ignored. For example, because of its inherent advantages in representing and handling uncertain information, Dempster-Shafer evidence theory is being studied for use in many fields, such as decision making, pattern recognition tao2012identification (), risk assessment sadiq2006estimating (); deng2011risk (), supplier selection and others. In the mathematical framework of Dempster-Shafer theory, the basic probability assignment (BPA) defined on the frame of discernment is used to express the uncertainty quantitatively. A problem domain described by a finite, nonempty set $\Theta$ of mutually exclusive elements is called a frame of discernment. Let $2^{\Theta}$ denote the power set of $\Theta$. The elements of $2^{\Theta}$ are called propositions. The BPA is a mapping $m$ from $2^{\Theta}$ to $[0, 1]$ that satisfies the following conditions dempster1967upper (); shafer1976mathematical ():
$$m(\emptyset) = 0, \qquad \sum_{A \in 2^{\Theta}} m(A) = 1 \tag{1}$$
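As a minimal sketch, the two standard BPA conditions (zero mass on the empty set and total mass equal to 1) can be checked mechanically. The frame `frame`, the masses in `m`, and the helper name `is_valid_bpa` are hypothetical and chosen only for illustration.

```python
def is_valid_bpa(m, frame, tol=1e-9):
    """Check the standard BPA conditions on a dict {frozenset: mass}."""
    frame = frozenset(frame)
    for focal, mass in m.items():
        if not focal <= frame:           # every focal set must lie inside the frame
            return False
        if not 0.0 <= mass <= 1.0:       # masses are values in [0, 1]
            return False
    if m.get(frozenset(), 0.0) != 0.0:   # no mass on the empty set
        return False
    return abs(sum(m.values()) - 1.0) <= tol  # masses sum to 1


# Hypothetical frame with three mutually exclusive elements.
frame = {"a1", "a2", "a3"}
m = {
    frozenset({"a1"}): 0.5,
    frozenset({"a2"}): 0.3,
    frozenset(frame): 0.2,               # mass left on the whole frame
}
print(is_valid_bpa(m, frame))
```

Note that the code cannot verify mutual exclusiveness of the elements; that remains a modeling assumption about the frame itself.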
BPA has the advantage of directly expressing 'uncertainty' by assigning basic probability to subsets of the set of individual objects, rather than to each individual object. However, there are some strong hypotheses and hard constraints on the frame of discernment and the BPA, which limit the practical application of Dempster-Shafer evidence theory. One of the hypotheses is that the elements in the frame of discernment must be mutually exclusive. This hypothesis is difficult to satisfy in many situations. For example, the linguistic assessments shown in Fig. 1 can be "Low", "Fairly low", "Medium", "Fairly high", "High". Because these assessments are based on human judgment, they inevitably overlap. The exclusiveness of these propositions cannot be guaranteed, so the application of Dempster-Shafer evidence theory is questionable and limited in this situation. That is, it is not correct to define a BPA over these linguistic assessments as if they formed a frame of discernment.
D numbers deng2012d () are a new representation of uncertain information and an extension of Dempster-Shafer evidence theory. They overcome the existing deficiencies of Dempster-Shafer evidence theory and appear to be more effective in representing various types of uncertainty. D numbers are defined as follows.
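Let $\Omega$ be a finite nonempty set. A D number is a mapping formulated by
$$D: 2^{\Omega} \rightarrow [0, 1] \tag{2}$$
with
$$\sum_{B \subseteq \Omega} D(B) \le 1 \quad \text{and} \quad D(\emptyset) = 0 \tag{3}$$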
The definition of D numbers looks similar to that of a BPA, but there are two differences. First, unlike the frame of discernment in Dempster-Shafer evidence theory, the elements of the set on which D numbers are defined are not required to be mutually exclusive. Second, the completeness constraint is relaxed in D numbers. If $\sum_{B \subseteq \Omega} D(B) = 1$, the information is said to be complete; if $\sum_{B \subseteq \Omega} D(B) < 1$, the information is said to be incomplete. An illustrative example of D numbers is given below.
Example 1. Suppose a project is assessed and the assessment score is represented by the interval [0, 100]. In the framework of Dempster-Shafer evidence theory, an expert could give a BPA to express his assessment result:
where , , . The set of is a frame of discernment in DempsterShafer evidence theory.
However, if another expert gives his assessment result by using D numbers, it could be:
where , , . Note that the probability assignment of the set of is not a BPA, because the elements in the set are not mutually exclusive. Due to , the information is incomplete. This example has shown the differences between BPA and D numbers.
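The distinction between complete and incomplete information can be sketched in a few lines of Python. The element names `b1`, `b2`, `b3` and all masses below are hypothetical stand-ins for overlapping score intervals; they are not the values of Example 1.

```python
def d_number_status(d, omega, tol=1e-9):
    """Classify a D number {frozenset: mass} as 'complete' or 'incomplete'.

    Raises ValueError if the basic constraints are violated. The elements of
    omega may overlap semantically; no exclusiveness check is performed.
    """
    omega = frozenset(omega)
    total = 0.0
    for focal, mass in d.items():
        if not focal or not focal <= omega:
            raise ValueError("focal sets must be nonempty subsets of omega")
        if not 0.0 <= mass <= 1.0:
            raise ValueError("masses must lie in [0, 1]")
        total += mass
    if total > 1.0 + tol:
        raise ValueError("masses must sum to at most 1")
    return "complete" if abs(total - 1.0) <= tol else "incomplete"


omega = {"b1", "b2", "b3"}  # hypothetical, possibly overlapping score ranges
d_partial = {frozenset({"b1"}): 0.3, frozenset({"b2"}): 0.5, frozenset(omega): 0.1}
d_full = {frozenset({"b1"}): 0.4, frozenset({"b2"}): 0.6}
print(d_number_status(d_partial, omega), d_number_status(d_full, omega))
```

The only structural difference from a BPA check is that a total mass below 1 is accepted and reported as incomplete rather than rejected.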
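If a problem domain is $\Omega = \{b_1, b_2, \ldots, b_n\}$, where $b_i \in \mathbb{R}$ and $b_i \neq b_j$ if $i \neq j$, a special form of D numbers can be expressed by:
$$D(\{b_1\}) = v_1, \quad D(\{b_2\}) = v_2, \quad \ldots, \quad D(\{b_n\}) = v_n$$
or simply denoted as $D = \{(b_1, v_1), (b_2, v_2), \ldots, (b_n, v_n)\}$, where $v_i > 0$ and $\sum_{i=1}^{n} v_i \le 1$. Some properties of D numbers are introduced as follows.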
Relative matrix. Linguistic constants expressed as normal triangular fuzzy numbers are illustrated in Fig. 2. The areas of the intersection and of the union of any two triangular fuzzy numbers can be calculated to represent the nonexclusive degree between two D numbers; Fig. 2 shows an example of such an intersection and union. The nonexclusive degree can be calculated as follows:
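$$u_{ij} = \frac{S^{\cap}_{ij}}{S^{\cup}_{ij}} \tag{4}$$
where $S^{\cap}_{ij}$ and $S^{\cup}_{ij}$ denote the areas of the intersection and of the union of the $i$-th and $j$-th fuzzy numbers, respectively.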
It should be emphasized that the way the nonexclusive degree is determined depends on the application. Given the characteristics of fuzzy numbers, we choose the ratio of the areas of the intersection and of the union of two fuzzy numbers. A relative matrix for these elements, based on the nonexclusive degree, can be built as follows:
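$$R = \begin{bmatrix} 1 & u_{12} & \cdots & u_{1n} \\ u_{21} & 1 & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ u_{n1} & u_{n2} & \cdots & 1 \end{bmatrix} \tag{5}$$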
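A minimal numerical sketch of the relative matrix follows, assuming the nonexclusive degree is the area of the pointwise minimum of two membership functions divided by the area of their pointwise maximum. The three triangular fuzzy numbers are hypothetical, and the areas are approximated with a midpoint rule rather than computed in closed form.

```python
def triangular(a, b, c):
    """Membership function of a normal triangular fuzzy number (a, b, c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def nonexclusive_degree(mu_i, mu_j, lo=0.0, hi=1.0, steps=20000):
    """Approximate area(min) / area(max) on [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    inter = union = 0.0
    for s in range(steps):
        x = lo + (s + 0.5) * h
        yi, yj = mu_i(x), mu_j(x)
        inter += min(yi, yj) * h
        union += max(yi, yj) * h
    return inter / union if union > 0 else 0.0


def relative_matrix(mus):
    """Symmetric matrix of pairwise nonexclusive degrees, with 1 on the diagonal."""
    n = len(mus)
    return [[1.0 if i == j else nonexclusive_degree(mus[i], mus[j])
             for j in range(n)] for i in range(n)]


# Hypothetical linguistic constants on [0, 1].
terms = [triangular(0.0, 0.2, 0.4), triangular(0.2, 0.4, 0.6), triangular(0.5, 0.7, 0.9)]
R = relative_matrix(terms)
for row in R:
    print([round(v, 3) for v in row])
```

For the first two hypothetical terms the overlap is a triangle of area 0.05 and the union has area 0.35, so the corresponding entry is close to 1/7; the first and third terms do not overlap, giving 0.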
Exclusive coefficient. The exclusive coefficient $\varepsilon$ characterizes the exclusive degree of the propositions in an assessment situation. It is obtained by averaging the nonexclusive degrees of these elements over the upper triangle of the relative matrix, namely:
$$\varepsilon = \frac{\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} u_{ij}}{n(n-1)/2} \tag{6}$$
where $n$ is the number of propositions in the assessment situation. The smaller $\varepsilon$ is, the more exclusive the propositions of the application are. When $\varepsilon = 0$, the propositions of the application are completely mutually exclusive; that is, the situation meets the requirements of Dempster-Shafer evidence theory.
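The averaging step can be sketched as follows, assuming the exclusive coefficient is the mean of the n(n-1)/2 entries above the diagonal of the relative matrix. The 3 x 3 matrix is hypothetical.

```python
def exclusive_coefficient(R):
    """Average of the strictly upper-triangular entries of a square matrix R."""
    n = len(R)
    if n < 2:
        raise ValueError("at least two propositions are required")
    upper = [R[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(upper) / (n * (n - 1) / 2)


# Hypothetical relative matrix for three propositions.
R_example = [
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
print(exclusive_coefficient(R_example))
```

An identity matrix, which corresponds to fully exclusive propositions, yields a coefficient of 0.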
The combination rule of D numbers. First, the given D numbers are discounted by the exclusive coefficient $\varepsilon$, so that the elements in the frame of discernment can be treated as exclusive. The D numbers are discounted as follows:
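$$D^{\varepsilon}(A) = (1 - \varepsilon)\, D(A) \ \text{ for } A \neq \Omega, \qquad D^{\varepsilon}(\Omega) = (1 - \varepsilon)\, D(\Omega) + \varepsilon \tag{7}$$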
where $A$ is an element of $2^{\Omega}$.
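A minimal sketch of the discounting step, assuming the classical discounting form in which every mass is scaled by one minus the exclusive coefficient and the removed mass is placed on the whole set. The set `omega`, the masses, and the coefficient value are hypothetical.

```python
def discount(d, omega, eps):
    """Scale every mass by (1 - eps) and add eps to the mass of the whole set."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    omega = frozenset(omega)
    out = {focal: (1.0 - eps) * mass for focal, mass in d.items()}
    out[omega] = out.get(omega, 0.0) + eps
    return out


omega = frozenset({"x", "y", "z"})        # hypothetical problem domain
d = {frozenset({"x"}): 0.6, frozenset({"y"}): 0.3, omega: 0.1}
d_eps = discount(d, omega, eps=0.1)
for focal, mass in d_eps.items():
    print(sorted(focal), round(mass, 3))
```

With a coefficient of 0 the D number is returned unchanged, and a complete D number stays complete after discounting.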
Then the combination rule of D numbers based on the exclusive coefficient is illustrated as follows.
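$$D(A) = (D_1 \oplus D_2)(A) = \frac{1}{1 - k} \sum_{B \cap C = A} D_1^{\varepsilon}(B)\, D_2^{\varepsilon}(C), \qquad A \neq \emptyset \tag{8}$$
with
$$k = \sum_{B \cap C = \emptyset} D_1^{\varepsilon}(B)\, D_2^{\varepsilon}(C) \tag{9}$$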
where $k$ is a normalization constant, called the conflict because it measures the degree of conflict between $D_1$ and $D_2$.
Note that if $\varepsilon = 0$, i.e., the elements in the frame of discernment are completely mutually exclusive, the D numbers are not discounted by the exclusive coefficient. In other words, in the mutually exclusive case the combination of D numbers coincides with that of Dempster-Shafer evidence theory.
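The combination step can be sketched as a Dempster-style product of two already-discounted D numbers, with the conflicting mass removed by normalization. This is an illustrative sketch under that assumption; the two-element frame and the masses are hypothetical.

```python
def combine(d1, d2):
    """Combine two discounted D numbers; return (fused D number, conflict k)."""
    joint = {}
    k = 0.0
    for b, mass_b in d1.items():
        for c, mass_c in d2.items():
            a = b & c                       # intersection of the two focal sets
            if a:
                joint[a] = joint.get(a, 0.0) + mass_b * mass_c
            else:
                k += mass_b * mass_c        # mass falling on the empty set
    if k >= 1.0:
        raise ValueError("total conflict: the D numbers cannot be combined")
    return {a: v / (1.0 - k) for a, v in joint.items()}, k


X, Y = frozenset({"x"}), frozenset({"y"})   # hypothetical propositions
XY = X | Y
d1 = {X: 0.6, XY: 0.4}
d2 = {Y: 0.5, XY: 0.5}
fused, k = combine(d1, d2)
print(round(k, 3), {tuple(sorted(a)): round(v, 3) for a, v in fused.items()})
```

Here the conflict is 0.6 x 0.5 = 0.3, and the remaining products are rescaled by 1/0.7.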
3 Proposed model based on D numbers
In this section, the proposed model based on D numbers is illustrated with a numerical example. In the model, the exclusive coefficient represents the exclusive degree among the propositions in the frame of discernment. As noted in Section 2, one of the advantages of D numbers is that the elements in the frame of discernment are not required to be mutually exclusive. Since propositions in real-world applications cannot be completely mutually exclusive, defining an exclusive coefficient to quantify the nonexclusive property is essential. For example, linguistic assessments based on human judgment can be "fairly good", "good" and "very good", which are obviously nonexclusive but still need to be assessed. The proposed method is illustrated below through a step-by-step numerical example.
Step 1 Construct the linguistic constants deng2011risk (), expressed as positive trapezoidal fuzzy numbers. The details of the linguistic constants presented in Fig. 1 are shown in Table 1.
Variable linguistic constants  Fuzzy numbers 

Low  (0.04,0.1,0.18,0.23) 
Fairly low  (0.17,0.22,0.36,0.42) 
Medium  (0.32,0.41,0.58,0.65) 
Fairly high  (0.58,0.63,0.80,0.86) 
High  (0.72,0.78,0.92,0.97) 
Step 2 Calculate the areas of the intersection and of the union of every pair of trapezoidal fuzzy numbers. For example, some of the fuzzy numbers in Fig. 1 intersect.
Step 3 Calculate the nonexclusive degree between two fuzzy numbers according to Eq. 4. For example, the nonexclusive degree between "low" and "fairly low" can be calculated as follows:
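As a worked sketch, assuming the nonexclusive degree is the ratio of the intersection area to the union area, the value for "Low" $(0.04, 0.10, 0.18, 0.23)$ and "Fairly low" $(0.17, 0.22, 0.36, 0.42)$ from Table 1 can be derived by elementary geometry. The falling edge of "Low" and the rising edge of "Fairly low" cross at $x = 0.20$ with membership $0.6$, so the intersection is a triangle with base $[0.17, 0.23]$ and height $0.6$. The symbols $S_{L}$, $S_{FL}$, $S^{\cap}$ and $S^{\cup}$ are notation introduced here for the areas.

```latex
\begin{aligned}
S_{L}    &= \tfrac{1}{2}\big[(0.23 - 0.04) + (0.18 - 0.10)\big] = 0.135, \\
S_{FL}   &= \tfrac{1}{2}\big[(0.42 - 0.17) + (0.36 - 0.22)\big] = 0.195, \\
S^{\cap} &= \tfrac{1}{2}\,(0.23 - 0.17) \times 0.6 = 0.018, \\
S^{\cup} &= S_{L} + S_{FL} - S^{\cap} = 0.312, \\
u        &= \frac{S^{\cap}}{S^{\cup}} = \frac{0.018}{0.312} \approx 0.058 .
\end{aligned}
```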
Step 4 Build a relative matrix following the rule defined in Section 2. The relative matrix of the current example is:
(10) 
Step 5 Calculate the exclusive coefficient through Eq. 6. That is:
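The following sketch recomputes the relative matrix and the exclusive coefficient from the trapezoidal fuzzy numbers of Table 1. It assumes the nonexclusive degree is the intersection area divided by the union area and that the coefficient is the mean of the upper-triangular entries. Areas are exact here because both membership functions are linear between consecutive breakpoints, with one extra node added wherever they cross. The function names are illustrative.

```python
def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy number (a, b, c, d)."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        if x <= c:
            return 1.0
        return (d - x) / (d - c)
    return mu


def areas(p, q):
    """Exact areas of the intersection (min) and union (max) of two trapezoids."""
    f, g = trapezoid(*p), trapezoid(*q)
    xs = sorted(set(p) | set(q))             # both functions are linear between nodes
    inter = union = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        pts = [(x0, f(x0), g(x0)), (x1, f(x1), g(x1))]
        d0, d1 = pts[0][1] - pts[0][2], pts[1][1] - pts[1][2]
        if d0 * d1 < 0:                      # the two edges cross inside the interval
            t = d0 / (d0 - d1)
            yc = pts[0][1] + t * (pts[1][1] - pts[0][1])
            pts.insert(1, (x0 + t * (x1 - x0), yc, yc))
        for (xa, fa, ga), (xb, fb, gb) in zip(pts, pts[1:]):
            inter += 0.5 * (xb - xa) * (min(fa, ga) + min(fb, gb))
            union += 0.5 * (xb - xa) * (max(fa, ga) + max(fb, gb))
    return inter, union


TABLE_1 = {
    "Low": (0.04, 0.10, 0.18, 0.23),
    "Fairly low": (0.17, 0.22, 0.36, 0.42),
    "Medium": (0.32, 0.41, 0.58, 0.65),
    "Fairly high": (0.58, 0.63, 0.80, 0.86),
    "High": (0.72, 0.78, 0.92, 0.97),
}
names = list(TABLE_1)
n = len(names)
R = [[1.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        s_inter, s_union = areas(TABLE_1[names[i]], TABLE_1[names[j]])
        R[i][j] = R[j][i] = s_inter / s_union

eps = sum(R[i][j] for i in range(n) for j in range(i + 1, n)) / (n * (n - 1) / 2)
print(round(eps, 3))  # prints 0.042
```

Under these assumptions only the four adjacent pairs overlap, and the resulting coefficient of about 0.042 agrees with the mass that appears on the last row of Table 2 after discounting.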
Step 6 Discount the D numbers according to Eq. 7. Two D numbers $D_1$ and $D_2$ based on the fuzzy numbers in Table 1 are shown in Table 2. These two D numbers are then discounted using the exclusive coefficient $\varepsilon$; the results are also given in Table 2.
$D_1$  $D_2$  $D_1^{\varepsilon}$  $D_2^{\varepsilon}$

0.12  0.1  0.115  0.096
0.7  0.06  0.671  0.057
0.02  0.6  0.019  0.575
0.1  0.2  0.096  0.192
0.06  0.04  0.057  0.038
0  0  0.042  0.042
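The discounted columns can be checked by hand. The arithmetic below assumes that each mass is multiplied by $1 - \varepsilon$ with $\varepsilon \approx 0.042$ and that the removed mass $\varepsilon$ is added to the last row, which initially carries no mass; the subscripted sums are shorthand for the discounted column totals.

```latex
\begin{aligned}
0.7 \times (1 - 0.042) &= 0.6706 \approx 0.671, \\
0.6 \times (1 - 0.042) &= 0.5748 \approx 0.575, \\
0 \times (1 - 0.042) + 0.042 &= 0.042, \\
\textstyle\sum D_1^{\varepsilon} &= 0.115 + 0.671 + 0.019 + 0.096 + 0.057 + 0.042 = 1.000 .
\end{aligned}
```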
Step 7 Use the combination rule of D numbers given in Section 2 to get the final assessment. Namely:
4 Estimating risk of contaminant intrusion
Contaminant intrusion in a water distribution network is a complex phenomenon that depends on three elements: a pathway, a driving force and a contamination source lindley2001framework (); lindley2002assessing (). However, the data on these elements are generally incomplete, nonspecific and uncertain. In earlier studies, an evidential reasoning model was used to estimate the risk of contaminant intrusion in water distribution networks based on these three elements sadiq2006estimating (); deng2011modeling (). This section presents another method, based on D numbers, to assess the risk of contaminant intrusion in distribution networks.
In previous work sadiq2006estimating (); deng2011modeling (), the problem domain of the risk of an intrusion is described by a universal set $\Theta = \{P, NP\}$, in which 'P' denotes that an intrusion is 'possible' and 'NP' that it is 'not possible'. The power set of the risk of intrusion consists of two singletons $\{P\}$ and $\{NP\}$, the universal set $\{P, NP\}$ and the empty set $\emptyset$. As described earlier, the risk of contaminant intrusion can be evaluated based on three elements: a pathway ($e_1$), a driving force ($e_2$), and a contamination source ($e_3$).
We select surrogate measures to simplify the intrusion problem. The breakage rate (# of breaks/100 km/year) is taken as a surrogate measure for an "intrusion pathway", the transient pressure (psi) as a surrogate for a "driving force", and the separation distance (meters) between a contaminant source and a water main as a surrogate measure for the "source of contamination". These measures are mapped onto the frame of discernment to obtain D numbers, one for each body of evidence, each of which can assign mass to the subsets $\{P\}$, $\{P, NP\}$ and $\{NP\}$.
For this problem, we propose a new contaminant intrusion model based on D numbers, which makes use of the exclusive coefficient to represent the exclusive degree among the propositions of the water distribution network. A step-by-step description is provided below:
Step 1 Construct the description of the propositions and collect data. For example, in Fig. 3, the description of each proposition is represented by trapezoidal fuzzy numbers. The collected data are real numbers. It should be emphasized here that triangular fuzzy numbers and real numbers are special cases of generalized trapezoidal fuzzy numbers. This step is the same as in sadiq2006estimating ().
Step 2 Identify the D numbers of the three bodies of evidence. We first choose a scenario from Fig. 3, in which the following bodies of evidence are observed:
Step 3 Follow the steps in Section 3 to calculate the exclusive coefficient of each body of evidence. The results are shown as follows.
Evidence  Value  $\varepsilon$  $D(\{P\})$  $D(\{NP\})$  $D(\{P, NP\})$

Pathway, e1 (# bks/100 km/year)  10  0.1195  0  0.869  0.131
Pressure, e2 (psi)  0  0.1057  0  0  1
Contaminant source, e3 (m)  3  0.131  0.869  0  0.131
Step 5 Use the combination rule of D numbers (Eq. 8) to obtain integrated D numbers of risk of contaminant intrusion at a given location in a distribution network. For example, the integrated D numbers of this scenario for risk assessment is:
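The fusion step can be sketched for the first scenario (breakage rate 10, pressure 0, separation distance 3). The exclusive coefficients 0.1195, 0.1057 and 0.131 are the values listed for the three bodies of evidence; the undiscounted D numbers are assumptions made purely for illustration: the pathway evidence fully supports NP, the pressure evidence is uninformative, and the source evidence fully supports P. Discounting and combination follow the classical discounting and Dempster-style forms.

```python
P, NP = frozenset({"P"}), frozenset({"NP"})
THETA = P | NP


def discount(d, eps):
    """Scale masses by (1 - eps) and move eps onto the whole frame {P, NP}."""
    out = {a: (1.0 - eps) * v for a, v in d.items()}
    out[THETA] = out.get(THETA, 0.0) + eps
    return out


def combine(d1, d2):
    """Dempster-style combination of two discounted D numbers."""
    joint, k = {}, 0.0
    for b, vb in d1.items():
        for c, vc in d2.items():
            a = b & c
            if a:
                joint[a] = joint.get(a, 0.0) + vb * vc
            else:
                k += vb * vc                 # conflicting mass
    return {a: v / (1.0 - k) for a, v in joint.items()}


# Assumed undiscounted evidence (hypothetical), discounted with the listed coefficients.
pathway = discount({NP: 1.0}, 0.1195)
pressure = discount({THETA: 1.0}, 0.1057)    # a vacuous D number stays vacuous
source = discount({P: 1.0}, 0.131)

fused = combine(combine(pathway, pressure), source)
print({"P": round(fused[P], 2), "P,NP": round(fused[THETA], 2), "NP": round(fused[NP], 2)})
```

Under these assumptions the fused masses are roughly 0.44 for P, 0.07 for {P, NP} and 0.49 for NP, which is close to the triple reported for scenario 1 of the proposed method if that triple is read in the order (P, {P, NP}, NP).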
To illustrate the efficiency of the proposed method, we choose five additional scenarios, as in sadiq2006estimating (). The results obtained with both methods are given in Table 4.
Scenario  e1  e2  e3  Method  Risk  

1  10  0  3  Sadiq et al. (2006)  
Proposed  (0.44, 0.07, 0.49)  
2  10  0  20  Sadiq et al. (2006)  
Proposed  (0, 0.1195, 0.8805)  
3  10  50  20  Sadiq et al. (2006)  
Proposed  (0, 0.013, 0.987)  
4  30  50  20  Sadiq et al. (2006)  
Proposed  (0,0.11,0.89)  
5  30  50  3  Sadiq et al. (2006)  
Proposed  (0.41,0.06,0.53)  
6  30  20  3  Sadiq et al. (2006)  
Proposed  (0.98,0.02,0) 

e1: Breaks (# bks/100 km/year)
e2: Pressure (psi)
e3: Separation distance (m)
The proposed model makes its assessment based on D numbers discounted by the exclusive coefficient, which reflects the exclusive degree of the propositions. The method in sadiq2006estimating (), by contrast, uses Dempster-Shafer evidence theory under the hypothesis that the propositions are mutually exclusive, which is impractical in real-world applications. The results show that the incompleteness of the result obtained on the basis of the exclusive coefficient is much smaller than that of sadiq2006estimating (). The proposed method not only allows the propositions of an application to be nonexclusive, but also gives a more effective assessment.
5 Conclusion
One of the assumptions required to apply Dempster-Shafer evidence theory is that all the elements in the frame of discernment are mutually exclusive. However, this requirement is difficult to meet in real-world applications. In this paper, a new mathematical tool for modeling uncertain information, called D numbers, is used to model and combine domain experts' opinions under the condition that the linguistic constants are not mutually exclusive. An exclusive coefficient is proposed to discount the D numbers. Once the discounted D numbers are obtained, the domain experts' opinions can be fused using the proposed combination rule of D numbers. The application to estimating the risk of contaminant intrusion in a water distribution network illustrates the efficiency of the proposed D numbers method.
Acknowledgements
The work is partially supported by National Natural Science Foundation of China (Grant No. 61174022), R&D Program of China (2012BAH07B01), National High Technology Research and Development Program of China (863 Program) (Grant Nos. 2013AA013801 and 2012AA041101), Chongqing Natural Science Foundation (for Distinguished Young Scholars) (Grant No. CSCT,2010BA2003), Doctor Funding of Southwest University (Grant No. SWU110021), China State Key Laboratory of Virtual Reality Technology and Systems.
References
 (1) A. Sargaonkar, S. Kamble, R. Rao, Model study for rehabilitation planning of water supply network, Computers, Environment and Urban Systems 39 (2013) 172–181.
 (2) S. Nyende-Byakika, G. Ngirane-Katashaya, J. M. Ndambuki, Application of hydraulic modelling to control intrusion into potable water pipelines, Urban Water Journal 10 (3) (2013) 216–219.
 (3) K. Vairavamoorthy, J. Yan, S. Gorantiwar, Modelling the risk of contaminant intrusion in water mains, Proceedings of the ICE - Water Management 160 (2) (2007) 123–132.
 (4) A. Preis, A. Ostfeld, Multiobjective contaminant response modeling for water distribution systems security, Journal of Hydroinformatics 10 (4) (2008) 267–274.
 (5) S. Tamminen, H. Ramos, D. Covas, Water supply system performance for different pipe materials part I: water quality analysis, Water resources management 22 (11) (2008) 1579–1607.
 (6) L. Perelman, A. Ostfeld, Extreme impact contamination events sampling for water distribution systems security, Journal of Water Resources Planning and Management 136 (1) (2009) 80–87.
 (7) M. S. Islam, R. Sadiq, M. J. Rodriguez, H. Najjaran, A. Francisque, M. Hoorfar, Evaluating water quality failure potential in water distribution systems: a fuzzy-TOPSIS-OWA-based methodology, Water resources management 27 (7) (2013) 2195–2216.
 (8) I. El-Baroudy, S. P. Simonovic, Fuzzy criteria for the evaluation of water resource systems performance, Water resources research 40 (10).
 (9) T. M. Walski, D. V. Chase, D. A. Savic, W. M. Grayman, S. Beckwith, E. Koelle, et al., Advanced water distribution modeling and management, Haestad press, 2003.
 (10) N. Islam, R. Sadiq, M. J. Rodriguez, A. Francisque, Evaluation of source water protection strategies: A fuzzy-based model, Journal of environmental management 121 (2013) 191–201.
 (11) K. Xin, T. Tao, Y. Wang, S. Liu, Hazard and vulnerability evaluation of water distribution system in cases of contamination intrusion accidents, Frontiers of Environmental Science & Engineering 6 (6) (2012) 839–848.
 (12) N. Khanal, S. G. Buchberger, S. A. McKenna, Distribution system contamination events: exposure, influence, and sensitivity, Journal of water resources planning and management 132 (4) (2006) 283–292.
 (13) T. R. Lindley, A framework to protect water distribution systems against potential intrusions, Ph.D. thesis, University of Cincinnati (2001).
 (14) T. R. Lindley, S. G. Buchberger, et al., Assessing intrusion susceptibility in distribution systems, Journal - American Water Works Association 94 (6) (2002) 66–79.
 (15) A. Rasekh, K. Brumbelow, Drinking water distribution systems contamination management to reduce public health impacts and system service interruptions, Environmental Modelling & Software 51 (2014) 12–25.
 (16) H. Shen, E. McBean, False negative/positive issues in contaminant source identification for water-distribution systems, Journal of Water Resources Planning and Management 138 (3) (2011) 230–236.
 (17) A. Preis, A. Ostfeld, A contamination source identification model for water distribution system security, Engineering optimization 39 (8) (2007) 941–947.
 (18) R. Sadiq, E. Saint-Martin, Y. Kleiner, Predicting risk of water quality failures in distribution networks under uncertainties using fault-tree analysis, Urban Water Journal 5 (4) (2008) 287–304.
 (19) L. A. Zadeh, Review of a mathematical theory of evidence, AI magazine 5 (3) (1984) 81.
 (20) R. Sadiq, Y. Kleiner, B. Rajani, Estimating risk of contaminant intrusion in water distribution networks using dempster–shafer theory of evidence, Civil Engineering and Environmental Systems 23 (3) (2006) 129–141.
 (21) E. Jenelius, J. Westin, Å. J. Holmgren, Critical infrastructure protection under imperfect attacker perception, International Journal of Critical Infrastructure Protection 3 (1) (2010) 16–26.
 (22) L. A. Zadeh, Fuzzy sets, Information and control 8 (3) (1965) 338–353.
 (23) Y. Deng, Y. Chen, Y. Zhang, S. Mahadevan, Fuzzy dijkstra algorithm for shortest path problem under uncertain environment, Applied Soft Computing 12 (3) (2012) 1231–1237.
 (24) X. Zhang, Y. Deng, F. T. Chan, P. Xu, S. Mahadevan, Y. Hu, IFSJSP: A novel methodology for the job-shop scheduling problem based on intuitionistic fuzzy sets, International Journal of Production Research 51 (17) (2013) 5100–5119.
 (25) E. Aghaarabi, F. Aminravan, R. Sadiq, M. Hoorfar, M. Rodriguez, H. Najjaran, Comparative study of fuzzy evidential reasoning and fuzzy rule-based approaches: an illustration for water quality assessment in distribution networks, Stochastic Environmental Research and Risk Assessment 28 (3) (2014) 655–679.
 (26) R. Setola, S. De Porcellinis, M. Sforna, Critical infrastructure dependency assessment using the input–output inoperability model, International Journal of Critical Infrastructure Protection 2 (4) (2009) 170–178.
 (27) G. Oliva, S. Panzieri, R. Setola, Fuzzy dynamic input–output inoperability model, International Journal of Critical Infrastructure Protection 4 (3) (2011) 165–175.
 (28) Z. Pawlak, A. Skowron, Rudiments of rough sets, Information sciences 177 (1) (2007) 3–27.
 (29) Z. Pawlak, A. Skowron, Rough sets: some extensions, Information sciences 177 (1) (2007) 28–40.
 (30) A. P. Dempster, Upper and lower probabilities induced by a multivalued mapping, The annals of mathematical statistics (1967) 325–339.
 (31) G. Shafer, A mathematical theory of evidence, Vol. 1, Princeton university press Princeton, 1976.
 (32) D. Wei, X. Deng, X. Zhang, Y. Deng, S. Mahadevan, Identifying influential nodes in weighted networks based on evidence theory, Physica A: Statistical Mechanics and its Applications 392 (10) (2013) 2564–2575.
 (33) X. Deng, Y. Hu, Y. Deng, S. Mahadevan, Environmental impact assessment based on D numbers, Expert Systems with Applications 41 (2) (2014) 635–643.
 (34) S. Huang, X. Su, Y. Hu, S. Mahadevan, Y. Deng, A new decision-making method by incomplete preferences based on evidence distance, Knowledge-Based Systems 56 (2014) 264–272.
 (35) H. Li, Hierarchical risk assessment of water supply systems, Ph.D. thesis (2007).
 (36) M. Weickgenannt, Z. Kapelan, M. Blokker, D. A. Savic, Risk-based sensor placement for contaminant detection in water distribution systems, Journal of Water Resources Planning and Management 136 (6) (2010) 629–636.
 (37) R. Sadiq, Y. Kleiner, B. Rajani, Aggregative risk analysis for water quality failure in distribution networks, Journal of Water Supply: Research & Technology-AQUA 53 (4) (2004) 241–261.
 (38) Y. Deng, W. Jiang, R. Sadiq, Modeling contaminant intrusion in water distribution networks: A new similarity-based DST method, Expert Systems with Applications 38 (1) (2011) 571–578.
 (39) Y. Deng, D numbers: Theory and applications, Journal of Information and Computational Science 9 (9) (2012) 2421–2428.
 (40) X. Deng, Y. Hu, Y. Deng, S. Mahadevan, Supplier selection using AHP methodology extended by D numbers, Expert Systems with Applications 41 (1) (2014) 156–167.
 (41) X. Deng, Y. Hu, Y. Deng, Bridge condition assessment using D numbers, The Scientific World Journal 2014 (2014) Article ID 358057, 11 pages. doi:10.1155/2014/358057.
 (42) T. Tao, Y.-j. Lu, X. Fu, K.-l. Xin, Identification of sources of pollution and contamination in water distribution networks based on pattern recognition, Journal of Zhejiang University SCIENCE A 13 (7) (2012) 559–570.
 (43) Y. Deng, R. Sadiq, W. Jiang, S. Tesfamariam, Risk analysis in a linguistic environment: a fuzzy evidential reasoning-based approach, Expert Systems with Applications 38 (12) (2011) 15438–15446.