Effectively Positioning Water Loss Event in Smart Water Networks

With rapid advances in sensing technologies, smart water networks have attracted considerable research interest in recent years. One of the most important tasks in smart water network management is the reduction of water loss (such as leaks and bursts in a pipe network). In this paper, we propose an efficient scheme to position water loss events based on water network topology. The state-of-the-art approach to this problem utilizes only limited topology information of the water network, namely a single shortest path between two sensor locations; consequently, its positioning accuracy leaves room for improvement. To resolve this problem, our scheme consists of two key ingredients. First, we design a novel graph topology-based measure, which can recursively quantify the “average distances” for all pairs of sensor locations simultaneously in a water network. This measure substantially improves the accuracy of our positioning strategy by capturing the entire water network topology information between every two sensor locations, without sacrificing computational efficiency. Second, we devise an efficient search algorithm that combines the “average distances” with the difference in the arrival times of the pressure variations detected at sensor locations. Experimental evaluations on a real-world test bed (WaterWiSe@SG) demonstrate that our proposed positioning scheme can identify water loss events more accurately than the best-known competitor.


Introduction
The advent of sensing technologies in water supply systems has led to an increasing need for smart data technologies in water resource management. Today, water loss has become a serious problem for almost all urban areas around the world [1], and it can be even worse in areas where water is scarce. As a statistical example, the water industry in England and Wales loses 3.36 billion liters of water a day in leaks [2]. If those leaking locations were found as early as possible, sufficient water could be saved to supply 22.4 million people. However, it is often difficult to position such water loss events accurately, as (a) the supply pipe is usually buried at least 3 feet below the ground surface, and (b) there are typically many paths connected by pipe sections between two pipe junctions. Therefore, it is imperative to devise an efficient model that can position water loss events automatically and accurately in a real water supply system.

Prior Work
Over the last decade, several pioneering approaches have been proposed for water leak or burst localization, such as gradient intersection methods [3,4], wave propagation analysis [5], spectral clustering [6], and multiple hypotheses testing [7] (see [8] for a survey). Nonetheless, only a few methods have been proposed that explore the graph topology of the water network structure.
One excellent piece of work is due to Misiunas et al. [9], who leveraged a search-based technique to localize a burst point. Its main idea consists of two phases: in the first phase, the search is performed globally over all nodes in the network; in the second phase, if the burst is inferred to have occurred along a pipe, extra nodes are placed along each of the pipes and the global search is repeated. However, both phases of this method require a global search over all sensor locations. Hence, its computational cost is prohibitive, especially when a water network has a high density of nodes.
Recently, Srirangarajan et al. [10] proposed an interesting technique that utilizes wavelet-based multiscale analysis of the pressure signal to detect burst transients. To identify the location of water burst events, they also exploited Dijkstra's algorithm [11] for calculating the shortest distance between every two sensor locations. Nevertheless, we observe that, when a burst occurs, its wave may travel along all possible paths (rather than only the paths with the shortest distance) from the burst location to the measurement points. Thus, in order to accurately position water loss events, it does not seem appropriate to rely only on the shortest travel time between every two sensor locations.

Our Contributions
To resolve the above limitations, in this paper, we propose an efficient scheme that can position water loss events more accurately based on water network topology. Our main contributions can be summarized as follows:
• We first devise a novel graph topology-based measure, which can recursively quantify the "average distance" between every two sensor locations simultaneously in a water network. This measure can significantly improve the accuracy of positioning water loss events, in that it can capture the multi-faceted relationships among sensor locations in a global manner, without sacrificing computational efficiency. (Section 2.1)
• We next propose a fast and accurate search algorithm to efficiently position water loss events, which utilizes our "average distance" measure to determine the difference in the arrival times of the pressure variations detected at sensor locations. (Section 2.2)
Experimental evaluations on a real-world test bed demonstrate that our proposed scheme can identify water loss events more accurately than the state-of-the-art competitor. (Section 3)

The Proposed Model for Positioning Water Loss Event
We first devise a novel graph topology-based measure that can effectively quantify the "average distance" between sensor locations, and then propose our search algorithm to position water loss events.

A Graph Topology-Based Measure
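A water network can be modelled as a graph.

Definition 1. Let $G = (V_J \cup V_S, E, A)$ be an attributed water network, where $V_J$ is a vertex set of pipe junctions, $V_S$ is a vertex set of deployed sensor locations, $E$ denotes an edge set of pipe sections connecting two vertices, and $A$ carries the length of each pipe section. We write $V = V_J \cup V_S$. The distance matrix of $G$, denoted as $D$, is defined by

$$[D]_{(u,v)} = \begin{cases} \text{the length of pipe section } (u,v), & \text{if } u \neq v \text{ and } \exists \text{ pipe section } (u,v) \in E; \\ 0, & \text{otherwise.} \end{cases}$$

The adjacency matrix of $G$, denoted as $A$, is defined by

$$[A]_{(u,v)} = \begin{cases} 1, & \text{if } u \neq v \text{ and } \exists \text{ pipe section } (u,v) \in E; \\ 0, & \text{otherwise.} \end{cases}$$

Example 1. Consider the water network $G$ in Figure 1, whose edge weights carry the length of each pipe section. By Definition 1, its distance matrix $D$ and adjacency matrix $A$ are obtained by reading off the pipe section lengths and connections in Figure 1.

Based on Definition 1, we notice that $D$ and $A$ are both symmetric matrices. Leveraging $D$ and $A$, we are now ready to determine the "average distance" between every two sensor locations on graph $G$.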
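As a minimal sketch of Definition 1, the following Python snippet builds $D$ and $A$ from a list of pipe sections. The function name `build_matrices`, the four-vertex fragment, and the pipe lengths are assumptions made for illustration; they are not the full network of Figure 1.

```python
def build_matrices(vertices, pipes):
    """Build the distance matrix D and adjacency matrix A of an undirected pipe network.

    vertices: list of vertex labels (pipe junctions and sensor locations).
    pipes:    list of (u, v, length) tuples, one per pipe section.
    """
    index = {name: i for i, name in enumerate(vertices)}
    n = len(vertices)
    D = [[0.0] * n for _ in range(n)]
    A = [[0] * n for _ in range(n)]
    for u, v, length in pipes:
        i, j = index[u], index[v]
        if i == j:
            continue  # Definition 1 only assigns entries for u != v
        # Pipe sections are undirected, so both matrices are symmetric.
        D[i][j] = D[j][i] = float(length)
        A[i][j] = A[j][i] = 1
    return D, A


# Hypothetical fragment: four vertices and four pipe sections (lengths assumed).
vertices = ["d", "b", "h", "g"]
pipes = [("d", "b", 6), ("b", "g", 7), ("d", "h", 5), ("h", "g", 4)]
D, A = build_matrices(vertices, pipes)

for row in D:
    print(row)
```

Storing both matrices as plain nested lists keeps the sketch dependency-free; a sparse representation would likely be preferable for a large network.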
Let us first introduce a $|V| \times |V|$ matrix, $W^{(d)}$, whose element $[W^{(d)}]_{(u,v)}$ denotes the "average distance" of all paths with $d$ hops between vertices $u$ and $v$. Then, $[W^{(d)}]_{(u,v)}$ can be represented as

$$[W^{(d)}]_{(u,v)} = \frac{\text{the sum of the pipe section lengths over all paths with } d \text{ hops between vertices } u \text{ and } v}{\text{the number of the paths with } d \text{ hops between vertices } u \text{ and } v}. \tag{1}$$
To obtain the denominator of Equation (1), we can directly use an elegant property in graph theory about the power of an adjacency matrix: the $(u, v)$-th element of the $d$-th power of $A$, that is, $[A^d]_{(u,v)}$, counts the number of the paths with $d$ hops between vertices $u$ and $v$.
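The path-counting property can be checked on a small example. The sketch below uses a hypothetical four-vertex graph (vertex order d, b, h, g, with pipe sections d-b, b-g, d-h, h-g) and plain nested lists; the helper names `mat_mul` and `mat_pow` are illustrative. Note that the paths counted by $A^d$ may revisit vertices (they are walks in graph-theoretic terms).

```python
def mat_mul(X, Y):
    """Ordinary matrix product of two square nested-list matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(M, d):
    """d-th power of a square matrix for an integer d >= 1."""
    result = M
    for _ in range(d - 1):
        result = mat_mul(result, M)
    return result


# Hypothetical graph, vertex order: d, b, h, g.
A = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]]

A2 = mat_pow(A, 2)
A3 = mat_pow(A, 3)

# Two 2-hop paths from d to g: d -> b -> g and d -> h -> g.
print(A2[0][3])
# Four 3-hop paths from d to b, e.g. d -> b -> d -> b and d -> h -> g -> b.
print(A3[0][1])
```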
However, it is not easy to evaluate the numerator of Equation (1), as the power of a distance matrix can only evaluate the product (instead of the sum) of the pipe section lengths over all paths. As an example, in Figure 1, to determine the sum of the pipe section lengths over all paths with 2 hops between vertices $d$ and $g$, the result of $[D^2]_{(d,g)}$ would produce the product of the pipe section lengths as follows:

$$[D^2]_{(d,g)} = 6 \times 7 + 5 \times 4. \tag{2}$$

We notice that, if the "×" sign in Equation (2) were changed into a "+" sign, the result would desirably turn into the sum of the pipe section lengths over all paths ($d \to b \to g$ and $d \to h \to g$) with 2 hops between vertices $d$ and $g$. To obtain the correct "+"-based results, can we still take advantage of the power of a distance matrix while changing its "×" sign (in Equation (2)) into a "+" sign?
To address this question, our technique is to introduce an element-wise operator $\exp(\ast)$. We construct the element-wise exponential distance matrix, denoted as $\exp(tD)$, as follows:

$$[\exp(tD)]_{(u,v)} = \begin{cases} e^{t [D]_{(u,v)}}, & \text{if } [D]_{(u,v)} \neq 0; \\ 0, & \text{otherwise,} \end{cases}$$

where $t \in \mathbb{R}$ denotes an arbitrary scalar.
Intuitively, the matrix $\exp(tD)$ is formed by replacing every nonzero element in $D$, say $x$, with $e^{tx}$, and keeping the zero elements of $D$ unchanged.
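To illustrate why the element-wise exponential helps, the sketch below squares $\exp(tD)$ for the two-hop example between $d$ and $g$ and checks numerically that the entry equals $e^{(6+7)t} + e^{(5+4)t}$: multiplying exponentials adds the pipe section lengths in the exponent. The vertex order and the assignment of lengths to pipe sections are assumptions. The final finite-difference step is only one possible way to read the sum of lengths back out, assumed here for illustration; it is not necessarily the form taken by Theorem 1.

```python
import math


def elementwise_exp(D, t):
    """Replace each nonzero entry x of D with e^(t*x); zero entries stay zero."""
    return [[math.exp(t * x) if x != 0 else 0.0 for x in row] for row in D]


def mat_mul(X, Y):
    """Ordinary matrix product of two square nested-list matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def two_hop_entry(D, u, v, t):
    """Entry (u, v) of (exp(tD))^2."""
    E = elementwise_exp(D, t)
    return mat_mul(E, E)[u][v]


# Vertex order d, b, h, g; pipe lengths assumed for illustration.
D = [[0, 6, 5, 0],
     [6, 0, 0, 7],
     [5, 0, 0, 4],
     [0, 7, 4, 0]]

t = 0.1
lhs = two_hop_entry(D, 0, 3, t)
rhs = math.exp((6 + 7) * t) + math.exp((5 + 4) * t)

# Assumed read-out: the slope at t = 0 approximates (6 + 7) + (5 + 4).
h = 1e-6
slope = (two_hop_entry(D, 0, 3, h) - two_hop_entry(D, 0, 3, -h)) / (2 * h)
print(lhs, rhs, slope)
```

At $t = 0$ every nonzero entry of $\exp(tD)$ becomes 1, so the same matrix power reduces to the path count given by the adjacency matrix.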
Theorem 1. For any positive integer $N = 1, 2, \cdots$, Equation (6) holds.

As a special case when $N = 2$, Theorem 1 reduces to the result in Equation (4). Theorem 1 is used for generalizing the result of Equation (3) for any arbitrary element of $(\exp(tD))^k$. More specifically, in our aforementioned example, we choose Equation (4) (that is, $N = 2$ in Equation (6)) to "inverse" $[(\exp(tD))^2]_{(d,g)}$ because there are two summands ($e^{(6+7)t}$ and $e^{(5+4)t}$) in Equation (3). In the general case, we observe that the number of summands for an arbitrary element $(u, v)$ of $(\exp(tD))^k$ in Equation (3) should be consistent with (a) the choice of $N$ in Equation (6) and (b) the number of the paths with $d$ hops between vertices $u$ and $v$ (that is, $[A^d]_{(u,v)}$).

Example 2. … Hence, the sum of the pipe section lengths over all paths with 3 hops between vertices $b$ and $i$ is 44.

After the numerator of Equation (1) is obtained, the "average distance" $[W^{(d)}]_{(u,v)}$ follows directly:

Theorem 2. The "average distance" of all paths with $d$ hops between every two vertices $u$ and $v$, $[W^{(d)}]_{(u,v)}$, can be quantified as the sum of the pipe section lengths over all paths with $d$ hops between $u$ and $v$ (obtained from $(\exp(tD))^d$ via Theorem 1) divided by $[A^d]_{(u,v)}$.

As a special case, $W^{(1)} = D$. This is because, when $d = 1$, the only path with 1 hop between two vertices $u$ and $v$ is the pipe section $(u, v)$ itself, whose length is $[D]_{(u,v)}$.

Example 3. Recall the result in Example 2. Since $[A^3]_{(b,i)} = 3$ and the sum of the pipe section lengths over all paths with $d = 3$ hops between vertices $b$ and $i$ is 44, the "average distance" is $[W^{(3)}]_{(b,i)} = 44/3$.
Theorem 2 provides an efficient way of evaluating the "average distance" $[W^{(d)}]_{(u,v)}$ with a fixed number $d$ of hops by using the distance matrix $D$ and the adjacency matrix $A$. Based on $[W^{(d)}]_{(u,v)}$, we can obtain the "average distance" matrix $S^{(L)}$ within $L$ hops as follows.

Definition 2. The "average distance" matrix within $L$ hops, denoted as $S^{(L)}$, is defined by

$$S^{(L)} = \frac{1}{\beta}\left(\lambda D + \lambda^2 W^{(2)} + \cdots + \lambda^L W^{(L)}\right), \tag{7}$$

where $\beta = \lambda + \lambda^2 + \cdots + \lambda^L$.

Intuitively, $[S^{(L)}]_{(u,v)}$ captures the weighted average distance within $L$ hops between vertices $u$ and $v$. In Equation (7), the first term $\lambda D$ signifies that the paths of 1 hop have a contribution of $\lambda$ to $S^{(L)}$; the second term $\lambda^2 W^{(2)}$ means that the (longer) paths of 2 hops have a (smaller) contribution of $\lambda^2$ to $S^{(L)}$, and so forth. The parameter $\frac{1}{\beta}$ is a normalization factor, which guarantees that the sum of all the weighted factors $\{\lambda, \lambda^2, \cdots, \lambda^L\}$ in Equation (7) is 1.
The constant $\lambda$ is between 0 and 1, and can be thought of as a confidence level. Empirically, it is often set to 0.6-0.9, which gives the rate of decay as the wave spreads across the pipe sections.
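The weighted sum in Equation (7) can be sketched as follows. Instead of the exponential-matrix route of Theorems 1 and 2, this sketch swaps in a simple recurrence that tracks, for each hop count, the number of paths and their total length; it yields the same ratio as Equation (1) but is not the paper's derivation. The function name, the choice to set $[W^{(d)}]_{(u,v)} = 0$ when no $d$-hop path exists, and the example lengths are assumptions.

```python
def mat_mul(X, Y):
    """Ordinary matrix product of two square nested-list matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(X, Y):
    """Element-wise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def average_distance_matrix(D, lam, L):
    """Weighted average of W^(1..L) with weights lam^d / beta (cf. Equation (7))."""
    n = len(D)
    A = [[1 if x != 0 else 0 for x in row] for row in D]
    beta = sum(lam ** d for d in range(1, L + 1))
    counts, totals = A, D  # d = 1: one path per pipe section, of that length
    S = [[0.0] * n for _ in range(n)]
    for d in range(1, L + 1):
        if d > 1:
            # Extend every (d-1)-hop path by one pipe section: the old total
            # length is carried along, and the new section's length is added.
            totals = mat_add(mat_mul(totals, A), mat_mul(counts, D))
            counts = mat_mul(counts, A)
        for u in range(n):
            for v in range(n):
                if counts[u][v] > 0:  # assumed convention: W = 0 if no path
                    S[u][v] += (lam ** d) * totals[u][v] / counts[u][v] / beta
    return S


# Vertex order d, b, h, g; pipe lengths assumed for illustration.
D = [[0.0, 6.0, 5.0, 0.0],
     [6.0, 0.0, 0.0, 7.0],
     [5.0, 0.0, 0.0, 4.0],
     [0.0, 7.0, 4.0, 0.0]]

S1 = average_distance_matrix(D, 0.85, 1)
S2 = average_distance_matrix(D, 0.85, 2)
print(S2[0][3])
```

With $L = 1$ the result reduces to $D$, in line with the special case $W^{(1)} = D$.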
Example 4. Recall the water network in Figure 1 and its distance matrix $D$ and adjacency matrix $A$ in Example 1. We choose $\lambda = 0.85$ and $L = 5$. By Definition 2, the "average distance" matrix $S^{(5)}$ is obtained by evaluating Equation (7) with these values.

As opposed to the previous work [10], which considers only a single path of the shortest length, $S^{(L)}$ can capture multiple paths of different lengths between every two sensor locations by fully exploiting the network topology information. Thus, if the "average distance" $S^{(L)}$ is used to quantify the wave traveling distance from a burst location to a sensor location, water loss events can be positioned more accurately, as will be shown in the next section.

Effectively Positioning Water Loss Event
Having evaluated the "average distance" matrix $S^{(L)}$, we next present an efficient algorithm to position a water loss event with higher accuracy. We assume that the sensor points of the water network are time synchronized. Our basic idea is to measure the difference in "average distance" to two sensor locations that detect the burst transient at known times.
Specifically, let $\nu$ denote the average wave speed, and let $t_u$ and $t_v$ be the time points when the burst transient event is detected at sensor locations $u$ and $v$, respectively. Note that the time of the burst event $t_x$ is unknown in advance, but such a burst event must occur before $\min\{t_u, t_v\}$ (earlier than either of the detection times at locations $u$ and $v$). We observe that the time gap between $(t_u - t_x)$ and $(t_v - t_x)$ (which can be calculated as $|t_u - t_v|$) is mainly due to the difference in "average distance" from the burst (source) location $x$ to both sensor locations $u$ and $v$. Hence, ideally we have the following equation:

$$\left| [S^{(L)}]_{(x,u)} - [S^{(L)}]_{(x,v)} \right| = \nu \cdot |t_u - t_v|. \tag{8}$$

Then, we can enumerate each sensor location in $V$ to find out the top-$k$ ($k$ is often set to 3-5 in practice) best approximate solutions $\hat{X} \subseteq V$ of $x$ to Equation (8), that is, the $k$ locations for which the two sides of Equation (8) differ the least. Thus, the elements in $\hat{X}$ form a "hyperbolic curve" with two focal points $u$ and $v$. To determine the precise location along this "hyperbolic curve", we need to choose another pair of sensor locations, say $u$ and $w$, as two focal points, with the aim of producing another "hyperbolic curve", that is, to find out another set of the top-$k$ best approximate solutions $\hat{Y} \subseteq V$ to the following equation:

$$\left| [S^{(L)}]_{(x,u)} - [S^{(L)}]_{(x,w)} \right| = \nu \cdot |t_u - t_w|.$$

The intersection of the two "hyperbolic curves", $\hat{X} \cap \hat{Y}$, will produce a small number of possible locations where a water loss event may occur. Finally, we can search locally for the most likely water loss position along pipe sections connected to the closest sensor locations in $\hat{X} \cap \hat{Y}$.
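The two-curve search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the residual compares the absolute "average distance" difference with the wave speed times the absolute arrival-time gap, ties are broken by vertex index, and the matrix `S` is a toy stand-in (five vertices on a straight line, 100 m apart) rather than an $S^{(L)}$ computed from a real network. The function names are hypothetical, and the final local search along pipe sections is not included.

```python
def top_k_candidates(S, u, v, t_u, t_v, wave_speed, k):
    """Return the k vertices whose distance difference to sensors u and v
    best matches the observed arrival-time gap."""
    target = wave_speed * abs(t_u - t_v)

    def residual(x):
        return abs(abs(S[x][u] - S[x][v]) - target)

    # sorted() is stable, so equal residuals keep ascending vertex order.
    return sorted(range(len(S)), key=residual)[:k]


def position_event(S, u, v, w, t_u, t_v, t_w, wave_speed, k=3):
    """Intersect the candidate sets of sensor pairs (u, v) and (u, w)."""
    x_hat = set(top_k_candidates(S, u, v, t_u, t_v, wave_speed, k))
    y_hat = set(top_k_candidates(S, u, w, t_u, t_w, wave_speed, k))
    return x_hat & y_hat


# Toy stand-in for S^(L): vertices 0..4 on a line, spaced 100 m apart.
positions = [0.0, 100.0, 200.0, 300.0, 400.0]
S = [[abs(p - q) for q in positions] for p in positions]

# Hypothetical burst at vertex 1 at time 0 with wave speed 1000 m/s;
# sensors at vertices 0, 4, and 2 detect it at 0.1 s, 0.3 s, and 0.1 s.
candidates = position_event(S, u=0, v=4, w=2,
                            t_u=0.1, t_v=0.3, t_w=0.1,
                            wave_speed=1000.0, k=2)
print(candidates)
```

In this toy setting the first sensor pair alone cannot separate vertices 1 and 3, which is why a second pair of focal points is needed.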

Experimental Study
In this section, we experimentally demonstrate the effectiveness of our water loss positioning scheme on the real test bed (WaterWiSe@SG) deployed on the water network system by Whittle et al. [12].
The test bed consists of sensors measuring hydraulic (pressure, flow) and water quality parameters. The pipe network layout is depicted in Figure 2.
Ten burst events were created during the evening from 21:00 to 23:00 hours. The results are reported in Table 1. For each burst event, we compute the arrival time difference for every pair of sensor locations. To estimate its burst location, we compare the localization errors of our proposed scheme with those of Srirangarajan et al.'s shortest distance-based method [10]. It can be discerned that, for every burst event, our method consistently exhibits 13.5%-62.7% higher accuracy than Srirangarajan et al.'s. The average error of our water loss positioning method is 28.25 meters, a 36.57% improvement over Srirangarajan et al.'s approach. This is because our graph topology-based distance measure can comprehensively take into account the weighted contributions of paths with different hops between two sensor locations, whereas Srirangarajan et al.'s distance measure accommodates only one path of the shortest length in a biased manner. In addition, our techniques can produce the top-k (k = 5) best approximate solutions along a "hyperbolic curve", thus producing a better candidate set for local search.
Notice that 2 out of 10 burst events are not positioned, denoted as "-" in Events 7 and 9 of Table 1, due to missing sensor readings. Thus, in the above experiment, the percentage of burst events positioned by this method is ∼80%. Ideally, this percentage can be improved further if the sensor readings are of sufficient quality.
Currently, our algorithm is better suited to positioning burst events than long-term leakage, as the detection algorithm we adopted is based on a rate-of-sudden-change criterion.

Conclusions
In this paper, an efficient scheme has been investigated to position water loss events more accurately by taking advantage of the water network topology. First, a novel graph topology-based measure is proposed, which can recursively quantify the "average distances" between every two sensor locations simultaneously in a water network. Then, based on this measure, an efficient search algorithm is devised, which integrates our "average distances" measure with the difference in the arrival times of the pressure variations detected at sensor locations. The experimental study on a real-life test bed (WaterWiSe@SG) demonstrates that our proposed positioning scheme can position water loss events more reliably, with an improvement of up to 62.7% in accuracy over the best-known algorithm.
For future work, we aim to develop optimization techniques that can substantially accelerate the computation of our proposed scheme, so that water loss events can be positioned quickly on a large-scale water supply system. Another interesting problem is to reduce its memory usage. We will incorporate some of our preliminary results on graph analysis [14][15][16][17][18][19] into the water flow and pressure behavior, to achieve the scalability of our proposed algorithm.

Figure 1. Modelling a water network (left) as a weighted graph (right) based on topology.
The pipe network in Figure 2 covers an area of 1 km². It consists of $|V| = 8$ vertices ($|V_S| = 3$ pressure sensors $M_1$, $M_2$, $M_3$ that can detect the burst transients, and $|V_J| = 5$ pipe junctions). The measurement points are time synchronized using the GPS pulse per second (PPS) signal, leading to a distance error of ±2 m [10]. To detect burst events, we also implement the CUSUM change detection test by Misiunas et al. [13]. The following parameters are used by default: (a) the decay factor λ = 0.6; (b) the total number of hops L = 5; (c) the top-k size k = 3.
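For readers unfamiliar with CUSUM, the following is a generic two-sided CUSUM sketch on a synthetic pressure trace. It is not the specific test or parameterization of Misiunas et al. [13]; the function name, the `target`, `drift`, and `threshold` values, and the sample data are all assumptions chosen for illustration.

```python
def cusum_detect(samples, target, drift, threshold):
    """Return the index of the first sample at which either one-sided
    cumulative sum exceeds the threshold, or None if no change is found."""
    pos = neg = 0.0
    for i, x in enumerate(samples):
        # Accumulate deviations above / below the target, minus a drift
        # allowance that absorbs small fluctuations; clamp at zero.
        pos = max(0.0, pos + (x - target - drift))
        neg = max(0.0, neg + (target - x - drift))
        if pos > threshold or neg > threshold:
            return i
    return None


# Synthetic pressure trace (arbitrary units): a sudden drop after sample 4.
pressure = [50.0] * 5 + [45.0] * 5
alarm_index = cusum_detect(pressure, target=50.0, drift=0.5, threshold=8.0)
print(alarm_index)

# A steady trace raises no alarm.
print(cusum_detect([50.0] * 10, target=50.0, drift=0.5, threshold=8.0))
```

The detection index at each sensor would supply the arrival times used in the positioning step; how the threshold trades off false alarms against detection delay depends on the noise level of the actual sensors.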

Figure 2. The real-life pipe network layout (left) and its heterogeneous graph (right), where yellow vertices represent pipe junctions and blue vertices are sensor locations. In the left figure, the green dotted lines denote the wave paths traversed by Srirangarajan et al.'s method [10], whereas both green and red dotted lines are those traversed by our approach.

Table 1. Results of Positioning Water Loss Events