Transactions on Engineering and Computing Sciences - Vol. 11, No. 3

Publication Date: June 25, 2023

DOI:10.14738/tecs.113.14741.

Karimanzira, D., & Ritzau, L. (2023). Event Detection in Groundwater Sensor Networks using Artificial Intelligence. Transactions on Engineering and Computing Sciences, 11(3). 48-62.

Services for Science and Education – United Kingdom

Event Detection in Groundwater Sensor Networks using Artificial Intelligence

Divas Karimanzira

Fraunhofer Institute of Optronics,

System Technologies and Image Exploitation (IOSB),

Am Vogelherd 90, 98693 Ilmenau, Germany

Linda Ritzau

Fraunhofer Institute of Optronics,

System Technologies and Image Exploitation (IOSB),

Am Vogelherd 90, 98693 Ilmenau, Germany

ABSTRACT

In this paper, a method is developed for detecting spatially and temporally anomalous events, such as system faults and attacks, in groundwater sensor networks (high-dimensional time series data). Recently developed deep learning frameworks for anomaly detection do not consider the dependencies between the variables and do not use the existing relationships to predict the expected behavior of the sensors. In contrast, the method in this paper extracts the spatial and temporal relationships between the sensors and learns to detect, and simultaneously explain, deviations from these relationships. This is achieved by using graph attention neural networks and structured learning. Attention neural networks provide useful interpretability in the context of the detected anomalies and help to identify their causes. To improve robustness, the method considers aleatoric and parametric uncertainties by using ensemble prediction of specific values and prediction intervals, without assuming any data distribution. Furthermore, the model was connected to a fully connected classifier to classify typical groundwater network anomalies. The method was applied to a study area, where it captured the complex correlations between the high-dimensional variables in 92% of the cases and enabled analysts to identify the causes of the anomalies.

Keywords: Deep learning, Event Detection, Graph Attention Neural Networks, Prediction Intervals, Uncertainty consideration.

INTRODUCTION

In the digitization era, the groundwater systems of many communities are equipped with large sensor networks comprising several sensor nodes. These sensor nodes produce a large amount of data in the form of time series of different variables. In groundwater monitoring, the nodes include sensors for measuring parameters such as groundwater flow rate, hydraulic conductivity, water level and water quality. In such a network, the sensor nodes and the sensors themselves can be related in a complex, nonlinear manner. A good example of such a complex and nonlinear cause-effect relationship is the effect of dissolved oxygen. It substantially affects water quality by regulating the valence state of trace metals and by constraining the bacterial metabolism of dissolved organic species [1]. Such high complexity and dimensionality cannot be handled manually and in a timely manner by humans. It therefore requires automated methods that detect anomalous events and make the results interpretable, so that water operators can react as soon as possible. In anomaly detection, the curse of dimensionality refers to the issue that the amount of data required for the models to generalize increases with the number of features, which results in isolated and sparse data points that can conceal the true events [2].

The groundwater monitoring process poses many challenges to anomaly detection algorithms. The process is non-stationary, so the mean of the measured signals varies over time. Different anomaly patterns are possible, as illustrated in Figure 1. One challenge in anomaly detection is the lack of labeled training samples for supervised learning. Therefore, a series of methods based on unsupervised learning have been developed. These include distance-based, density-based, and clustering-based techniques [3]. Because these methods capture only linear relationships, their applicability is limited, as most real-world problems are highly nonlinear. Newer methods that account for nonlinearity and high dimensionality are based on deep learning. For example, Aggarwal develops an autoencoder (AE) to predict outliers, using the reconstruction error as the outlier score [4]. Generative Adversarial Networks (GANs), with their generator and discriminator models, can generate realistic-looking data by sampling from a learned data distribution. This model has been used by many authors for anomaly detection, e.g., [5] and [6]. More recently, [7] applied a combination of AEs and GANs for anomaly detection. With its ability to extract temporal dependencies, the Long Short-Term Memory (LSTM) neural network in the form of AEs has found wide application in multivariate time series anomaly detection ([8], [9], [10]). Unfortunately, most of these methods cannot explicitly learn the spatial relationships between the sensor nodes or the dependencies between the sensors, and therefore cannot sufficiently model and explain many potential contexts and interrelationships between the sensors [11], [12].

Recently, graph neural networks (GNNs) have been applied successfully to learning from graph-structured data [13], [14]. However, they cannot be applied directly to multivariate anomaly detection, because GNNs classically model each node with the same parameters [13], whereas in reality every feature (sensor) is measured differently and therefore different features behave differently. Also, typical GNNs assume a priori knowledge of the adjacency matrix of the graph, but in a groundwater sensor network the relations needed to create an adjacency matrix are not known a priori. In their paper on the graph deviation network (GDN), tested on the WADI and SWaT datasets, Deng et al. showed that GDN can detect anomalies and explain them very well [11]. Chao Fang et al. extended the GDN model to consider model uncertainties, assuming a Gaussian data distribution [15].

Therefore, in this paper we develop a method based on graph neural networks and structured learning for multivariate spatial and temporal anomaly detection. The model is equipped with an attention mechanism for interpretability, as attention weights can give information about the relationship of the features to one another. Furthermore, to overcome the need to know the adjacency matrix a priori, as in typical GNNs, the adjacency matrix is learned during model training as in [16], [11], [12]. We improve the robustness of the anomaly detection model by capturing aleatoric and parametric uncertainties through ensemble prediction of specific values and prediction intervals as in [17]. Unlike [15], the method does not assume any data distribution.
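
As a rough illustration of combining an ensemble point prediction with a prediction interval that makes no distributional assumption, the following minimal sketch averages the predictions of several ensemble members and derives the interval width from an empirical quantile of absolute residuals on held-out data. This residual-quantile construction is a swapped-in, generic technique (in the spirit of split conformal prediction) and is not claimed to be the procedure of [17]; all function names and numbers are hypothetical.

```python
import math
import statistics


def ensemble_point_prediction(member_predictions):
    """Average the point predictions of the ensemble members."""
    return statistics.fmean(member_predictions)


def empirical_quantile(values, q):
    """Smallest value v such that at least a fraction q of the values are <= v."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


def prediction_interval(member_predictions, calibration_residuals, coverage=0.9):
    """Interval around the ensemble mean, sized by held-out absolute residuals.

    No distribution is fitted: the half-width is read directly from the
    empirical distribution of the residuals (assumption of this sketch).
    """
    centre = ensemble_point_prediction(member_predictions)
    half_width = empirical_quantile(
        [abs(r) for r in calibration_residuals], coverage
    )
    return centre - half_width, centre + half_width


# Hypothetical values for one sensor at one time step.
members = [10.2, 10.4, 10.1, 10.3, 10.5]      # predictions of 5 ensemble members
residuals = [0.1, -0.2, 0.05, 0.3, -0.15,     # held-out prediction errors
             0.25, -0.1, 0.2, -0.05, 0.4]

low, high = prediction_interval(members, residuals, coverage=0.9)
member_spread = max(members) - min(members)   # crude indicator of parametric uncertainty

observed = 11.2
is_anomalous = not (low <= observed <= high)  # observation outside the interval
```

In this sketch the disagreement between ensemble members stands in for parametric uncertainty and the held-out residuals stand in for aleatoric uncertainty; an observation falling outside the interval is flagged as a candidate anomaly.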

The main contribution of the work is twofold. First, the method based on graph attention neural networks (GAT) can learn spatial and temporal relationships between different sensors and sensor nodes. The GAT does not require a priori knowledge of the adjacency matrix, which is difficult to determine in real-world applications because it is influenced by many factors; instead, it learns the adjacency matrix during model training. Furthermore, the attention mechanism makes it possible to explain the relations between the features and the deviations from the normal state. Second, the method captures aleatoric and parametric uncertainties, which makes the model robust for anomaly detection. In addition, the model can classify typical groundwater anomalies, which is very useful for water managers and environmental authorities.

MATERIALS AND METHODS

Study Area

The study area has several groundwater sensor nodes with different sensors for water level (WL), temperature (T), nitrates (NO3) and other measurements. To demonstrate and illustrate the performance of the models, four sensor nodes are selected. At each node, the time series of WL, T and NO3 are obtained, which gives a total of twelve features. Data from 2019 to 2020 were used for the experiments, of which 60% was used for training, 20% for validation and the rest for testing. The data show some typical groundwater anomalies and events such as trends or level changes.
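
The feature layout and the 60/20/20 split described above can be sketched as follows. The text does not state how the split was drawn, so a chronological split without shuffling is assumed here, as is common for time series; the node names and the placeholder data are hypothetical.

```python
def chronological_split(rows, train_frac=0.6, val_frac=0.2):
    """Split time-ordered rows into train/validation/test parts without shuffling."""
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


# Four sensor nodes with three variables each give twelve features.
nodes = ["node1", "node2", "node3", "node4"]   # hypothetical node names
variables = ["WL", "T", "NO3"]
feature_names = [f"{node}_{var}" for node in nodes for var in variables]

# Placeholder data: 1000 time steps, one row of twelve values per step.
rows = [[float(t)] * len(feature_names) for t in range(1000)]

train, val, test = chronological_split(rows)
print(len(feature_names), len(train), len(val), len(test))
```

Keeping the time order means the validation and test periods lie strictly after the training period, which avoids leaking future information into training.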

Model Description

Sensor entities have different characteristics, and their relationships in a groundwater sensor network can be complex. For example, sensors measuring water level, water flow rate and water quality in one sensor node can behave similarly, and it is also plausible that water quality sensors in different nodes behave similarly. To capture all this, the model requires two components: 1) embedding vectors for modeling the sensor characteristics and 2) a graphical representation of the sensor network.

Embedding Vector:

The first component captures the characteristic behavior of each sensor by means of an embedding vector v_i of length d for each sensor i. The vectors v_i, which encode the unique characteristics of the sensors, are used both for learning the sensor relationships in the form of a graph and in the attention mechanism. The elements of the vectors v_i are initialized randomly and then adapted during training of the model.
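
A minimal sketch of this first component is given below: one randomly initialized embedding vector per sensor, and a candidate graph obtained by linking each sensor to its most similar sensors. The use of cosine similarity with a top-k selection is an assumption borrowed from the graph deviation network style of [11] and is not specified in this section; the values of d and k and all function names are hypothetical. The training step that adapts the embeddings is omitted.

```python
import math
import random


def init_embeddings(num_sensors, dim, seed=0):
    """Create one randomly initialized embedding vector v_i per sensor."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_sensors)]


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k_neighbours(embeddings, k):
    """For each sensor i, return the indices of the k most similar other sensors."""
    neighbours = []
    for i, v_i in enumerate(embeddings):
        scores = [
            (cosine(v_i, v_j), j) for j, v_j in enumerate(embeddings) if j != i
        ]
        scores.sort(reverse=True)  # highest similarity first
        neighbours.append([j for _, j in scores[:k]])
    return neighbours


# Twelve sensors (features), embedding length d = 8, k = 3 neighbours (assumed values).
embeddings = init_embeddings(num_sensors=12, dim=8)
graph = top_k_neighbours(embeddings, k=3)
```

In a trained model the embeddings would be updated by gradient descent together with the other parameters, so the neighbour lists would change as the vectors v_i move; here they only reflect the random initialization.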