Transactions on Engineering and Computing Sciences - Vol. 11, No. 3
Publication Date: June 25, 2023
DOI:10.14738/tecs.113.14741.
Karimanzira, D., & Ritzau, L. (2023). Event Detection in Groundwater Sensor Networks using Artificial Intelligence. Transactions on
Engineering and Computing Sciences, 11(3). 48-62.
Services for Science and Education – United Kingdom
Event Detection in Groundwater Sensor Networks using Artificial
Intelligence
Divas Karimanzira
Fraunhofer Institute of Optronics,
System Technologies and Image Exploitation (IOSB),
Am Vogelherd 90, 98693 Ilmenau, Germany
Linda Ritzau
Fraunhofer Institute of Optronics,
System Technologies and Image Exploitation (IOSB),
Am Vogelherd 90, 98693 Ilmenau, Germany
ABSTRACT
This paper develops a method for detecting spatially and temporally anomalous
events, such as system faults and attacks, in groundwater sensor networks
(high-dimensional time series data). Unlike recently developed deep learning
frameworks for anomaly detection, which neither consider the dependences between
the variables nor use the existing relationships to predict the expected behavior
of the sensors, the method in this paper extracts the spatial and temporal
relationships between the sensors and learns to detect, and simultaneously
explain, deviations from these relationships. This is achieved by using graph
attention neural networks and structured learning. The attention mechanism makes
the detected anomalies interpretable in context and allows their causes to be
identified. To improve robustness, the method accounts for aleatoric and
parametric uncertainties by using ensemble prediction of specific values together
with prediction intervals, without assuming any data distribution. Furthermore,
the model was connected to a fully connected classifier to classify typical
groundwater network anomalies. The method was applied to a study area, where it
captured the complex correlations between the high-dimensional variables in 92%
of the cases and enabled analysts to identify the causes of the anomalies.
Keywords: Deep learning, Event Detection, Graph Attention Neural Networks, Prediction
Intervals, Uncertainty consideration.
INTRODUCTION
In the era of digitization, groundwater systems in many communities are equipped with large
sensor networks comprising several sensor nodes. These sensor nodes produce large amounts of
data in the form of time series of different variables. In groundwater monitoring, the nodes
include sensors for measuring parameters such as groundwater flow rate, hydraulic
conductivity, water level, and water quality. In such a network, the sensor nodes and the
sensors themselves can be related in a complex, nonlinear manner. A good example of such a
complex and nonlinear cause-effect relationship is the effect of dissolved oxygen. It has
substantial effects on the water
quality by regulating the valence state of trace metals and by constraining the bacterial
metabolism of dissolved organic species [1]. Such high complexity and dimensionality cannot
easily be handled manually and in a timely manner by humans. It therefore requires automated
methods that detect anomalous events and make the results interpretable, so that water
operators can react as soon as possible. In anomaly detection, the curse of dimensionality
means that the amount of data required for the models to generalize increases with the number
of features, which results in isolated and sparse data points that can conceal the true
events [2].
The groundwater monitoring process poses many challenges for anomaly detection algorithms.
Because the process is non-stationary, the measured signals have a mean that varies over
time. Different anomaly patterns are possible, as illustrated in Figure 1. One challenge in
anomaly detection is the lack of labeled training samples for supervised learning. Therefore,
a series of methods based on unsupervised learning have been developed.
These include distance-based, density-based, and clustering-based techniques [3]. Because
these methods capture only linear relationships, their applicability is limited, as most
real-world problems are highly nonlinear in nature. Newer methods that account for
nonlinearity and high dimensionality are based on deep learning. For example, Aggarwal
develops an Autoencoder (AE) to predict outliers [4]. For this purpose, the AE uses the
reconstruction error as an outlier score. Generative Adversarial Networks (GANs), with their
generator and discriminator models, can generate realistic-looking data by sampling from a
learned data distribution. This model has been used by many authors for anomaly detection,
e.g., [5] and [6]. More recently, [7] applied a combination of AEs and GANs for anomaly
detection. With its ability to extract temporal dependencies, the Long Short-Term Memory
neural network (LSTM), in the form of AEs, has found wide application in multivariate time
series anomaly detection ([8], [9], [10]). Unfortunately, most of these methods cannot
explicitly learn the spatial relationships between the sensor nodes or the dependencies
between the sensors, and therefore cannot sufficiently model and explain many potential
contexts and interrelationships between the sensors [11], [12].
Recently, graph neural networks (GNNs) have been applied successfully to learning from
graph-structured data [13], [14]. However, they cannot be applied directly to multivariate
anomaly detection, because GNNs classically model each node with the same parameters [13],
whereas in reality every feature (sensor) is measured differently and different features
therefore behave differently. Also, typical GNNs assume a priori knowledge of the adjacency
matrix of the graph, but in a groundwater sensor network the relations needed to create an
adjacency matrix are not known a priori. In their paper on the graph deviation network (GDN),
tested on the WADI and SWaT datasets, Deng et al. showed that GDN can detect anomalies and
explain them very well [11]. Chao Fang et al. extended the GDN model to consider model
uncertainties, assuming a Gaussian data distribution [15].
Therefore, in this paper we develop a method based on graph neural networks and
structured learning for multivariate spatial and temporal anomaly detection. The model is
equipped with an attention mechanism for interpretability, as attention weights can give
information about the relationship of the features to one another. Furthermore, to overcome
the need to know the adjacency matrix a priori, as in typical GNNs, the adjacency matrix is
learned during model training as in [16], [11], [12]. We improve the robustness by capturing
aleatoric and parametric uncertainties of the anomaly detection model through ensemble
prediction of specific values and prediction intervals as in [17]. Unlike [15], the method
does not assume any data distribution.
The main contribution of the work is twofold. First, the method, which is based on graph
attention neural networks (GAT), can learn spatial and temporal relationships between
different sensors and sensor nodes. The GAT does not require a priori knowledge of the
adjacency matrix. Determining the adjacency matrix a priori is a major issue in real-world
applications, as it is influenced by many factors; instead, the adjacency matrix is learned
along with the model training. In addition, the attention mechanism makes it possible to
explain the relations between the features and the deviations from the normal state. Second,
the method captures aleatoric and parametric uncertainties, making the model robust for
anomaly detection. Furthermore, the model can classify typical groundwater anomalies, which
is very useful for water managers and environmental authorities.
MATERIALS AND METHODS
Study Area
The study area has several groundwater sensor nodes with different sensors for measuring
water level (WL), temperature (T), nitrate (NO3), and other variables. To demonstrate and
illustrate the performance of the models, four sensor nodes are selected. At each node, the
time series of WL, T, and NO3 are obtained, which gives a total of twelve features. Data from
2019 to 2020 were used for the experiments, with 60% used for training, 20% for validation,
and the remaining 20% for testing. The data show some typical groundwater anomalies and
events such as trends or level changes.
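The data layout and split described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the node names are placeholders, and the split is assumed to be chronological (no shuffling), which the text does not state explicitly.

```python
# Hypothetical feature layout: 4 sensor nodes x 3 variables = 12 features.
NODES = ["node1", "node2", "node3", "node4"]  # placeholder node names
VARIABLES = ["WL", "T", "NO3"]                # variables named in the text
FEATURES = [f"{node}_{var}" for node in NODES for var in VARIABLES]


def chronological_split(rows, train_pct=60, val_pct=20):
    """Split time-ordered rows into train, validation, and test parts.

    Assumption: the split keeps temporal order; the paper only gives
    the 60/20/20 ratios. Integer percentages avoid float rounding.
    """
    n = len(rows)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Usage with dummy data: 100 time steps, one value per feature.
rows = [[float(t)] * len(FEATURES) for t in range(100)]
train, val, test = chronological_split(rows)
print(len(FEATURES), len(train), len(val), len(test))
```

Keeping the temporal order means the test period lies entirely after the training period, which is a common choice for time series but remains an assumption here.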
Model Description
Sensor entities have different characteristics, and their relationships in a groundwater
sensor network can be complex. For example, sensors measuring water level, water flow rate,
and water quality in one sensor node can behave similarly, and it is also plausible that
water quality sensors in different nodes behave similarly. To capture all this, the model
requires two components: 1) embedding vectors for modeling the sensor characteristics and 2)
a graphical representation of the sensor network.
Embedding Vector:
The first component captures the characteristic behavior of each sensor, which is
accomplished by an embedding vector v_i of length d for each sensor i. The vectors v_i, which
encode the unique characteristics of each sensor, are used both for learning the sensor
relationships in the form of a graph and in the attention mechanism. The elements of the
vectors v_i are initialized randomly and then adapted during training of the model.
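As a rough illustration of the embedding idea, the sketch below initializes one random vector of length d per sensor and compares two of them. The function names, the Gaussian initialization, the value d = 8, and the use of cosine similarity as the comparison measure are all assumptions made for this example; the text only states that the vectors start out random and are adapted during training, and the training update itself is not shown.

```python
import math
import random


def init_embeddings(num_sensors, d, seed=0):
    """Return one randomly initialized embedding vector per sensor.

    Assumption: standard-normal initialization; the paper only says
    the vectors are initialized randomly.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(num_sensors)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors (an assumed comparison measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Twelve sensors (4 nodes x 3 variables), hypothetical embedding length d = 8.
embeddings = init_embeddings(num_sensors=12, d=8)
sim_01 = cosine_similarity(embeddings[0], embeddings[1])
print(len(embeddings), len(embeddings[0]))
```

Before training, such similarities carry no meaning; the intent is that sensors with similar behavior end up with similar vectors once the embeddings have been adapted.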