Statistics for OGC Web services – Midterm report

June 25th, 2015 @ 11:06 by Csaba Lestar
Posted in Communities, GSoC, Sensor Web

Introduction

The goal of the project is to collect statistics about how clients use SOS and WPS deployments. The data is collected in an Elasticsearch database. If you would like to read more about the proposal, please read the project’s first blog post.

Status

After 4 weeks of development, a new statistics module has been integrated into the SOS 4.x project. The following features are currently available:

  • Core, Enhanced, Transactional and Result Handling OGC SOS 2.0 requests are listened for and their parameters are persisted in Elasticsearch.
  • ExceptionEvents are also caught and loaded into Elasticsearch.
  • The client’s source IP address is stored in the Elasticsearch database as a geolocation. The transformation is provided by MaxMind’s GeoLite2 databases. The geolocation information can be visualized on a Kibana tile map.
  • All the persisted information can be visualized and queried in Kibana.
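
To make the geolocation bullet more concrete, here is a minimal sketch of what a persisted request document could look like. The class name and the field names (operation, source-ip, source-location) are my own assumptions for illustration, not the module's actual schema, and the IP-to-coordinates lookup is assumed to have happened already. The only Elasticsearch-specific fact it relies on is that a geo_point value can be given as an object with lat and lon keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: class and field names are illustrative,
// not the statistics module's real schema.
final class RequestDocumentSketch {

    // Builds the key/value structure that would be indexed for one request.
    // The coordinates are assumed to come from a prior GeoLite2 lookup.
    static Map<String, Object> build(String operation, String sourceIp,
                                     double lat, double lon) {
        if (lat < -90 || lat > 90 || lon < -180 || lon > 180) {
            throw new IllegalArgumentException("coordinates out of range");
        }
        // Elasticsearch accepts a geo_point as an object with "lat" and "lon".
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("lat", lat);
        location.put("lon", lon);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("operation", operation);
        doc.put("source-ip", sourceIp);
        doc.put("source-location", location);
        return doc;
    }

    public static void main(String[] args) {
        // 192.0.2.10 is a documentation-only address; the coordinates are made up.
        System.out.println(build("GetObservation", "192.0.2.10", 51.96, 7.62));
    }
}
```

For Kibana to draw such a field on a tile map, the field has to be mapped as geo_point in the index mapping.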

You can read the continuously updated technical details on the GitHub Readme page. During the first half of the project, I built the underlying structure of the Elasticsearch persistence layer. It doesn’t yet have a polished, user-friendly interface, but in the following weeks the established structure will allow me to introduce these features to the SOS admin user interface.

Changes in the SOS

I needed two minor changes in the SOS API module (which are now integrated into the iceland project). One interesting metric we would like to know is the content format the clients request (XML, JSON, CSV, …) and the format in which they send their requests. To make this data available, the RequestContext was extended with two new fields.

The other modification concerned the ResponseEvent class, which is now fired via the ServiceEventBus when the response to a request is available. This way we can try to measure the execution time of requests.
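
The idea of pairing a request with its response to get an execution time can be sketched roughly as follows. This is not the actual ServiceEventBus listener API: the class, the method names and the numeric request id are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: method names and the request id are assumptions,
// not the real ServiceEventBus API.
final class ExecutionTimeTrackerSketch {

    // Start timestamps of requests that have not been answered yet.
    private final Map<Long, Long> startNanos = new ConcurrentHashMap<>();

    // Would be called when the request event arrives.
    void onRequest(long requestId) {
        startNanos.put(requestId, System.nanoTime());
    }

    // Would be called when the matching response event arrives.
    // Returns the elapsed time in milliseconds, or -1 if no request was seen.
    long onResponse(long requestId) {
        Long start = startNanos.remove(requestId);
        if (start == null) {
            return -1L;
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```

System.nanoTime() is used instead of the wall clock because it is meant for measuring elapsed time and is not affected by clock adjustments.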

Testing

There are currently no large Java classes in the statistics module (none is longer than a couple of hundred lines), so testing a single class in isolation from its dependencies doesn’t add much value. As a result, most of the effort goes into testing the interactions between objects.

For some of the test classes, I used the Mockito framework. For instance, I mocked the DataHandler service, which is responsible for storing the data in the Elasticsearch database, so that these tests can run without a functioning database.
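
To illustrate what replacing the DataHandler buys in a test, here is a small sketch that uses a hand-written test double instead of Mockito, so that it stays self-contained. The interface, its persist method and the listener class are all hypothetical stand-ins; the real DataHandler may look quite different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the module's DataHandler; the real interface may differ.
interface DataHandlerSketch {
    void persist(Map<String, Object> document);
}

// Test double that records documents in memory instead of sending them to Elasticsearch.
final class RecordingDataHandler implements DataHandlerSketch {
    private final List<Map<String, Object>> persisted = new ArrayList<>();

    @Override
    public void persist(Map<String, Object> document) {
        persisted.add(document);
    }

    List<Map<String, Object>> persisted() {
        return Collections.unmodifiableList(persisted);
    }
}

// Hypothetical class under test: turns an operation name into a document
// and hands it to whichever handler it was given.
final class StatisticsListenerSketch {
    private final DataHandlerSketch handler;

    StatisticsListenerSketch(DataHandlerSketch handler) {
        this.handler = handler;
    }

    void onOperation(String name) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("operation", name);
        handler.persist(doc);
    }
}
```

A test can then pass a RecordingDataHandler to the listener, trigger an operation and assert on persisted(), with no Elasticsearch node involved.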

On the other hand, starting a fully functioning local Elasticsearch cluster for testing is not a great challenge in Java. It is a relatively fast process (13.75 seconds on average to set one up on my laptop). There is one caveat, though: Elasticsearch is “near-real-time”. Newly indexed documents only become searchable after a refresh, and the refresh interval can be 1 second or more, or refreshing can be disabled (controlled by the index.refresh_interval configuration parameter). Explicitly calling the refresh action on the Elasticsearch index did not help in my tests, so to overcome this barrier I wait (Thread.sleep) 1.5 seconds before querying the database for the assertions in my test classes.
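
As an alternative to a fixed sleep, a test could poll until the expected data is visible and give up after a timeout. The helper below is only a sketch of that idea and not what the module currently does; the class name is made up, and in a real test the condition would wrap a (here unspecified) search against the index.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper: polls instead of sleeping for a fixed time.
final class AwaitSketch {

    // Checks the condition repeatedly until it holds or the timeout elapses.
    // Returns true if the condition held, false on timeout or interruption.
    static boolean await(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With such a helper the test only waits as long as the refresh actually takes, while the timeout still guards against waiting forever.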

I have one impediment testing-wise. The statistics logging module stores the client’s geolocation, which is derived from the HttpServlet (to be accurate, the IPv4 address is extracted and then transformed into a geolocation). It is not yet clear to me how best to simulate clients with different source IP addresses.

Kibana interface

Once the information is stored in the Elasticsearch database, the Kibana interface can be set up easily.

After generating a lot of requests with SoapUI, the Kibana interface looks like the following screenshot in discovery mode. On the left you can see some of the recorded parameters and on the right the raw requests.

Discovery mode and raw data

A basic bar chart can be created easily to see which operations were called the most.

Operations grouped by their name

Based on the latitude and longitude coordinates, the data can be displayed on a map.

Client requests by geolocation

Future work

The iceland project is ready, and so is the SOS 5.x development branch of the Sensor Observation Service. The first task will be to migrate the statistics module to the new framework. This will include using the dependency injection capabilities of the Spring framework and the new language constructs of Java 8.
