envirocar-py: Querying enviroCar Trajectory Data

A Python package to query and analyze enviroCar’s trajectory data

enviroCar

enviroCar is an open Citizen Science platform to collect, share and analyze floating car data for traffic quality and environmental monitoring. Its main components are the server back-end and the Android App. See envirocar.org for more details. This article focuses on the Python package envirocar-py, which enables users to query enviroCar’s open data and perform detailed analysis in Python.

envirocar-py

In general, eXtended Floating Car Data (XFCD) provide spatio-temporal profiles of vehicles equipped with GPS receivers, as well as various sensors to measure car and engine related parameters. The enviroCar Python package allows users to query and download XFCD via the enviroCar REST API. The received data are stored in a flat GeoPandas GeoDataFrame. Each track is represented as a chain of measurement points together with some metadata, such as measurement time and coordinates. The data frame provides not only sensor measurements, but also estimated CO2 emissions based on GPS recordings. For this estimation, we calculated the energy demand from the vehicle movement, which we extracted from the GPS recordings. Table 1 lists all GeoDataFrame variables that can be requested via envirocar-py. A Jupyter Notebook on variable description in the envirocar-py package provides further details on the data and variables. Note that not all variables are available for all tracks, e.g. “Energy Consumption.value” is only stored for electric vehicles.

column name	data type
id	object
time	object
geometry	geometry
Engine Load.value	float64
Calculated MAF.value	float64
Speed.value	float64
CO2.value	float64
Intake Pressure.value	float64
Rpm.value	float64
Intake Temperature.value	float64
Consumption (GPS-based).value	float64
GPS Altitude.value	float64
Throttle Position.value	float64
GPS Bearing.value	float64
Consumption.value	float64
GPS Accuracy.value	float64
CO2 Emission (GPS-based).value	float64
GPS Speed.value	float64
track.id	object
track.length	float64
track.begin	object
track.end	object
sensor.type	object
sensor.engineDisplacement	int64
sensor.model	object
sensor.id	object
sensor.fuelType	object
sensor.constructionYear	int64
sensor.manufacturer	object
track.appVersion	object
track.touVersion	object
GPS HDOP.value	float64
GPS PDOP.value	float64
GPS VDOP.value	float64
MAF.value	float64
O2 Lambda Voltage ER.value	float64
O2 Lambda Voltage.value	float64
Energy Consumption.value	float64
Short-Term Fuel Trim 1.value	float64
Long-Term Fuel Trim 1.value	float64
O2 Lambda Current ER.value	float64
O2 Lambda Current.value	float64
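
Because not every variable is present for every track, it can help to check which columns actually arrived before starting an analysis. The following is a minimal sketch using plain pandas on a small made-up frame; the values are invented for illustration and the frame only stands in for one returned by envirocar-py.

```python
import pandas as pd

# Hypothetical stand-in for a frame returned by envirocar-py (values are made up):
# two tracks, one of which has no consumption readings.
track_df = pd.DataFrame({
    "track.id": ["a", "a", "b", "b"],
    "Speed.value": [30.0, 42.5, 18.0, 25.0],
    "Consumption.value": [4.1, 5.3, None, None],
})

# Variables an analysis would like to use; the last one is absent in this frame.
wanted = ["Speed.value", "Consumption.value", "Energy Consumption.value"]

available = [col for col in wanted if col in track_df.columns]
missing = [col for col in wanted if col not in track_df.columns]

# Share of non-missing measurements for each available variable.
coverage = track_df[available].notna().mean()

print("available:", available)
print("missing:", missing)
print(coverage)
```

A column can exist and still be mostly empty, which is why the sketch reports the share of non-missing values in addition to the column check.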

Installation

The package requires Python 3.6 or later. It is available on the Python Package Index (PyPI) and can be installed with pip:

pip install envirocar-py --upgrade

To install envirocar-py in development mode, run the following from a local copy of the source code:

python setup.py develop

Example snippet of enviroCar API request

To request enviroCar data from the API, import the envirocar module, together with pandas and GeoPandas, into your Python script. You also need the coordinates of your area of interest, which you can look up e.g. on OpenStreetMap.

After installing envirocar, pandas and geopandas, request enviroCar data by adding the following code snippet to your script:


# Import envirocar classes, pandas and geopandas to your Python script
import pandas as pd
import geopandas as gpd
from envirocar import TrackAPI, DownloadClient, BboxSelector, ECConfig

# Set configuration parameters by initializing the ECConfig class
config = ECConfig()

# Initialize an instance of the TrackAPI class which handles the API access
track_api = TrackAPI(api_client=DownloadClient(config=config))


# Define a bounding box of the area which you are interested in
bbox = BboxSelector([
    7.601165771484375, # min_x
    51.94807412325402, # min_y
    7.648200988769531, # max_x
    51.97261482608728  # max_y
])


# Issue a query by calling the get_tracks method of the TrackAPI class 
# which takes the bounding box and the number of tracks as arguments 
track_df = track_api.get_tracks(bbox=bbox, num_results=10) # requesting 10 tracks inside the bbox

Check out the Jupyter notebooks in the examples folder on GitHub for examples and explanations, e.g. how to match recorded tracks to a street network (‘map matching’).
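
Once a query has returned, a typical first step is to summarize the flat frame per track. The sketch below uses only pandas and a small made-up frame with the column names from Table 1; it assumes that the time column holds ISO-formatted strings, which may not match every response.

```python
import pandas as pd

# Made-up stand-in for the frame returned by get_tracks (not real enviroCar data).
track_df = pd.DataFrame({
    "track.id": ["t1", "t1", "t1", "t2", "t2"],
    "time": [
        "2020-03-01T10:00:00", "2020-03-01T10:00:05", "2020-03-01T10:00:10",
        "2020-03-02T08:30:00", "2020-03-02T08:30:05",
    ],
    "Speed.value": [30.0, 36.0, 42.0, 10.0, 20.0],
})

# The time column has dtype object, so parse it before doing time arithmetic.
# Assumption: timestamps are ISO-formatted strings.
track_df["time"] = pd.to_datetime(track_df["time"])

# One row per track: number of speed readings, mean speed, first and last timestamp.
summary = track_df.groupby("track.id").agg(
    n_points=("Speed.value", "count"),
    mean_speed=("Speed.value", "mean"),
    start=("time", "min"),
    end=("time", "max"),
)
summary["duration_s"] = (summary["end"] - summary["start"]).dt.total_seconds()

print(summary)
```

Grouping by track.id works because every measurement row carries the metadata of the track it belongs to.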

Example analytics workflows

Users, developers and university students have recorded, shared and analyzed the open data. Several example Python scripts that analyze the data with different scopes are available on GitHub (see the links to the analysis packages in the relevant paragraphs below).

Hot Spot Analysis

Hot spot analysis is a statistical tool for identifying clusters of a specific phenomenon in a dataset. In traffic management, phenomena of interest can be, e.g., traffic density or CO2 emissions.

The research project CITRAM carried out a hot spot analysis of CO2 emissions in the city of Hamm. We presented the results of this analysis, which was based on Getis-Ord statistics, in a blog post “Hot Spot Analysis of Floating Car Data”. A study project at the University of Münster applied various hot spot analysis workflows. These included different statistics (Getis-Ord, Moran), spatial references for features (points, polygons, rectangular grids), weights (network distance, travel times) and phenomena (speed, CO2 emissions, vehicle stopping time). The code is available on GitHub.
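
To give an idea of what a Getis-Ord statistic computes, here is a minimal NumPy sketch of the local Gi* statistic on a rectangular grid. It is not the code used in the projects above: the binary queen-contiguity weights, the function names and the toy grid values are assumptions made for illustration.

```python
import numpy as np

def queen_weights(n_rows, n_cols):
    """Binary weights: each cell neighbors itself and its up to 8 surrounding cells."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i, rr * n_cols + cc] = 1.0
    return w

def getis_ord_gi_star(values, weights):
    """Local Gi* z-scores for a 1-D array of values and an (n, n) weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)
    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Toy 5x5 grid of a phenomenon (e.g. mean CO2 per cell, made-up numbers)
# with a cluster of high values in the top-left corner.
grid = np.ones((5, 5))
grid[:2, :2] = 10.0

z = getis_ord_gi_star(grid.ravel(), queen_weights(5, 5)).reshape(grid.shape)
print(np.round(z, 2))
```

Cells with large positive z-scores sit in neighborhoods whose values are high relative to the whole grid (hot spots), while large negative scores indicate cold spots. A real analysis would also need to aggregate the track points into grid cells first and to handle multiple testing.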

Traffic safety

If you are interested in assessing traffic safety, try the traffic safety analysis package. This package provides multiple tools to determine indicators for traffic safety, such as lucky escapes, black spots, cold spots and speeding points. In addition, you can integrate OpenStreetMap and weather data to analyze spatio-temporal patterns of accidents. To dive deeper into (statistical) features of accidents, you can create statistical models with neural networks to compute probabilities, categories, trends and patterns. Check out the repository and relevant Jupyter Notebooks on GitHub.

Fuel consumption

You can also use data from the enviroCar platform to analyze the fuel consumption along the car tracks. We presented an evaluation of a GPS-based fuel consumption model at EGU 2020. A dedicated Python library provides tools to estimate GPS-based fuel consumption and to understand its sensitivity to different model parameters. The library is based on preliminary work done during a study project at the University of Münster.
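
The idea of deriving an energy demand from vehicle movement can be illustrated with a generic textbook road-load model. The sketch below is not the model evaluated above or implemented in the library: the vehicle parameters, the function names, the assumption that speeds are given in km/h at a fixed sampling interval, and the omission of road slope and engine efficiency are all simplifications chosen for illustration.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2

@dataclass
class VehicleParams:
    """Hypothetical parameters of a mid-size car (assumed values)."""
    mass_kg: float = 1500.0
    drag_coefficient: float = 0.30
    frontal_area_m2: float = 2.2
    rolling_resistance: float = 0.012
    air_density: float = 1.2  # kg/m^3

def tractive_power_w(speed_ms, accel_ms2, params):
    """Power at the wheels on a flat road; braking phases are clipped to zero."""
    inertia = params.mass_kg * accel_ms2
    drag = (0.5 * params.air_density * params.drag_coefficient
            * params.frontal_area_m2 * speed_ms ** 2)
    rolling = params.mass_kg * G * params.rolling_resistance
    return max((inertia + drag + rolling) * speed_ms, 0.0)

def track_energy_j(speeds_kmh, dt_s, params=None):
    """Sum the tractive energy over consecutive speed samples spaced dt_s seconds apart."""
    params = params or VehicleParams()
    speeds = [v / 3.6 for v in speeds_kmh]  # km/h -> m/s
    energy = 0.0
    for v_prev, v_next in zip(speeds, speeds[1:]):
        accel = (v_next - v_prev) / dt_s
        v_mid = 0.5 * (v_prev + v_next)
        energy += tractive_power_w(v_mid, accel, params) * dt_s
    return energy

# Made-up speed profile: accelerate, cruise, brake (one sample per second).
profile_kmh = [0, 10, 20, 30, 36, 36, 36, 20, 0]
print(f"{track_energy_j(profile_kmh, dt_s=1.0) / 1000:.1f} kJ")
```

Turning such an energy demand into a fuel or CO2 estimate requires further assumptions, e.g. on engine efficiency and fuel properties, which is where the sensitivity to model parameters becomes relevant.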

Exploratory Data Analysis and Preprocessing

If the data contain errors, outliers, missing values and noise (e.g. due to measurement distortions), analysis results will likely be of poorer quality and patterns may not be detected (i.e. ‘garbage in, garbage out’). Thus, we recommend that you first view and understand the data and then preprocess them. For this, you can use the functionality in the eda_quality repository on GitHub. It provides several tools specifically implemented for enviroCar data to help you gain insight into the data’s structure and information content, e.g. by visualizing tracks and viewing descriptive statistics. In addition, there are tools to assess the data quality and to apply corrections, e.g. by detecting outliers, duplicates and implausible values (such as negative speed values). The repository also provides some simple tools to prepare the data for machine learning. Check out the relevant Jupyter notebooks for more information.
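
The kind of checks described above can be sketched with plain pandas. The example below does not use the eda_quality API; it flags duplicates, negative speeds and outliers on a small made-up frame, and the 1.5 × IQR outlier rule is just one common choice assumed here.

```python
import pandas as pd

# Made-up measurements: one duplicated id, one negative speed, one extreme value.
df = pd.DataFrame({
    "id": ["m1", "m2", "m2", "m3", "m4", "m5", "m6"],
    "Speed.value": [48.0, 50.0, 50.0, -5.0, 52.0, 49.0, 400.0],
})

# 1. Duplicates: repeated measurement ids, keeping the first occurrence.
is_duplicate = df.duplicated(subset="id", keep="first")

# 2. Implausible values: a speed cannot be negative.
is_negative = df["Speed.value"] < 0

# 3. Outliers: 1.5 * IQR rule, with quartiles taken from the rows that passed
#    the first two checks so that obvious errors do not distort the bounds.
plausible = df.loc[~is_duplicate & ~is_negative, "Speed.value"]
q1, q3 = plausible.quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["Speed.value"] < q1 - 1.5 * iqr) | (df["Speed.value"] > q3 + 1.5 * iqr)

flagged = is_duplicate | is_negative | is_outlier
cleaned = df[~flagged]

print(f"flagged {int(flagged.sum())} of {len(df)} rows")
print(cleaned)
```

Whether flagged rows should be dropped, corrected or kept depends on the analysis; for example, a high speed value may be a valid motorway measurement rather than an error.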