Develop Machine Learning model

Problem description

Machine learning could significantly improve OpenEEW earthquake detection in terms of accuracy (fewer false positives and false negatives) and speed (fewer stations required to confirm an event).

OpenEEW sensors in Mexico have been transmitting unprocessed data to the cloud since 2017 and have captured hundreds of earthquakes, including several of magnitude 6 and 7. A model trained on this data could be deployed on OpenEEW networks around the world (this is not a theoretical exercise!) and save many lives.

What is the goal?

The goal is to characterize the earthquake rapidly, ideally within a few seconds of the seismic waves reaching the sensor. Large earthquakes can last for a minute or more, so we can't wait for the event to pass. The model could update its output continuously, so that increasingly accurate estimates are transmitted to nearby populations as the event goes on.

Establish that an earthquake has occurred

Initially, a model should be able to determine whether an earthquake is occurring from the accelerometer data. It should recognize that the accelerations are caused by a seismic event and not, say, a truck passing by.

The start of the earthquake should be determined from the start of the P-wave (the initial wave).
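As a baseline for P-wave onset picking, here is a minimal sketch of the classic STA/LTA (short-term average over long-term average) trigger, written with NumPy only. It is a conventional seismology technique used here for illustration, not the model this project is asking for. The window lengths, the threshold, and the choice to run it on a single mean-removed axis are assumptions to tune; ObsPy's `obspy.signal.trigger` module offers maintained implementations of the same idea.

```python
import numpy as np


def sta_lta_ratio(signal, sta_len, lta_len):
    """Return the STA/LTA energy ratio for windows ending at each sample."""
    energy = np.asarray(signal, dtype=float) ** 2
    # A cumulative sum with a leading zero turns window sums into subtractions.
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    ratio = np.zeros(len(energy))
    # The ratio stays 0 until a full long-term window is available.
    for end in range(lta_len, len(energy) + 1):
        sta = (csum[end] - csum[end - sta_len]) / sta_len
        lta = (csum[end] - csum[end - lta_len]) / lta_len
        ratio[end - 1] = sta / lta if lta > 0 else 0.0
    return ratio


def pick_onset(signal, sr, sta_s=0.5, lta_s=5.0, threshold=4.0):
    """Return the first sample index whose ratio exceeds the threshold, or None."""
    # Window lengths and threshold are illustrative assumptions, not tuned values.
    sta_len = max(1, int(sta_s * sr))
    lta_len = max(sta_len + 1, int(lta_s * sr))
    ratio = sta_lta_ratio(signal, sta_len, lta_len)
    above = np.flatnonzero(ratio > threshold)
    return int(above[0]) if above.size else None


# Synthetic trace: 20 s of weak noise with a stronger burst starting at sample 400.
rng = np.random.default_rng(0)
trace = rng.normal(0.0, 0.01, 625)
trace[400:] += rng.normal(0.0, 1.0, 225)
onset = pick_onset(trace, sr=31.25)
print("onset sample:", onset)
```

A trigger like this cannot tell an earthquake from a passing truck on its own, which is where a learned classifier would come in.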

Locate earthquake or get distance from sensor

The next challenge is to determine where the earthquake originated, or at least how far it is from the sensor. This distance is used in the next step.

Determine magnitude of earthquake

Finally, we should find the magnitude of the earthquake. For this we can use this equation:

M = 1.23 × log10(Pd) + 1.38 × log10(R) + 5.39

M = magnitude
Pd = peak amplitude of the P-wave, in centimeters
R = distance of the earthquake from the station
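The equation translates directly into code. The sketch below is one way to write it; the function name is hypothetical, and because the text does not state the unit of R, kilometers is an assumption here that should be checked against the source of the equation.

```python
import math


def estimate_magnitude(pd_cm, distance_km):
    """Magnitude from P-wave peak amplitude (cm) and station distance.

    Assumption: distance_km is in kilometers; the text does not give the unit of R.
    """
    if pd_cm <= 0 or distance_km <= 0:
        raise ValueError("pd_cm and distance_km must be positive for log10")
    return 1.23 * math.log10(pd_cm) + 1.38 * math.log10(distance_km) + 5.39


# Example: a 0.1 cm peak amplitude observed 100 distance units away.
print(round(estimate_magnitude(0.1, 100.0), 2))  # → 6.92
```

Because both terms are logarithmic, errors in the distance estimate feed straight into the magnitude: a tenfold error in R shifts M by 1.38.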


The dataset consists of one-second records of acceleration from each sensor, with one record sent every second. Each record contains:

"country_code": "mx",
"device_id": "008",
"x": [2.723, 2.293, 2.75, ...],
"y": [3.134, 1.686, -4.081, ...],
"z": [10.636, 13.891, 13.084, ...],
"device_t": 1518824421.406,
"cloud_t": 1518824421.398,
"sr": 31.25
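To show how such a record might be consumed, the sketch below parses it and derives a timestamp and a vector magnitude for each sample. The fields are wrapped in braces to make valid JSON, and each axis is cut to the three samples shown above. The helper names are hypothetical, and treating `device_t` as the time of the last sample in the record is an assumption that should be verified against the sensor firmware.

```python
import json
import math

# The record from the text, wrapped in braces, with each axis truncated
# to the three samples shown there (real records carry more samples).
RAW_RECORD = """
{
  "country_code": "mx",
  "device_id": "008",
  "x": [2.723, 2.293, 2.75],
  "y": [3.134, 1.686, -4.081],
  "z": [10.636, 13.891, 13.084],
  "device_t": 1518824421.406,
  "cloud_t": 1518824421.398,
  "sr": 31.25
}
"""


def sample_times(record):
    """Per-sample timestamps, spaced 1/sr apart and ending at device_t.

    Assumption: device_t marks the last sample of the record.
    """
    n = len(record["x"])
    dt = 1.0 / record["sr"]
    return [record["device_t"] - (n - 1 - i) * dt for i in range(n)]


def vector_magnitude(record):
    """Euclidean norm of the three acceleration axes for each sample."""
    return [
        math.sqrt(x * x + y * y + z * z)
        for x, y, z in zip(record["x"], record["y"], record["z"])
    ]


record = json.loads(RAW_RECORD)
for t, a in zip(sample_times(record), vector_magnitude(record)):
    print(f"{t:.3f}  {a:.3f}")
```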

Each device also has a fixed location. You can query the latitude and longitude for each device_id from the locations file in the registry.
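With a device's coordinates and a known epicenter, the station-to-epicenter distance can be computed and used as a training target for the distance step. The sketch below uses the haversine great-circle formula; it ignores earthquake depth, and the coordinates in the usage lines are made-up placeholders, not real device or event locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical station and epicenter coordinates, for illustration only.
station = (16.85, -99.88)
epicenter = (16.20, -98.00)
print(f"{haversine_km(*station, *epicenter):.1f} km")
```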

Labelling earthquakes

You can label seismic events in the dataset by cross-referencing with the Mexican Seismological Agency (SSN), which has a very good catalog that can be downloaded as a CSV file.

Please note that, whilst very good, this catalog is not complete: small events can be missed, for example during vigorous aftershock sequences.
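One simple way to cross-reference is to label each record with the most recent catalog event whose origin time falls within a fixed window before the record. The sketch below assumes the catalog has already been reduced to a list of origin times in epoch seconds; the function name and the 120-second window are hypothetical, and the approach ignores the travel time of seismic waves to each station.

```python
from bisect import bisect_right


def label_records(record_times, event_origin_times, window_s=120.0):
    """For each record time, return the origin time of the most recent event
    that started at most window_s seconds earlier, or None if there is none."""
    origins = sorted(event_origin_times)
    labels = []
    for t in record_times:
        k = bisect_right(origins, t)  # number of events at or before t
        if k > 0 and t - origins[k - 1] <= window_s:
            labels.append(origins[k - 1])
        else:
            labels.append(None)
    return labels


# Made-up times: two catalog events and five one-second records.
events = [1000.0, 5000.0]
records = [900.0, 1000.0, 1100.0, 1200.0, 5050.0]
print(label_records(records, events))
# → [None, 1000.0, 1000.0, None, 5000.0]
```

Records labeled `None` are not guaranteed to be event-free, since the catalog can miss small events.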

Performance metric

Reference projects

Obspy - Picking/Triggering with python library

CQSRG - Locating an earthquake

Isaac Kim - Fast P-wave detection

H.Serdar Kuyuk - Real-time classification of earthquake using deep learning

Zachary Ross - Phase detection


As a member of the NVIDIA Inception program, we can support promising advances in ML using OpenEEW data with access to GPU cloud virtual machines. Please write to us if you are interested in applying.