The project is currently in early beta, which means that features can be added, removed, or changed in backwards-incompatible ways.

We are very excited to announce RunPandas 0.3. This release comes with new features and fixes. Here are the highlights:

  • Support for moving metrics, with the capability of detecting periods of inactivity.
  • Support for computing general running statistics such as total elapsed time and moving time.
  • Support for imputed statistics: speed in m/s, total distance, and distance per position.
  • Added a Zenodo DOI badge.

What is Runpandas?

Runpandas is a Python package built on the pandas data analysis library. It makes it easier to analyze your running sessions stored in tracking files from cellphones and GPS smartwatches, or in social sports applications such as Strava, MapMyRun, and Nike Run Club. It is designed for reading, transforming, and running metrics analytics on tracking files from several devices and apps.

Main Features

Support for calculated running metrics: total elapsed time, speed, and total distance

The Activity dataframe now contains special properties that present statistics from the workout, such as the elapsed time, the speed, and the distance of the workout in meters.

# Disable INFO logging for cleaner output
import logging
logging.getLogger().setLevel(logging.CRITICAL)
# !pip install runpandas
import runpandas as rpd
activity = rpd.read_file('./data/sample.tcx')

The total elapsed time is the duration from the moment you hit start on your device until the moment you finish the activity. The total distance is the number of meters run by the athlete in the activity. The speed is measured in meters per second and is returned as a runpandas.MeasureSeries.Speed series, computed as the distance traveled per record divided by the number of seconds taken to cover it.

Occasionally, some observations such as speed and distance must be calculated from the data available in the given activity. In runpandas there are special accessors (runpandas.acessors) that compute some of these metrics. Here we compute the distance per position and the speed from the latitude and longitude of each record: the haversine distance in meters and the speed in meters per second.

#total time elapsed for the activity
print(activity.ellapsed_time)
#distance of workout in meters
print(activity.distance)
0 days 00:33:11
4686.31103516
#compute the distance using haversine formula between two consecutive latitude, longitudes observations.
activity['distpos']  = activity.compute.distance()
activity['distpos'].head()
time
00:00:00          NaN
00:00:01     0.333146
00:00:06     1.678792
00:00:12    11.639901
00:00:16     9.183847
Name: distpos, dtype: float64
#compute the speed using the distance per position and the time recorded in seconds to run it.
activity['speed']  = activity.compute.speed(from_distances=True)
activity['speed'].head()
time
00:00:00         NaN
00:00:01    0.333146
00:00:06    0.335758
00:00:12    1.939984
00:00:16    2.295962
Name: speed, dtype: float64
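
To make the two computations above more concrete, here is a minimal sketch of a haversine distance between consecutive records and the speed derived from it, written with plain pandas and NumPy. It is not the runpandas implementation: the helper haversine_m, the constant EARTH_RADIUS_M, the column names lat and lon, and the sample coordinates are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


# Hypothetical track indexed by elapsed time (values invented for this sketch)
track = pd.DataFrame(
    {
        "lat": [35.951880, 35.951880, 35.951890, 35.951970],
        "lon": [-79.093150, -79.093146, -79.093130, -79.093050],
    },
    index=pd.to_timedelta([0, 1, 6, 12], unit="s"),
)
track.index.name = "time"

# Distance from the previous record to the current one (NaN for the first record)
track["distpos"] = haversine_m(
    track["lat"].shift(1), track["lon"].shift(1), track["lat"], track["lon"]
)

# Seconds elapsed since the previous record, then speed in m/s
seconds = track.index.to_series().diff().dt.total_seconds()
track["speed"] = track["distpos"] / seconds

print(track[["distpos", "speed"]])
```

The first record has no predecessor, so both its distance and its speed are NaN, which matches the shape of the runpandas output shown above.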

Runpandas also provides special attributes on runpandas.MeasureSeries that compute transformations, such as converting speed from m/s to km/h.

#kph property that converts m/s to km/h.
activity['speed'].kph
time
00:00:00          NaN
00:00:01     1.199325
00:00:06     1.208731
00:00:12     6.983941
00:00:16     8.265462
              ...    
00:32:51    11.864734
00:32:56    12.001946
00:33:02    13.600020
00:33:07    13.428554
00:33:11    13.155589
Name: speed, Length: 383, dtype: float64
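
The conversion itself is a multiplication by 3.6, since 1 m/s equals 3600 m per hour, or 3.6 km/h. The sketch below reproduces it on a plain pandas Series using the first speed values printed earlier. The constant name MS_TO_KMH is an assumption for illustration, and the results only agree with the output above to about five decimal places because the printed inputs are rounded.

```python
import pandas as pd

MS_TO_KMH = 3.6  # 1 m/s = 3600 m/h = 3.6 km/h

# First non-NaN speed values (m/s) copied from the output above
speed_ms = pd.Series([0.333146, 0.335758, 1.939984, 2.295962], name="speed")

# Element-wise conversion to km/h
speed_kmh = speed_ms * MS_TO_KMH
print(speed_kmh)
```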

Support for detecting periods of inactivity (moving time)

Advanced tracking devices are capable of estimating the time during which the runner was active: they calculate the moving time based on the GPS locations, distance, and speed of the activity. The athlete can also use the pause button to deliberately pause the activity for any reason (stoplights, active rests, bathroom stops, or even stopping for photos).

Runpandas attempts to calculate the moving time from the metrics available in the activity by detecting all the periods of inactivity. A record is treated as inactive when its speed (derived from the distance recorded) falls below a specified threshold. This is a powerful metric because it lets runners see their real performance, removing any bias related to stopped periods. It is also quite popular on several tracking platforms such as Garmin and Strava.

With the new auxiliary dataframe method Activity.only_moving, runpandas detects the periods of inactivity and returns a runpandas.Activity dataframe with a special column named moving, indexed by the Activity's TimeIndex. This column is a pandas.Series of booleans in which False marks the records considered stopped. Boolean indexing on it makes it easy to build quick filters that ignore any observations the algorithm considers inactive.

activity_only_moving = activity.only_moving()
print(activity_only_moving['moving'].head())
time
00:00:00    False
00:00:01    False
00:00:06    False
00:00:12     True
00:00:16     True
Name: moving, dtype: bool

Now we can compute the stopped time and the moving time.

print('The stopped period:', activity_only_moving[activity_only_moving['moving'] == False].index.sum())
The stopped period: 0 days 00:00:07
print('The moving time:', activity_only_moving.moving_time)
The moving time: 0 days 00:33:05
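
The threshold idea described above can be sketched with plain pandas as follows. This is an illustration under stated assumptions, not the runpandas algorithm: the helper flag_moving and the 0.5 m/s threshold are hypothetical, each record is assumed to account for the time since the previous record, and the speed values are the first five rows printed earlier.

```python
import pandas as pd


def flag_moving(speed_ms, threshold=0.5):
    """Return True where speed is at or above the (hypothetical) threshold.

    Records with a missing speed are treated as stopped.
    """
    return speed_ms.fillna(0.0) >= threshold


# First five speed values (m/s) from the output above, indexed by elapsed time
speed = pd.Series(
    [float("nan"), 0.333146, 0.335758, 1.939984, 2.295962],
    index=pd.to_timedelta([0, 1, 6, 12, 16], unit="s"),
    name="speed",
)

moving = flag_moving(speed)

# Time attributed to each record: the gap since the previous record
step = speed.index.to_series().diff().fillna(pd.Timedelta(0))

# Boolean indexing splits the elapsed time into moving and stopped parts
moving_time = step[moving].sum()
stopped_time = step[~moving].sum()

print(moving.tolist())
print("moving:", moving_time, "stopped:", stopped_time)
```

With this assumed threshold, the flags for these five records come out the same as the moving column shown above. Summing the per-record gaps rather than the index labels keeps the moving and stopped parts adding up to the elapsed time of the slice.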

What is coming next?

We will add several running metrics and statistics to our activities and measure series in order to give users deeper insight into their running activities. These will include heart rate zones, average speed, personal best records per distance, and more!

Thanks

We are constantly developing Runpandas, improving its existing features and adding new ones. We would be glad to hear what you like or don’t like, and what features you would like to see in upcoming releases. Please feel free to contact us.