Are tracks longer than routes?

This paper distinguishes between routes that are drawn over maps and tracks that are recorded with GPS devices. Both routes and tracks are polygonal path representations of the real paths that we travel over the earth. Tracks are very handy for obtaining distance and elevation gain. They can be downloaded from sites where users share the tracks from their trips. They have the advantage over routes of being an actual record of the device's path while the user hiked a trail. Routes are still necessary when there is no previously recorded track.

Routes have practical limitations on the number of route points that can be drawn. They are generally subject to bias from undersampling. Tracks are subject to undersampling error to a lesser extent than routes, tracks usually being more detailed with more track points. In addition, tracks are subject to satellite error on the order of 5 or 10 meters per track point. If the device records at too high a sample rate, the track will overestimate distance and cumulative elevation gain due to the accumulation of satellite errors.
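The two error sources described above can be illustrated with a small simulation. This is a minimal sketch, not part of the analysis: the function names `chord_length` and `noisy_length` are hypothetical, the path shapes are made up, and the position errors are assumed to be independent Gaussian noise. Real satellite errors are usually correlated between successive fixes, so the inflation shown here is likely an upper bound on the effect.

```r
# Undersampling: approximate a semicircular path of radius r (meters)
# with k straight segments. The polyline is inscribed in the arc, so it
# can only be shorter than the true path.
chord_length <- function(k, r = 1000) {
  theta <- seq(0, pi, length.out = k + 1)
  x <- r * cos(theta)
  y <- r * sin(theta)
  sum(sqrt(diff(x)^2 + diff(y)^2))
}

# Satellite error: a straight path sampled at n_points, with independent
# Gaussian position error (assumed sd in meters) added to every point.
noisy_length <- function(n_points, sd_m = 5, true_length_m = 1000) {
  x <- seq(0, true_length_m, length.out = n_points) + rnorm(n_points, sd = sd_m)
  y <- rnorm(n_points, sd = sd_m)
  sum(sqrt(diff(x)^2 + diff(y)^2))
}

# Ratio of polyline length to true arc length for 2, 8 and 64 segments.
ratios <- sapply(c(2, 8, 64), chord_length) / (pi * 1000)
print(ratios)

# Measured length of a 1000 m straight path at increasing sample density.
set.seed(1)
noisy <- sapply(c(11, 101, 1001), noisy_length)
print(noisy)
```

The ratios stay below 1 and approach 1 as points are added, while the noisy lengths grow as the spacing between points shrinks toward the size of the position error.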

In summary, there is no reason to believe that routes and tracks will give the same unbiased estimate of distance and elevation gain. The purpose of this research is to assess the magnitude by which route-based estimates differ from track-based estimates for a sample of data that has been used for planning or could be used for planning. Using paired comparisons, we take the difference between the length measured from a track and the length measured from a route, both being representations of the same earth path. We find that length measurements from tracks average 0.65 miles longer than those from routes. This difference is about 12% of the average length of the routes in the data set. Using the same methodology, we find that elevation gain measurements do not differ significantly between routes and tracks.

Dependencies

The R code in this document depends on the following packages:

library('knitr')
library('dplyr')
library('ggplot2')
library('lmtest')
library('sandwich')

Route versus track data set

For this investigation, routes were existing designated routes on gaiagps.com or were drawn using gaiagps.com tools. Each route is compared with a corresponding track that had been recorded and uploaded by a GaiaGps user. There are 11 “designated” routes, 13 “free-drawn” routes and 10 “assisted-drawn” routes. Designated routes are routes already stored at gaiagps.com and available to users for information about established trips. For “free-drawn” routes, I sketched cross-country routes over terrain using gaiagps.com drawing tools. For “assisted-drawn” routes, I drew the routes using the GaiaGps snap-to-trail feature. This data set is a convenience sample of past or planned backpacking trips in the Sierra Nevada and Arizona.

For each route, I measured the length in miles (route.length), cumulative elevation gain in feet (route.gain) and number of points (route.points). I measured the length, gain and number of points for the corresponding track (track.length, track.gain, track.points). See (Ahrens 2018) for definitions and formulas for these variables. The route and track attributes were measured using GpsPrune software (Workshop 2018).
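To make the measured quantities concrete, here is a minimal sketch of how length and cumulative elevation gain are commonly computed from a list of points. These are generic textbook definitions (haversine distance between consecutive points, sum of positive elevation steps). They are an assumption for illustration and are not necessarily the formulas in (Ahrens 2018) or the ones GpsPrune applies. The function names and the earth radius value are hypothetical choices.

```r
# Path length in miles: sum of great-circle (haversine) distances between
# consecutive points. lat and lon are in decimal degrees.
path_length_miles <- function(lat, lon, radius_miles = 3958.8) {
  phi <- lat * pi / 180
  lambda <- lon * pi / 180
  dphi <- diff(phi)
  dlambda <- diff(lambda)
  a <- sin(dphi / 2)^2 +
    cos(head(phi, -1)) * cos(tail(phi, -1)) * sin(dlambda / 2)^2
  sum(2 * radius_miles * asin(sqrt(pmin(1, a))))
}

# Cumulative elevation gain: sum of the uphill steps only.
cumulative_gain <- function(elev) {
  sum(pmax(diff(elev), 0))
}

# One degree of longitude along the equator.
print(path_length_miles(c(0, 0), c(0, 1)))

# Up 50, down 30, up 80: the gain counts only the two climbs.
print(cumulative_gain(c(100, 150, 120, 200)))
```

Because the gain sums every uphill step, small elevation errors at each point accumulate in the same way that position errors accumulate in length.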

The format and listing of the data set are given in the section “Route versus track data set details” below.

# load the data
rvst <- read.csv('route_vs_track_data.csv')

Differenced length of route or track

We analyze the routes and tracks as paired comparisons. The difference in length (length_diff) is track.length - route.length.
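The models below use a data frame named rvst_diff, but the step that creates it is not shown. The following is an assumed reconstruction that mirrors the way rvst_gain is built in the elevation gain section. The helper name `add_length_diff` is hypothetical.

```r
library(dplyr)

# Assumed reconstruction: add the paired difference
# length_diff = track.length - route.length to the data set.
add_length_diff <- function(rvst) {
  rvst %>% mutate(length_diff = track.length - route.length)
}
```

With the data loaded as above, `rvst_diff <- add_length_diff(rvst)` would give the data frame that the regressions refer to.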

Summary of length difference (track - route) for three route types

Virtually all tracks are longer than their corresponding routes. The length difference is larger and has more variation for designated routes than for the other two route types.

One-way ANOVA of length difference

The simplest model treats each of the groups as independent random samples. We can test for a dependence of the length difference on route type. This uses a heteroskedasticity-robust estimator of covariance and tests the route.type coefficients as a group.

lm_ld_routetype <- lm(length_diff ~ route.type, rvst_diff)
waldtest(lm_ld_routetype, length_diff ~ 1, vcov=vcovHC)
ct <- coeftest(lm_ld_routetype, vcov=vcovHC)
ct
## 
## t test of coefficients:
## 
##                      Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)           0.64800    0.10584  6.1225 8.664e-07 ***
## route.typedesignated  0.34564    0.26068  1.3259    0.1946    
## route.typefree_drawn  0.24354    0.22639  1.0757    0.2904    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The Wald test did not find sufficient evidence for a dependence of length difference on route type.

## 95% confidence interval for intercept: 
##  0.4321392 0.8638608
## Intercept as percent of average route length: 
##  12.41309
## Standard error of regression: 0.6311838

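The intercept coefficient is significant. With R's default treatment coding, this intercept is the mean length difference for the reference route type (the assisted-drawn routes, the one level without its own coefficient in the table above). There is sufficient evidence of a positive difference between track lengths and route lengths. The point estimate of the difference is 0.65 miles with a confidence interval (95%) on the difference of (0.43, 0.86) miles. As a percentage of the average route length, the 0.65-mile difference is about 12%.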

The standard error of this regression is 0.63 miles, almost as large as the assessed mean difference.
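The interval and percentage quoted above can be recomputed from the printed coefficient table. This sketch is an arithmetic check only. It assumes a t interval with 31 residual degrees of freedom (34 observations minus 3 coefficients) and uses the rounded mean route length of 5.220 miles from the data set summary, so the last digits can differ slightly from the reported output.

```r
# Values copied from the coeftest output for the intercept.
estimate <- 0.64800
std_error <- 0.10584

# Assumption: residual degrees of freedom = 34 observations - 3 coefficients.
df_resid <- 34 - 3

# 95% confidence interval: estimate +/- t quantile * robust standard error.
ci <- estimate + c(-1, 1) * qt(0.975, df_resid) * std_error
print(round(ci, 2))

# Intercept as a percentage of the (rounded) mean route length.
pct <- 100 * estimate / 5.220
print(round(pct, 1))
```

Under these assumptions the result agrees with the reported interval of (0.43, 0.86) miles and the figure of about 12%.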

Controlling for route length and number of points could change results. Here is a general model that includes route.length, route.points and track.points as explanatory variables.

lm_ld <- lm(length_diff ~ route.type + route.length + route.points + track.points, rvst_diff)
coeftest(lm_ld, vcov=vcovHC)
## 
## t test of coefficients:
## 
##                         Estimate  Std. Error t value Pr(>|t|)   
## (Intercept)           0.17209101  0.26686817  0.6449 0.524271   
## route.typedesignated -0.37319934  0.26967334 -1.3839 0.177324   
## route.typefree_drawn  0.14981754  0.24615608  0.6086 0.547677   
## route.length          0.04245285  0.06881196  0.6169 0.542259   
## route.points         -0.00078005  0.00027768 -2.8092 0.008954 **
## track.points          0.00061272  0.00019229  3.1865 0.003523 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
waldtest(lm_ld, . ~ . - route.points - track.points, vcov=vcovHC)

Now we find that the difference between the lengths of a route and its track can be explained by the difference in the numbers of route points and track points. When these variables are introduced, the intercept is no longer significant. However, the variables route.points and track.points as a group have an F-test significance of p < 0.001.

Holding other variables fixed, an increase of 100 in the number of route points is associated with a decrease in length difference of 0.08 miles. An increase of 100 in the number of track points is associated with an increase in length difference of 0.06 miles. Indeed, the similar magnitudes and opposite signs of these coefficients suggest that track.points - route.points might be a suitable predictor by itself.

Is track.points - route.points a significant explanatory variable?

We will try substituting track.points - route.points for track.points in the last model.

rvst_diff2 <- rvst_diff %>% mutate(delta.points=track.points - route.points)
lm_deltap <- lm(length_diff ~ route.type + route.length + route.points + delta.points, rvst_diff2)
coeftest(lm_deltap, vcov=vcovHC)
## 
## t test of coefficients:
## 
##                         Estimate  Std. Error t value Pr(>|t|)   
## (Intercept)           0.17209101  0.26686817  0.6449 0.524271   
## route.typedesignated -0.37319934  0.26967334 -1.3839 0.177324   
## route.typefree_drawn  0.14981754  0.24615608  0.6086 0.547677   
## route.length          0.04245285  0.06881196  0.6169 0.542259   
## route.points         -0.00016733  0.00033899 -0.4936 0.625434   
## delta.points          0.00061272  0.00019229  3.1865 0.003523 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

delta.points is the only significant term in this model. An increase of 100 points in the difference between numbers of track points and route points results in an increase in length difference of 0.06 miles.
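The two coefficient tables are reparameterizations of the same fit. Writing b_r and b_t for the route.points and track.points coefficients of the earlier model, b_r * route.points + b_t * track.points equals (b_r + b_t) * route.points + b_t * delta.points. This is why the delta.points row repeats the earlier track.points row and only the route.points row changes. The sketch below checks the identity against the reported numbers and then on simulated data. The simulated data frame `toy` is hypothetical and is not the rvst data.

```r
# Reported coefficients from the model with route.points and track.points.
b_route <- -0.00078005
b_track <- 0.00061272

# Implied route.points coefficient after substituting delta.points.
b_route_new <- b_route + b_track
print(b_route_new)

# The same identity on simulated (hypothetical) data.
set.seed(42)
n <- 34
toy <- data.frame(
  route.points = runif(n, 50, 500),
  track.points = runif(n, 500, 3000)
)
toy$delta.points <- toy$track.points - toy$route.points
toy$y <- 0.2 - 0.0008 * toy$route.points + 0.0006 * toy$track.points +
  rnorm(n, sd = 0.5)

m1 <- lm(y ~ route.points + track.points, toy)
m2 <- lm(y ~ route.points + delta.points, toy)

# Differences between the two parameterizations (both numerically zero).
d_delta <- unname(coef(m2)["delta.points"] - coef(m1)["track.points"])
d_route <- unname(coef(m2)["route.points"] -
  (coef(m1)["route.points"] + coef(m1)["track.points"]))
print(c(d_delta, d_route))
```

Because the fit is unchanged, the substitution does not add explanatory power. It only isolates the part of the route.points effect that is not already carried by the difference in point counts.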

Estimation model

A practical model for estimation leaves out all of the explanatory variables except delta.points.

lm_deltap2 <- lm(length_diff ~ delta.points, rvst_diff2)
summary(lm_deltap2)
## 
## Call:
## lm(formula = length_diff ~ delta.points, data = rvst_diff2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.27927 -0.24177 -0.06035  0.29640  1.49575 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.3504808  0.1526144   2.297 0.028341 *  
## delta.points 0.0005454  0.0001346   4.051 0.000304 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5184 on 32 degrees of freedom
## Multiple R-squared:  0.3389, Adjusted R-squared:  0.3183 
## F-statistic: 16.41 on 1 and 32 DF,  p-value: 0.0003038
coeftest(lm_deltap2, vcov=vcovHC)
## 
## t test of coefficients:
## 
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.35048076 0.13142401  2.6668 0.011910 *  
## delta.points 0.00054539 0.00013773  3.9599 0.000392 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

An increase of 100 points in the difference between numbers of track points and route points results in an increase in length difference of 0.05 miles, and this slope is very significant (p < 0.0004).

This model suggests that we might make length corrections, based on the number of points, with the idea of standardizing length measurements to some fixed density of track points. However, after controlling for the difference in the number of points, tracks still average 0.35 miles longer than routes (p < 0.012). Moreover, the residual standard error of the model is still 0.52 miles.
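As an illustration of such a correction, the fitted line can be evaluated directly for a given difference in point counts. This is a minimal sketch using the reported coefficients. The function name `predict_length_diff` is hypothetical, and with a residual standard error of 0.52 miles the output is a rough expected value rather than a reliable correction for any single trip.

```r
# Expected track-minus-route length difference (miles) for a given
# delta.points, using the coefficients reported for lm_deltap2.
predict_length_diff <- function(delta_points,
                                intercept = 0.35048076,
                                slope = 0.00054539) {
  intercept + slope * delta_points
}

# Expected difference when the track has 0, 500 or 1000 more points.
pred <- predict_length_diff(c(0, 500, 1000))
print(round(pred, 2))
```

With the model object available, `predict(lm_deltap2, newdata = data.frame(delta.points = c(0, 500, 1000)))` should return the same values up to rounding of the coefficients.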

Here are the residual plots for the last model.

plot(lm_deltap2)

Referring to the plot of standardized residuals versus fitted values, the variance of the residuals may be smaller for small length differences than for others. The distribution of these residuals is consistent with the normal distribution.
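The visual impressions above could be backed by formal tests. The sketch below is a suggestion and was not part of the original analysis: it applies the Breusch-Pagan test from lmtest for non-constant variance and the Shapiro-Wilk test from base R for normality of the residuals. The helper name `check_residuals` is hypothetical, and the example call uses R's built-in cars data only to show that the function runs.

```r
library(lmtest)

# Return p-values for two residual diagnostics of a fitted lm object:
#   bp: Breusch-Pagan test (null hypothesis: constant error variance)
#   sw: Shapiro-Wilk test (null hypothesis: normally distributed residuals)
check_residuals <- function(model) {
  list(
    bp = unname(bptest(model)$p.value),
    sw = shapiro.test(residuals(model))$p.value
  )
}

# Example on a built-in data set (not the rvst data).
demo_fit <- lm(dist ~ speed, data = cars)
demo_check <- check_residuals(demo_fit)
print(demo_check)
```

Calling `check_residuals(lm_deltap2)` would apply the same checks to the estimation model. Small p-values would count against constant variance or normality respectively.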

Analysis of elevation gain

We repeat the above analysis, but using track.gain - route.gain as the response variable.

Elevation gain difference is gain_diff = track.gain - route.gain.

rvst_gain <- rvst %>% mutate(gain_diff=track.gain - route.gain)

Here is a boxplot of the gain difference.

# box and whisker of differences. 
ggplot(rvst_gain, aes(x=route.type, y=gain_diff)) +
  geom_boxplot()

Summary of gain difference (track - route) for three route types

Elevation gain for tracks tends to be higher than for routes, but it is not uniformly so. The data exhibits variation that depends on route type.

This is the one-way ANOVA on the elevation gain response.

lm_gain_routetype <- lm(gain_diff ~ route.type, rvst_gain)
coeftest(lm_gain_routetype, vcov=vcovHC)
## 
## t test of coefficients:
## 
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)            -0.300     16.148 -0.0186   0.9853
## route.typedesignated   32.391     50.009  0.6477   0.5219
## route.typefree_drawn   26.838     58.190  0.4612   0.6479

We did not detect a difference in elevation gain between routes and tracks.

lm_gain <- lm(gain_diff ~ route.type + route.length + route.points + track.points, rvst_gain)
coeftest(lm_gain, vcov=vcovHC)
## 
## t test of coefficients:
## 
##                         Estimate  Std. Error t value Pr(>|t|)
## (Intercept)          -119.417321   79.978579 -1.4931   0.1466
## route.typedesignated  -57.175043   66.617285 -0.8583   0.3980
## route.typefree_drawn   55.284721   89.295780  0.6191   0.5408
## route.length           36.997371   32.102799  1.1525   0.2589
## route.points           -0.087173    0.083175 -1.0481   0.3036
## track.points           -0.029150    0.108333 -0.2691   0.7898

We detect no intercept or trend in elevation gain difference when controlling for other explanatory variables.

Conclusion

Tracks are longer than their corresponding routes. The only external factor that we found that significantly explains the variation in length difference is the relative number of track and route points.

Tracks and routes did not show any net difference with regard to elevation gain.

This analysis does not settle which method has less bias relative to the length of the real earth path. The data set does not have a reference with which to compare measured length. Moving forward, we need to assume that both sources of measurement have some bias. The analysis does show that methods for estimating path length can differ by 0.65 miles or about 12% of the estimated length.

Route versus track data set details

The rvst data set consists of 34 observations on 8 variables. Format is .csv.

Designated routes are routes already stored at gaiagps.com and available to users for information about established trips. “Free-drawn” routes are sketched cross-country routes over terrain using gaiagps.com drawing tools. “Assisted-drawn” routes are drawn using the GaiaGps snap-to-trail feature.

See (Ahrens 2018) for formulas for the numeric variables.

# load and look at the data
rvst <- read.csv('route_vs_track_data.csv')
#kable(rvst, caption =  "Route versus Track Data Set")
rvst
summary(rvst %>% transmute(route.length=route.length, track.length=track.length))
##   route.length     track.length   
##  Min.   : 1.990   Min.   : 2.480  
##  1st Qu.: 2.910   1st Qu.: 3.630  
##  Median : 3.975   Median : 5.275  
##  Mean   : 5.220   Mean   : 6.073  
##  3rd Qu.: 7.530   3rd Qu.: 8.803  
##  Max.   :10.800   Max.   :13.000

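Note that the mean length of a route is 5.22 miles, while the mean length of a track is 6.07 miles. This raw difference of about 0.85 miles pools all three route types, whereas the 0.65-mile intercept of the one-way ANOVA is the mean difference for the reference route type only.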

References

Ahrens, F. 2018. “Documentation of the Fredsempirical Formula for Estimating Travel Times from Route Distance.”

Workshop, Activity. 2018. “GpsPrune.” Available at http://activityworkshop.net/software/gpsprune/ (2019/11/10).