Day’s Verse:
Remember this: Whoever sows sparingly will also reap sparingly, and whoever sows generously will also reap generously.
2 Cor. 9:6

What do I do when things at work are blah?

PLAY WITH DATA, OF COURSE!

I have this handy-dandy GPS device that records data on every bike ride that I remember to turn it on for. That means that when I say to myself, “It feels like I’m not getting any faster on my commutes home from January to now,” I don’t have to go based on gut feeling. I actually have a way of determining whether that is accurate or not. Enter lots of data mining over the course of a few days, some fiddling around with Excel 2007, and voila! (Note: I have hidden most of this post under the fold, as it’s long, math-and-science heavy, and probably not of interest to most of you.)

HYPOTHESES
Hypothesis 1: My average speed has increased from January, 2010 to August, 2010.
Hypothesis 2: I ride faster in the mornings than in the evenings.
Hypothesis 3: I ride faster on Mondays than on Fridays.

ANALYSIS AND RESULTS

Average Speed: All Rides, All Bikes
BAW Commute Avg Speed over Time

According to Math*, I averaged 14.0 mph in January and I now average 14.7 mph. That’s for all commutes to or from the Bicycle Alliance of Washington from the time I started working there in January through August 4, 2010. Is this a significant change? And even if significant, is it meaningful? Time to apply some logic, followed by some more math. Variables to consider:

  1. Different routes: Check average speed on short rides versus long rides. The difference in distance could cause variation in speed that isn’t meaningful.
  2. Morning and evening routes: Compare AM to AM and PM to PM. I ride the same morning route the majority of the time, and I consistently ride one particular, different route home in the evening. Together they work out to be one big around-the-lake loop. Comparing average speed on 10 miles through Bellevue to a 10-mile route on the Burke-Gilman Trail isn’t exactly comparing apples to apples.
  3. Different bikes: Include only routes ridden on one type of bike. I rode the same route on 3 different bikes — Artemis, the fast bike; the Red Bike, a comfy hybrid; and Charlotte, the heavy Xtracycle-equipped bike — and riding it on Artemis versus on Charlotte can make a world of difference for average speed.

Those variables I can control for. Other possible interfering variables I can’t easily take into account, but that might have a significant impact on speed:

  1. Temperature. Although there’s some debate among bicyclists about this, I’m pretty sure that riding at 15°F is slower than riding at 70°F.
  2. Wind. Riding into a headwind both ways — which happens more than you’d expect, and far more often than I’d want (which is never) — will dramatically slow a bicyclist down. I’m not sure how I’d account for that without doing way more work than it’s worth right now.
  3. Light. Riding down a pitch-black bike path in February with only my front light for illumination requires slower speeds just from a safety standpoint; I doubt I exceeded 13 mph on full-dark evening rides. Riding in broad daylight, I don’t have to limit my speed due to sight distance.

First of all, let’s take a look at the morning versus evening routes. Splitting out the AM rides and the PM rides, I compared the mean of each using a two-sample t-test assuming equal variances (see table below).

Morning vs. Evening Average Speed
[table id=2 /]

The morning average speed was 14.49 mph; the afternoon average speed was 14.00 mph. According to the t-test, the one- and two-tail p-values were both far below 0.05 (p<0.000098 and p<0.00020, respectively). This tells us that if my morning and evening speeds were really the same, a difference this large would be very unlikely to show up by chance. Essentially, there was a statistically significant difference between my morning average speed and my afternoon average speed: I really do ride faster in the mornings than in the afternoon. Therefore, for any meaningful average speed comparison, I have to only compare morning to morning and evening to evening.
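For anyone who wants to try this outside of Excel, the equal-variance t-test described above could be sketched in Python roughly like this. The speeds are made-up numbers for illustration only, not the real ride data, and the function name `pooled_t_statistic` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2  # degrees of freedom
    # Pooled variance: the two sample variances, weighted by their sizes.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Made-up average speeds in mph, NOT the real commute data.
am_speeds = [14.6, 14.3, 14.8, 14.5, 14.2, 14.7]
pm_speeds = [14.1, 13.9, 14.0, 13.8, 14.2, 14.0]

t_stat, df = pooled_t_statistic(am_speeds, pm_speeds)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

Turning the t statistic into a p-value needs the t distribution, which Python's standard library doesn't include; a statistics package (SciPy's `scipy.stats.ttest_ind` with `equal_var=True`, for instance) would handle that last step.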

Using a linear regression for comparisons, we obtained the following results.

Linear Regression Statistics Comparing All Morning Rides
[table id=3 /]
[table id=4 /]

The important thing to notice here is the P-value for the Count: it’s very low. This means that if my speed now were really the same as my speed in January, a trend this strong would be unlikely to appear by chance: In the morning, my average speed has increased by a statistically significant amount. What about evening?

Linear Regression Statistics Comparing All Evening Rides
[table id=9 /]
[table id=10 /]

Again, look at the Count P-value: This isn’t so small. In fact, it’s greater than 0.05, so I can’t rule out that my evening speed has not changed between now and January. What an interesting result! Even though I’ve gotten faster in the morning, these data indicate that I still go the same speed in the afternoon. However, when we examined the data more carefully, we noticed three outliers: Most of the time, I averaged 14 mph in the evening; three times, I averaged closer to 11 mph. Some consideration led me to recall that on those particular evenings, I had ridden home with somebody else who averaged slower than I usually did. Those outliers were probably dragging my overall average down.
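To see how a few slow rides can muddy a trend, here is a minimal Python sketch of a least-squares line fitted to speed versus ride number, with and without the slow rides. Everything in it is an assumption for illustration: the speeds are invented, the 12 mph cutoff is a hypothetical way of flagging the ride-with-a-friend evenings, and `fit_line` is not what Excel runs under the hood.

```python
from statistics import mean


def fit_line(speeds):
    """Least-squares line through (1, speeds[0]), (2, speeds[1]), ...

    Returns (slope, intercept, r_squared).
    """
    x = range(1, len(speeds) + 1)  # rides numbered consecutively
    x_bar, y_bar = mean(x), mean(speeds)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in speeds)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, speeds))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Invented evening speeds in mph with two slow "company" rides, NOT real data.
pm_speeds = [13.9, 14.0, 14.0, 11.1, 14.1, 14.2, 11.0, 14.2, 14.3, 14.4]
SLOW_RIDE_CUTOFF = 12.0  # hypothetical threshold for flagging outliers

# Dropping rides renumbers the remaining ones consecutively.
solo_rides = [s for s in pm_speeds if s >= SLOW_RIDE_CUTOFF]

slope_all, _, r2_all = fit_line(pm_speeds)
slope_solo, _, r2_solo = fit_line(solo_rides)
print(f"all rides:  slope {slope_all:.3f} mph/ride, R^2 {r2_all:.2f}")
print(f"solo rides: slope {slope_solo:.3f} mph/ride, R^2 {r2_solo:.2f}")
```

With the slow rides left in, the scatter swamps the upward drift and the fit is poor; with them removed, the same gentle trend stands out much more clearly.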

Moving on, however, we decided to account for the difference in bike as well as time of day. Here are the comparisons for all morning rides on Artemis, followed by all evening rides on Artemis.

Linear Regression Statistics Comparing Morning Rides on Artemis
[table id=7 /]
[table id=8 /]

Linear Regression Statistics Comparing Evening Rides on Artemis
[table id=5 /]
[table id=6 /]
These results indicate that not only did I get faster in the morning — which we would have expected — but I actually did get faster in the evening (excluding those particularly slow outliers). Now, if you’re like me, you like pictures. I’ve always been able to interpret graphs better than tables. For those of you who always draw a picture, here’s a graph summarizing those results. The x-axis is time (we just numbered them consecutively from 1 to the end, since actual date didn’t matter that much) and the y-axis is average speed in mph.

BAW Commute Average Speed Timeline - AM & PM, Artemis

The blue line indicates my average speed on Artemis in the morning, and the dark blue dotted line is a linear trendline of that data. It shows that when I started riding Artemis on the morning commute, I averaged 14.1 miles per hour, and now I average 15.3 miles per hour. The R-squared value and the equation for that trendline are shown above the key, and tell us that in the mornings I’m increasing by an average of almost 0.02 miles per hour per ride.

The red line indicates my average speed on Artemis in the evening, and the dark red dotted line is a linear trendline of the data. It shows that when I started riding Artemis on the evening commute, I averaged 14.0 miles per hour, and now I average 14.7 miles per hour (as demonstrated above, a statistically significant, if realistically minute, change). The R-squared value and the equation for that trendline are shown above the key, and tell us that in the evenings I’m increasing by an average of barely 0.01 miles per hour per ride. In short, I’m getting faster faster in the mornings, and getting faster more slowly in the evenings.
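As a rough sanity check on those trendlines, the start speed, end speed, and slope together imply how many rides each line spans. The sketch below is only back-of-the-envelope arithmetic: it assumes the quoted speeds are the trendline's endpoints and treats the rounded slopes (0.02 and 0.01) as exact, so the counts are approximate. The function name is hypothetical.

```python
def implied_ride_count(start_mph, end_mph, slope_mph_per_ride):
    """Number of rides spanned by a straight trendline from start to end."""
    # The line climbs one slope-sized step between consecutive rides,
    # so N rides span N - 1 steps.
    return (end_mph - start_mph) / slope_mph_per_ride + 1


# Morning trendline: 14.1 mph rising to 15.3 mph at about 0.02 mph per ride.
print(round(implied_ride_count(14.1, 15.3, 0.02)))
# Evening trendline: 14.0 mph rising to 14.7 mph at about 0.01 mph per ride.
print(round(implied_ride_count(14.0, 14.7, 0.01)))
```

Both counts land in the range of several dozen rides, which is at least plausible for a few months of commuting on one bike.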

Last, just out of curiosity, we examined my hypothesis that I get slower over the week. I expected to see a general downward trend in average speeds from Monday to Friday as well as morning to afternoon.
[table id=11 /]

These results essentially blow my “it’s later in the week, therefore I’m riding slower” excuse out of the water. You can see clearly that I ride faster in the mornings than in the evenings — the daily averages and the grand total average both reflect that result very clearly — but there’s no real difference between my morning speed on Monday and my morning speed on Friday. Similarly, Monday through Thursday evening speeds remain pretty constant. The only exception is Friday afternoon, when my average speed plummets. It’s not a Friday thing, though, because I generally ride fairly briskly Friday mornings.
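A day-by-time breakdown like the one in that table can be built by grouping rides on weekday and AM/PM and averaging each group. Here is a minimal Python sketch of the idea; the rides listed are invented for illustration, not pulled from the GPS, and `average_by_day_and_time` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# (weekday, "AM" or "PM", average speed in mph): invented rides, NOT real data.
rides = [
    ("Mon", "AM", 14.4),
    ("Mon", "AM", 14.6),
    ("Mon", "PM", 14.0),
    ("Fri", "AM", 14.5),
    ("Fri", "PM", 13.2),
    ("Fri", "PM", 13.4),
]


def average_by_day_and_time(rides):
    """Mean speed for every (weekday, period) combination that has rides."""
    groups = defaultdict(list)
    for day, period, speed in rides:
        groups[(day, period)].append(speed)
    return {key: mean(speeds) for key, speeds in groups.items()}


averages = average_by_day_and_time(rides)
for (day, period), avg in averages.items():
    print(f"{day} {period}: {avg:.2f} mph")
```

In Excel, a pivot table with weekday as rows and AM/PM as columns does the same job.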

DISCUSSION
Analysis of my biking data indicates that Hypothesis 1 (My average speed has increased from January, 2010 to August, 2010) is true, although not strongly so. Hypothesis 2 (I ride faster in the mornings than in the evenings) is definitely true, to an almost remarkable extent. Hypothesis 3 (I ride faster on Mondays than on Fridays) is absolutely false, and I have no physical excuse for dawdling on Friday afternoons.

Why these results?
Hypothesis 1: Well, you’d expect that I would get stronger over time as I became more fit and my body became accustomed to the route. As a side note, I have noticed that I’m less hungry and tired now than when I started riding in January: My body has adjusted to going that distance at that speed. However, I’m not really pushing myself to go faster, since I have to be energetic and useful at the end of each ride; in fact, I’m working on training myself to avoid the “bike commuter racing” mentality. Presumably this accounts for my incremental increase. I’d like to think that if I was training hard to get faster, I’d have a more noticeable increase in average speed.

Hypothesis 2: I think this has to do with rest. In the evening, I have about 12 hours between when I get home and when I leave in the morning, and I spend 7 to 8 of those hours asleep. My muscles get a good, long break. During the day, I have about 9 hours between arrival and departure, but I spend a good portion of that time up and walking around. Even not walking, I’m still moving quite a bit. I get close to zero rest for my legs during a normal workday. Also, I’m almost always mentally fatigued when I leave work, and a great deal of speed has to do with the desire and willpower to keep pushing even when it starts to hurt. In the mornings I have it, and in the afternoons I just don’t. I’ve spent the entire day struggling hip-deep through endless misery, and I’ve used my reservoir of mental fortitude that could go to pushing me to go faster on the way home.

Hypothesis 3: This is plain bizarre. I’ve thought I rode slower the later in the week it was, and I even coined the phrase “Friday legs” to describe the phenomenon. Now I find out it’s all in me ‘ead! I think what this really indicates is that my average speed isn’t limited as much by my physical ability as by my mental strength. I may be more tired by Friday, but it doesn’t show in the average speeds. I think there’s also the “saving yourself for the swim back” phenomenon going on here, too. On Mondays I don’t push myself because I know I have ten 20-mile rides to finish by Friday. If I pretended each morning that I only had to complete that one ride, perhaps my averages would change more.

* Applied a basic linear trendline, shown in red, to the data.
Note: Due to GPS inaccuracy, the same route varies between 18 and 20 miles and 1,000 and 2,000 feet in elevation. One day I turn it on and I’m at -234 feet. The next day, standing in the same place, I’m at 166 feet. Amazing.

One thought on “Head in Sand: Games With Statistics”

  1. Thank you! Thank you! Thank you for not naming names in conjunction with your comment that “I had ridden home with somebody else who averaged slower than I usually did. Those outliers were probably dragging my overall average down.” 🙂
