7:41 pm - Mon, Sep 1, 2014

Identifying underlying trends in performance analysis with Dynamic Factor Analysis

Continuing the theme of investigating the MARSS package for useful tools, above is a mock-up of how I anticipate Part 2 of the cycling performance study will play out.

(To sign up for the study, if you haven’t already, give Jon a shout: http://veloclinic.com/cyclists-needed-performance-modelling-study/)

So as I alluded to in the last post (http://veloclinic.tumblr.com/post/96390104623/noise-probability-and-deterministic-skeletons-in), determining the utility of models and their parameters is a multi-factorial probabilistic issue where first principles and CVs may not necessarily be the best predictors of utility.

Instead, I’m keeping an open mind (aka the shotgun or fishing-expedition approach) and considering a wide array of candidate models for validation in Part 1 of the study. The issue then is how to go about declaring winners without making a priori declarations about the targets.

One method is illustrated by the figures above. Here I used the phytoplankton data from MARSS as a mock-up of some of the potential parameters that may come out of the candidate models (humor me and just go along with this).

As you can see in the first figure, we have several years of data with what looks like a combination of seasonal variation and multi-year trends. One assumption that can be made is that all of the observed parameters are slightly different views of a smaller number of underlying hidden states.

One way to estimate the number of hidden states is to use a Dynamic Factor Analysis (DFA). The second plot shows that for this data set there are two underlying hidden states.
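
For a sense of the mechanics, here is a minimal sketch of the kind of DFA fit behind these plots, following the DFA chapter of the MARSS User Guide (the taxa and year range come from that guide; in the actual study the rows would be the candidate model parameters):

```r
library(MARSS)

# Lake Washington plankton data shipped with MARSS, standing in for
# the model parameters in this mock-up
data(lakeWAplankton)
plank <- lakeWAplanktonTrans
yrs <- plank[, "Year"] >= 1980 & plank[, "Year"] < 1990
dat <- t(plank[yrs, c("Cryptomonas", "Diatoms", "Greens",
                      "Unicells", "Other.algae")])

# z-score each series so the loadings are comparable across parameters
dat.z <- zscore(dat)

# fit DFA models with m = 1, 2, 3 hidden trends and compare by AICc
fits <- lapply(1:3, function(m) {
  MARSS(dat.z, model = list(m = m, R = "diagonal and equal"),
        form = "dfa", control = list(maxit = 3000), silent = TRUE)
})
sapply(fits, function(f) f$AICc)  # the two-trend model wins for this data
fit <- fits[[2]]
```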

What we can also see is that the parameters load differently on these hidden states. Pmax and W1 load heavily on hidden state 1, while Tau1 and CP load heavily on hidden state 2. W2 loads heavily on both.
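
And a sketch of how the loadings themselves come out of the fit, again per the User Guide (the varimax rotation just makes the loadings easier to interpret):

```r
# extract the estimated loadings matrix Z and apply a varimax rotation
Z.est <- coef(fit, type = "matrix")$Z
H.inv <- varimax(Z.est)$rotmat
Z.rot <- Z.est %*% H.inv                   # rotated loadings
trends.rot <- solve(H.inv) %*% fit$states  # correspondingly rotated trends
```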

From this factor analysis we have an indication that, when it comes to performance (in this mock-up), there are really just two hidden states that need to be monitored, compressed down from the five candidate parameters.

With the dynamic factor analysis done, there is a higher level of confidence that the hidden states represent orthogonal parameters that can then be used to monitor training response. 

3:50 pm

Uncanny resemblances

http://www.ncbi.nlm.nih.gov/pubmed/19777251

As pointed out before, the WKO4 model appears to be made up of four models strung together that, based on the distribution of their residuals, function like a Pmax model, a CP1-10 model, a CP3-30 model, and something that slopes down to cross the measured curve in the 45-60 minute range. The overlay of the CP models and the WKO4 model is illustrated in this post: http://veloclinic.tumblr.com/post/72305824574/for-educational-purposes-only-post-contains-only

2:01 pm

Noise, probability, and deterministic skeletons in a simple exhaustion simulation

Reading about extinction risk metrics in http://cran.r-project.org/web/packages/MARSS/vignettes/UserGuide.pdf got me thinking about modeling time to exhaustion in a way that is probably a bit more mature than a straight deterministic t = W’/(P-CP).
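
For concreteness, the deterministic version is just a one-liner (a trivial sketch; the numbers below are made up):

```r
# deterministic time-to-exhaustion from the two-parameter CP model:
# t = W'/(P - CP), with no exhaustion predicted at or below CP
tte <- function(P, CP, Wprime) ifelse(P > CP, Wprime / (P - CP), Inf)

tte(P = 380, CP = 330, Wprime = 20000)  # 400 seconds
```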

The first issue is to start the move from a deterministic view to a probabilistic one. In general, what people have been doing so far is to take a deterministic approach, i.e., if you deplete W’ you will become exhausted, or something along those lines. From a probabilistic perspective, you would look at the problem as: given the amount of W’ depleted, what is the probability of exhaustion?

For performance modelling, I think the probabilistic perspective will ultimately prove more useful, both because of the mathematical techniques it opens up and because it is a more useful way to think about the problem conceptually.

As a quick example, consider how a stage hunter and a GC rider approach a final climb. In a deterministic view of the world you might say that, given W’ and CP, this climb can be done at 5.9 W/kg. This might work reasonably well for someone hunting stage wins, since if they try it enough times they will hit the top about as fast as they usually can. If they happen to explode before the top of the climb, no major loss; they try again tomorrow. For the GC rider, on the other hand, cracking on one climb is potentially the end of their race.

Now looking at it from a probabilistic perspective, one can say the chance of premature exhaustion might be only 10% at 5.8 W/kg, but 20% at 5.9 W/kg, 50% at 6.0 W/kg, and so on. In this case, it is clear why a GC rider would be taking a huge risk trying to follow an attack at 6.0 W/kg. If they were going for a stage, by contrast, it would make perfect sense to ride at 6.0 W/kg, since it only matters whether they win; how much time they lose does not.

Possibly more important, though, is that thinking in terms of probability opens up a lot of interesting possibilities to use Bayesian / hidden Markov / state-space / dynamical-systems methods.

For example, the figure above shows cumulative distribution plots of the probability of exhaustion (remaining W’ < 0.01*W’), starting at W’ and decreasing W’ by 0.075 at each time step (the deterministic skeleton), following the documentation for the MARSS package.

The black circles are the true probability of exhaustion generated by the deterministic skeleton. Note, however, that the probability is a sigmoid rather than a step function because of the uncertainty in knowing the true state of W’.
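
That sigmoid can be computed directly with the diffusion-approximation formula the User Guide uses for the probability of hitting an extinction threshold, carried over here to exhaustion (a sketch; mu is the drift of the depletion process, s2 its variance, and xd the distance from the starting W’ to the exhaustion threshold):

```r
# probability that a random walk with drift mu and process variance s2,
# starting xd above the threshold, has hit the threshold by time t
p.exhaust <- function(t, xd, mu, s2) {
  pnorm((-xd - mu * t) / sqrt(s2 * t)) +
    exp(-2 * mu * xd / s2) * pnorm((-xd + mu * t) / sqrt(s2 * t))
}

# e.g. a depletion drift of -0.075 per step, W' starting at 1, threshold 0.01
round(p.exhaust(t = 1:20, xd = 0.99, mu = -0.075, s2 = 0.01), 3)
```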

Simulations are then run by creating data with the skeleton and adding both random process noise and random observation noise.
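
A minimal sketch of that simulation step (the noise variances here are assumptions for illustration, not the values behind the figure):

```r
set.seed(1)
TT <- 25          # number of time steps
drift <- -0.075   # deterministic depletion of W' per step (the skeleton)
q <- 0.01         # process noise variance (assumed)
r <- 0.01         # observation noise variance (assumed)

# true state: the skeleton plus random process noise
w.true <- numeric(TT)
w.true[1] <- 1    # W' normalized to 1 at the start
for (t in 2:TT) w.true[t] <- w.true[t - 1] + drift + rnorm(1, 0, sqrt(q))

# what we actually observe: the true state plus observation noise
w.obs <- w.true + rnorm(TT, 0, sqrt(r))
```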

The green line is the result of fitting the data using the Dennis method, which assumes no observation noise (effectively what we have been doing when we treat power meter or ergometer data as an observation of truth) and places all of the estimated noise into the process, i.e., the model. The red line is the fit using a Kalman filter, where both process and observation noise are allowed.
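
Roughly, the two fits look like this (a sketch following the count-based PVA chapter of the User Guide; w.obs is the simulated series from the sketch above):

```r
library(MARSS)

# Dennis method: treat the observations as truth, so the drift and the
# single (process-only) variance come straight from the first differences
d <- diff(w.obs)
u.dennis <- mean(d)
q.dennis <- var(d)

# state-space fit: the Kalman-filter-based EM in MARSS separates the
# process variance Q from the observation variance R
kem <- MARSS(matrix(w.obs, nrow = 1),
             model = list(B = matrix(1), U = matrix("u"), Q = matrix("q"),
                          Z = matrix(1), A = matrix(0), R = matrix("r"),
                          x0 = matrix("x0"), tinitx = 0))
coef(kem)  # estimated drift u, process variance q, observation variance r
```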

In the upper left-hand corner is the average from 9 simulations; the other panels show the individual fits. As you can see (since I left in a lot of noise), some of the individual fits crash and burn a bit. In the averaged plot, the Kalman filter fit regresses to the truth, as it is an unbiased estimate. The Dennis method, which places all of the error onto the process, does not: it introduces bias.

And?

One of the insights that comes out of these simulations is that the effective application of any model depends not just on how much noise the model process introduces but ALSO on how much noise the observation introduces.

6:47 pm - Sun, Aug 31, 2014

Stage 9 2014 Vuelta Quick Analysis

Just a quick note here, as this Vuelta so far is just following the curve. For the top riders on the day, the performances normalize out in the 6.2-6.3 naW/kg range. Valverde and Froome were off the pace a touch at around 6.1 W/kg.

time and climb data courtesy of @ammattipyoraily

for more on power estimate modelling

http://veloclinic.com/intro-to-power-estimate-modelling/

11:06 pm - Thu, Aug 28, 2014

2014 Vuelta Stage 6 Quick Performance Analysis


At only about 13 minutes of climb time, Altos Cumbres Verdes is a bit short to use the pVAM model. But it is still somewhat illustrative to take a look at how the performance estimates compare to the 2008-2013 reference range.

The raw estimates came in at 6.4-6.5 nW/kg. Since this was a relatively low-altitude climb, the power comes down to 6.3-6.4 naW/kg when normalized for altitude for the sake of the historical comparison.

From this first climb it does look like the favorites are on good form (especially considering the heat), but not necessarily shocking form. We may yet get the Froome-Contador battle we hoped for at the TDF, as both Froome and Contador are likely to ride into form as the race progresses. Look for Quintana to bounce back on stage 9, where altitude will be a bit more of a factor.

9:17 pm - Tue, Aug 26, 2014

videos in support of socks

9:09 pm

an article about saving brains

really

and

truly

can be about saving babies’ brains

it’s ok

to just write about a topic

and

stay on topic
