Steppi:
          An information-based Change-Point Analysis tool for
          biophysical applications implemented in MATLAB.
      
      
      Abstract: Change-point analysis is a flexible and
      computationally tractable tool for the analysis of time series
      data from systems that transition between discrete states and
      whose observables are corrupted by noise. The change-point
      algorithm identifies the change-points (the times at which the
      system transitions between states) and fits the model
      parameters in each state. We present a unified approach to the
      analysis of processes whose noise can be modeled by Gaussian,
      Wiener or Ornstein-Uhlenbeck Processes. We use explicit,
      closed-form algebraic expressions for the maximum-likelihood
      estimators of the model parameters and for the estimated
      information loss of the generalized noise model, all of which can
      be computed extremely efficiently. Next, we demonstrate that a
      Change-Point Analysis can
      be implemented using a single statistical test (Frequentist
      Information Criterion) that depends on the number of parameters
      fit per state and the number of observations. This approach
      reconciles two previously disparate approaches to Change-Point
      Analysis (test-statistic and model-selection criterion) for
      testing transitions between states. The use of the information
      criterion significantly simplifies the statistical analysis and
      facilitates the calculation of explicit expressions for the
      resolution of the technique for determining small changes in the
      model parameters. We expect this technique to be of general
      interest to experimental investigators studying biological
      systems. Applications of this analysis include molecular-motor
      stepping, fluorophore bleaching, electrophysiology, particle and
      cell tracking, detection of copy number variation by sequencing,
      tethered-particle motion, etc.
      
      
References: Biophysical Journal and Neural Computation.
      
      
How change-point analysis works: The change-point
      algorithm identifies the change-points (the times at which the
      system transitions between states) and fits the model
      parameters in each state. The parameters describing each state are
      assumed to be stationary (i.e. not changing in time). The time
      evolution of the signal is represented by transitions between the
      states. In our biophysical implementation, we parameterize the
      state signal with four types of parameters, illustrated in the
      figure below:
      
    
 
    
      
      State model schematic. The state model signal is
      characterized by four model parameters that are written as the
      vector θ ≡ (k, ε, μ, α). Above we schematically illustrate the
      role of each parameter in shaping the signal. The parameter k
      parameterizes the standard deviation of the noise (σ = k^(−1/2)).
      State two illustrates the effect of the finite lifetime of
      fluctuations in models with autoregression (0 < ε < 1).
      State three illustrates the role of the level mean μ. State four
      illustrates the role of the level slope α.
      
      There are three choices for each of these parameters: (i) it may
      be set by hand, (ii) it may have an unknown but global value
      (i.e. shared between all states), or (iii) it may have an unknown
      but local value (determined for each state independently).
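
      As a rough illustration of the procedure described in this
      section, the Python sketch below finds change-points in the level
      mean by recursive binary segmentation. It is not the Steppi
      implementation. It assumes that the noise standard deviation
      sigma is set by hand (choice i) and that the level mean is local
      to each state (choice iii). The acceptance threshold `penalty` is
      a hand-chosen stand-in for the Frequentist Information Criterion,
      whose exact form is not reproduced here. The function names
      `best_split` and `find_change_points` are hypothetical.

```python
import math


def best_split(x, lo, hi, sigma):
    """Best single change-point in x[lo:hi] for a shift in the mean.

    Returns (gain, s), where gain is the increase in Gaussian
    log-likelihood obtained by fitting separate means to x[lo:s] and
    x[s:hi] instead of one mean to x[lo:hi]. sigma is assumed known.
    """
    n = hi - lo
    total = sum(x[lo:hi])
    best_gain, best_s = 0.0, None
    left = 0.0
    for s in range(lo + 1, hi):
        left += x[s - 1]
        n1, n2 = s - lo, hi - s
        diff = left / n1 - (total - left) / n2
        # Reduction in the sum of squared residuals is n1*n2/n * diff^2.
        gain = (n1 * n2 / n) * diff * diff / (2.0 * sigma ** 2)
        if gain > best_gain:
            best_gain, best_s = gain, s
    return best_gain, best_s


def find_change_points(x, sigma, penalty):
    """Recursive binary segmentation; returns sorted change-point indices.

    A split is accepted only if its log-likelihood gain exceeds
    `penalty` (an assumed, hand-chosen threshold, not Steppi's criterion).
    """
    found = []
    stack = [(0, len(x))]
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2:
            continue
        gain, s = best_split(x, lo, hi, sigma)
        if s is not None and gain > penalty:
            found.append(s)
            stack.append((lo, s))
            stack.append((s, hi))
    return sorted(found)
```

      For a trace `x` with a known noise level, a call such as
      `find_change_points(x, sigma=0.1, penalty=math.log(len(x)))`
      returns the indices at which a new state begins. A larger
      `penalty` makes the sketch more conservative about adding states.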
      
Examples of the Steppi Package
      
      Download Steppi, scripts and data here.
      
      
(1) Wiener Process: Drift Diffusion
      As an example of a Wiener Process, we consider diffusion with a
      bias in one dimension. (The code works in higher dimensions as
      well.) We simulated data so that the true model is known. The
      system transitions back and forth between states with diffusion
      coefficients D = {0.25, 2.5 × 10^-3}/2. The state with the lower
      diffusion constant has a small drift velocity with a random
      orientation.
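
      The closed-form estimators for a single state of this kind can be
      sketched in a few lines of Python. The sketch below is not taken
      from Steppi. It assumes a unit time step by default and a
      one-dimensional trajectory, and the function names are
      hypothetical. For a Wiener Process with drift, the increments are
      independent Gaussians with mean v·dt and variance 2D·dt, so the
      maximum-likelihood estimates follow from the sample mean and
      variance of the increments.

```python
import math
import random


def simulate_drift_diffusion(n, D, v, dt, rng):
    """Simulate n steps of 1D diffusion with drift velocity v."""
    x = [0.0]
    step_sd = math.sqrt(2.0 * D * dt)  # increment variance is 2*D*dt
    for _ in range(n):
        x.append(x[-1] + v * dt + rng.gauss(0.0, step_sd))
    return x


def fit_drift_diffusion(x, dt=1.0):
    """Maximum-likelihood estimates (v_hat, D_hat) from a trajectory x."""
    steps = [b - a for a, b in zip(x, x[1:])]
    n = len(steps)
    mean_step = sum(steps) / n
    # MLE of the variance divides by n (not n - 1).
    var_step = sum((s - mean_step) ** 2 for s in steps) / n
    return mean_step / dt, var_step / (2.0 * dt)


if __name__ == "__main__":
    rng = random.Random(0)
    # D = 0.25 / 2, the larger of the two diffusion coefficients above.
    trajectory = simulate_drift_diffusion(5000, 0.125, 0.0, 1.0, rng)
    v_hat, D_hat = fit_drift_diffusion(trajectory)
    print("v_hat =", v_hat, "2*D_hat =", 2.0 * D_hat)
```

      In a change-point analysis these estimates would be computed
      separately for each candidate state. The sketch fits one state
      only.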
      
      
 
        Raw data. The raw particle trajectory
          is shown above. The transitions between states with large and
          small diffusion coefficients are clearly visible. (E.g. a
          transition occurs at t = 1,600.)
        
        
         
        Change-point analysis of signal. Steppi
          determines the positions of the change points and fits the
          model parameters (a diffusion constant, i.e. stiffness k,
          and a drift velocity, i.e. level slope α) in each state.
          The trajectory is colored by state with the
          state number shown at the top of the figure. The true number
          of states is recovered (n = 13).
          
        
 
        
         
        Model parameter values. The 95%
          confidence regions for the model parameters for each state are
          shown, in addition to the MLE values. The true diffusion
          constants are 2D = {0.25, 2.5 × 10^-3}, in excellent
          agreement with the analysis.
        
 
        
      (2) Gaussian Process: Motor stepping
      As an example of a Gaussian Process, we consider motor
      stepping. Again, we simulate the data so that the true
      distribution is known. In the change-point analysis, we treat the
      motor position as the level mean with an unknown position (unique
      for each step) and a global unknown stiffness.
      
      
      Raw data.  In the simulated data, the step length is
      1. The step-like transitions are clear by eye.
      
      
      Change-point analysis of signal. Steppi determines the
      positions of the change points and fits the model parameters
      (level mean μ and global stiffness k).
      
      
      Fourier transform of the pairwise distribution function. A
      dotted line shows the peak with the greatest power, corresponding
      to the unit step length, in excellent agreement with the
      simulated step size of 1.
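
      One way to carry out the step-size estimate described in the
      caption above is sketched below in Python. This is an
      illustrative variant, not the Steppi code. It takes the fitted
      level means, forms all pairwise differences, and evaluates the
      Fourier transform of their empirical distribution directly on a
      grid of spatial frequencies q instead of binning them first. The
      search range (q_min, q_max) is an assumption that the user must
      supply to exclude the trivial peak at q = 0 and higher harmonics.
      The function names are hypothetical.

```python
import cmath
import math


def pairwise_differences(levels):
    """All differences levels[j] - levels[i] with j > i."""
    return [b - a for i, a in enumerate(levels) for b in levels[i + 1:]]


def step_size_from_levels(levels, q_min, q_max, n_q):
    """Estimate the step size as 1/q at the peak of the power spectrum.

    The power at spatial frequency q is |sum_d exp(2*pi*i*q*d)|^2,
    summed over all pairwise differences d of the level means.
    """
    diffs = pairwise_differences(levels)
    best_q, best_power = None, -1.0
    for j in range(n_q):
        q = q_min + (q_max - q_min) * j / (n_q - 1)
        amplitude = sum(cmath.exp(2j * math.pi * q * d) for d in diffs)
        power = abs(amplitude) ** 2
        if power > best_power:
            best_q, best_power = q, power
    return 1.0 / best_q


if __name__ == "__main__":
    # Hypothetical fitted level means from a stepping trace.
    levels = [0.0, 1.02, 1.97, 3.01, 4.0, 4.98]
    print(step_size_from_levels(levels, 0.5, 1.5, 201))
```

      The resolution of the estimate is limited by the spacing of the
      frequency grid, so a finer grid (larger n_q) may be needed when
      many levels are available.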
      
(3) Ornstein-Uhlenbeck Process: Tethered Particle Motion
      As an example of an Ornstein-Uhlenbeck Process, we consider the
      analysis of Tethered-Particle-Motion. In short, a
      bead diffuses on a DNA tether. The DNA tether is approximated by
      a linear spring. The motion of the bead is therefore subject to
      three driving forces: (i) thermal fluctuations, (ii) damping
      forces from viscosity and (iii) forces from the deformation of the
      DNA tether.
      
      On short time scales, the system behaves like a Wiener Process:
      dominated by thermal fluctuations and viscous damping forces. On
      long time scales, the system behaves like a Gaussian Process:
      dominated by thermal fluctuations and forces from the DNA
      tether.  
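
      The two limits described above can be made concrete with a
      discrete-time sketch in Python. It is an assumption of this
      sketch, not a statement about Steppi's internals, that the
      sampled bead position about a zero level mean follows a
      first-order autoregression x[t+1] = ε·x[t] + noise, with noise
      standard deviation k^(−1/2). As ε approaches 1 the trajectory
      approaches a random walk (Wiener-like), and as ε approaches 0 the
      samples become independent (Gaussian-like). The function names
      are hypothetical.

```python
import math
import random


def simulate_ou(n, eps, k, rng):
    """Simulate n samples of a zero-mean AR(1) process (discrete OU)."""
    sigma = k ** -0.5  # noise standard deviation, sigma = k^(-1/2)
    x = [0.0]
    for _ in range(n - 1):
        x.append(eps * x[-1] + rng.gauss(0.0, sigma))
    return x


def fit_coupling(x):
    """Closed-form MLE of eps for a zero-mean AR(1) process."""
    num = sum(a * b for a, b in zip(x, x[1:]))
    den = sum(a * a for a in x[:-1])
    return num / den


def correlation_time(eps):
    """Lifetime of fluctuations, in samples, for 0 < eps < 1."""
    return -1.0 / math.log(eps)


if __name__ == "__main__":
    rng = random.Random(1)
    trace = simulate_ou(5000, 0.9, 1.0, rng)
    eps_hat = fit_coupling(trace)
    print("eps_hat =", eps_hat, "tau =", correlation_time(eps_hat))
```

      Lags much shorter than the correlation time probe the
      Wiener-like regime, and lags much longer than it probe the
      Gaussian-like regime.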
      
      
      
      
      Raw data.  Simulated Tethered-Particle-Motion with
      short and long tether lengths. The step-like transitions are clear
      by eye.
      
      
      Change-point analysis of signal. Steppi determines the
      positions of the change points and fits the model parameters
      (level means μ, stiffness k and nearest-neighbor coupling ε).
      
      
      Model parameter values. The 95% confidence regions for the
      model parameters for each state are shown, in addition to the MLE
      values.