HOWTO

We have prepared some simple programs in Fortran 90 that implement the maximum likelihood data clustering method, along with a few other standard methods. If you do not have a Fortran 90 compiler, follow this link.
In what follows, we suppose you have an ensemble of N data sets, each of length D.
The steps to follow are:
  1. Download the source code (e.g. grpsan.f90) and the corresponding parameter file (e.g. grpsan.par).
  2. Modify the parameter file according to the characteristics of your dataset and to what you are trying to do. For example, in this file you specify how many data sets you have (N), the range of "beta" (the fictitious inverse temperature) over which the simulation will run, and a conventional three-letter prefix for the input/output files.
  3. Check that the distribution of your data is not too different from a Gaussian. You may find it useful to take the logarithm of the data sets, or to use Kendall's tau rather than the covariance matrix (see the first Phys. Rev. E paper...).
  4. To prepare the covariance matrix (Pearson's coefficients) of your data, you FIRST have to normalize the data so that each set has ZERO mean and UNIT variance. In other words, subtract from each set its average, then divide each set by the square root of its variance.
  5. Be sure that the covariance matrix is written in the correct upper triangular form!
  6. Compile the code (you know how to do it, right?). If this step fails, let us know!
  7. Run the code. You can keep track of what is going on by looking at the files xxx.ent.yyy and xxx.now.yyy. They contain information on the current ground state, energy, temperature, etc.
  8. Relax. Simulated annealing can take some time! To give you an idea, it took us about 12 hours to find the ground state for N=2500 sets. At the beginning, though, you will probably want to choose a faster annealing schedule, just to explore the "energy landscape" generated by your data.
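Steps 3 to 5 can be tried out on a small example before touching the real data. The sketch below is in Python, is not part of the Fortran 90 programs, and uses only hypothetical helper names (normalize, skewness_and_excess_kurtosis, pearson_upper_triangle). It assumes that "upper triangular form" means row i lists the coefficients C(i,j) for j >= i, diagonal included; the exact layout the programs expect may differ, so check it against the source code before use.

```python
import math


def normalize(series):
    """Return a copy of `series` with zero mean and unit variance (step 4)."""
    n = len(series)
    mean = sum(series) / n
    # Population variance (divide by n); an assumption, the page does not say n or n-1.
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0.0:
        raise ValueError("a constant series cannot be normalized")
    std = math.sqrt(var)
    return [(x - mean) / std for x in series]


def skewness_and_excess_kurtosis(series):
    """Rough Gaussianity check (step 3): both values are near 0 for Gaussian data."""
    z = normalize(series)
    n = len(z)
    skew = sum(v ** 3 for v in z) / n
    excess_kurt = sum(v ** 4 for v in z) / n - 3.0
    return skew, excess_kurt


def pearson_upper_triangle(data):
    """Pearson coefficients of N series of length D, upper triangle only (step 5).

    Row i holds C(i, j) for j = i .. N-1. Including the diagonal is an
    assumption about the expected layout, not something the page states.
    """
    z = [normalize(s) for s in data]
    n = len(z)
    d = len(z[0])
    rows = []
    for i in range(n):
        rows.append(
            [sum(a * b for a, b in zip(z[i], z[j])) / d for j in range(i, n)]
        )
    return rows


if __name__ == "__main__":
    # Toy ensemble: N = 3 data sets, each of length D = 4.
    data = [
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],
        [4.0, 3.0, 2.0, 1.0],
    ]
    for row in pearson_upper_triangle(data):
        print(" ".join(f"{c: .6f}" for c in row))
```

If the skewness or excess kurtosis of a data set is far from zero, applying the same check to the logarithm of that set shows quickly whether the transformation suggested in step 3 helps.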
Now that you have (hopefully) read these instructions, you can download the programs.