# PJ Steiner:CRFall


Notes on using Kevin Murphy's CRF matlab toolbox.

## Getting it to work

To get testCRF to work, you'll probably need to recompile repmatC.c in the directory KPMtools. (The precompiled binary won't work if you're on *nix, and it won't be found if you're on OS X. I haven't tried on Windows.) Do this using mex, which may already be on your path on *nix:

 mex /path/to/CRFall/KPMtools/repmatC.c


mex won't be on your path on OS X, so give the full path:

 /Applications/MATLAB_R2008b.app/bin/mex /path/to/CRFall/KPMtools/repmatC.c


## Linear Chain CRFs (crfchain)

I'm only working with the functions for linear chain CRFs. I think what follows is correct.

### Limitations

You can't define 'feature functions' in the usual sense --- that is, you can't supply arbitrary functions of the form $f_k(t, y_{t-1}, y_t, \mathbf{x})$, each with its own weight $\lambda_k$.

Instead, you compute a 'feature vector' for each position in the sequence, which you pass to both the training and inference functions. Note that the symbol at position t has to be part of this feature vector. (For DNA, I defined four features corresponding to {A, T, G, C} and set the appropriate feature to one at each position.)
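The DNA encoding described above can be sketched like this (a hypothetical example; the sequence, alphabet, and variable names are mine, not the toolbox's):

```matlab
% One-hot encode a DNA sequence into a feature matrix:
% rows = features {A, T, G, C}, columns = positions in the sequence.
seq      = 'ATGGC';
alphabet = 'ATGC';
T  = length(seq);
X1 = zeros(length(alphabet), T);
for t = 1:T
    X1(alphabet == seq(t), t) = 1;   % set the feature for the symbol at t
end
% e.g. X1(:, 1) is [1; 0; 0; 0] because seq(1) is 'A'
```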

This effectively restricts you to two types of feature function:

• $f_k(y_{t-1}, y_t)$
These are potentials between states, and they're constant (they can't be sequence-dependent). The code learns these automatically.
• $f_k(t, \mathbf{x})$
These are the features in the feature vectors. Their values depend only on the input sequence. That sounds restrictive, but the code learns weights $\lambda_{kq}$ --- one for each feature k and state q.

I don't think this is as powerful as a general linear chain CRF, but it's still more powerful than an HMM.

### Training

To train, you need a cell array of matrices X{s}(f,t) and a cell array of row vectors Y{s}(t). X{s} is a matrix --- each row corresponds to a feature, and each column corresponds to a position in the sequence. Y{s} is a row vector containing the correct state labels for the sequence. (I believe state labels must be integers --- they're used to index matrices.)
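For concreteness, here is a hypothetical toy training set in that layout (the values are made up; only the shapes matter):

```matlab
% Two training sequences of lengths 3 and 2, with 2 features per position.
% X{s}(f, t): value of feature f at position t of sequence s.
% Y{s}(t):    correct integer state label at position t of sequence s.
X = { [1 0 0; 0 1 1],  [0 1; 1 0] };
Y = { [1 2 2],         [2 1]      };
```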

The function crfchaintrain does the training:

 chain = crfchaintrain(chain, X, Y,         ...
                       'gradAlgo', 'scg',   ...
                       'checkGrad', 'on',   ...
                       'MaxIter', max_iter, ...
                       'verbose', 1);


Only chain, X, and Y are required arguments; the rest are optional name/value pairs.

Note: I've found that increasing the number of states (i.e. labels) can really increase running time.

### Inference

To infer hidden states using a trained CRF chain, you need a matrix X_test(f,t) of feature vectors for the unlabeled sequence. (As above, rows correspond to individual features, columns correspond to elements in the sequence.) You infer like so:

 bel_chain = crfchaininfer(chain, X_test);
 [V, I] = max(bel_chain);


crfchaininfer computes a probability for each state at each position in the sequence. (Rows are states, columns are positions.) The call to max gives you two row vectors: V holds the largest probability in each column, and I holds the row index (i.e. the state) where that largest value occurs --- so I is the most likely state at each position.
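To illustrate what max does here (a made-up belief matrix, not real toolbox output): MATLAB's two-output max on a matrix operates column-by-column, which is exactly what's needed to read off the most likely state per position.

```matlab
% Toy 2-state belief matrix over a length-3 sequence:
% rows = states, columns = positions.
bel = [0.9 0.2 0.1;    % belief in state 1 at positions 1..3
       0.1 0.8 0.9];   % belief in state 2 at positions 1..3
[V, I] = max(bel);     % V = [0.9 0.8 0.9], I = [1 2 2]
```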