Thursday, April 12, 2007

Training

Finished training the classifier on the open access corpus.
I think

BUILD SUCCESSFUL (total time: 756 minutes 37 seconds)

says it all. It took a while: roughly 12 hours 37 minutes.
The table has 1,080,554 unique words in it.

Here is some output from a run over 1000 sections:

Sect    Corr  Incor  Precis  Recall  F-Measure
INTROD   125     47  0.7267  0.5507  0.6266
METHOD   195    121  0.6171  0.9701  0.7544
RESULT   182     96  0.6547  0.8922  0.7552
DISCUS   206     28  0.8803  0.5598  0.6844
Correct: 708 proportion correct: 0.708 percentage correct: 70.8
Incorrect: 292 proportion incorrect: 0.292 percentage incorrect: 29.2
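
For anyone wanting to check how the columns relate, here is a minimal Python sketch. It is not the classifier's own code. It assumes that Corr counts sections correctly given a label and Incor counts sections wrongly given that label, so precision is Corr / (Corr + Incor), and that F-Measure is the usual harmonic mean of precision and recall. Recall cannot be rebuilt from these two counts alone, so the reported value is copied in. The function names are made up for illustration.

```python
# Rows copied from the output above: (section, correct, incorrect, reported recall).
ROWS = [
    ("INTROD", 125, 47, 0.5507),
    ("METHOD", 195, 121, 0.9701),
    ("RESULT", 182, 96, 0.8922),
    ("DISCUS", 206, 28, 0.5598),
]


def precision(correct, incorrect):
    """Share of sections assigned a label that really carry it (assumed meaning)."""
    return correct / (correct + incorrect)


def f_measure(p, r):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    return 2 * p * r / (p + r)


def summarize(rows):
    """Return {section: (precision, f_measure)} recomputed from the counts."""
    result = {}
    for name, correct, incorrect, recall in rows:
        p = precision(correct, incorrect)
        result[name] = (p, f_measure(p, recall))
    return result


def accuracy(rows):
    """Overall proportion correct across all sections."""
    correct = sum(row[1] for row in rows)
    total = sum(row[1] + row[2] for row in rows)
    return correct / total


if __name__ == "__main__":
    for name, (p, f) in summarize(ROWS).items():
        print(f"{name}  precision={p:.4f}  f_measure={f:.4f}")
    print(f"accuracy={accuracy(ROWS):.3f}")
```

Under those assumptions the recomputed precision matches the Precis column, and the F-measure agrees with the reported one to about three decimals. The small difference comes from starting with recall already rounded to four places.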