(EDIT: I had previously stated that fetching classified objects would return objects that are deliberately close to the decision boundary. This was ONCE the case, but no longer is! This means the method discussed below for assessing the accuracy of the classifier cannot be expected to give a worst-case accuracy. I have changed the text below so as not to confuse people who find this post in the future. My apologies for any confusion.)
Cross-validation is on our long-term todo list.
We now advise users to use the object-fetching tools in Classifier to assess accuracy. Here’s a snippet from the manual in progress:
[quote]One way to gauge the classifier’s performance is to fetch a large number of objects of a given class (e.g., positive) from the whole experiment. The fraction of the retrieved objects that you would classify as actually matching that particular phenotype should give you an idea of the classifier’s general performance. For example: if you fetch 100 positive objects but find upon inspection that 5 of the retrieved objects are not positives, then you can expect the classifier to have roughly a 5% false positive rate. The same method can be used to estimate an approximate false negative rate (in the case of 2 classes).
Another way to gauge the classifier’s performance is to use the “Score Image” button as described in III.D.5, which will allow you to see qualitatively how the classifier performs on a single image. Although the results cannot be reliably extrapolated to the remaining images, it can be useful to examine control images and further refine the rules by adding misclassified objects from those images to the proper bins.[/quote]
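As a back-of-the-envelope illustration of the fetch-and-inspect estimate described above, here is a minimal Python sketch (not part of CellProfiler; the counts and function name are hypothetical). It also attaches a simple normal-approximation confidence interval to the estimate, which is only meaningful to the extent that the fetched objects behave like a random sample:

```python
import math

def estimate_error_rate(n_fetched, n_misclassified, z=1.96):
    """Estimate a classifier error rate from a fetched sample.

    Returns (rate, ci_low, ci_high), where the interval is a
    normal-approximation binomial confidence interval
    (z = 1.96 gives roughly 95% coverage).
    """
    p = n_misclassified / n_fetched
    se = math.sqrt(p * (1 - p) / n_fetched)  # binomial standard error
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Example from the manual excerpt: fetch 100 "positive" objects and
# find on inspection that 5 are not actually positives.
rate, lo, hi = estimate_error_rate(100, 5)
print(f"Estimated false positive rate: {rate:.1%} (95% CI {lo:.1%}-{hi:.1%})")
```

The wide interval on only 100 fetched objects is a reminder that inspecting more objects tightens the estimate roughly with the square root of the sample size.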
I hope this is helpful.
As for the max number of rules, here’s another manual clip:
[quote]• During initial training, it is best to use a small number of rules (5-10) to make sure that you do not define your phenotype too narrowly. That is, you want to identify a wide variety of objects that display your phenotype but differ in their other characteristics.
• As training proceeds, you can increase the number of rules and thereby improve the accuracy. If your output contains many repeated rules, you can safely lower the maximum number of rules, thereby saving time.
• Increasing the number of rules above 100 rarely improves the classification accuracy.
• For complex object classes (that is, classes whose identification by the human eye involves assessing many features of the objects simultaneously), we recommend ultimately using 50 rules, based on our experience with 14 phenotypes in human cells (Jones, T.R., et al., PNAS 2009).[/quote]
So while we aren’t brave enough to give an “empirical” number, we’re pretty confident that 100 is the “practical” maximum.