Hi @ZCs,

I appreciate your thinking about this, and will try to give some ideas. Again, way to go for thinking about this. Yes, it's a hard problem, and there are no "easy" answers.

These few thoughts / ideas are not concrete answers, but if you're more specific about methods / algorithms, I can try to be more specific. Since you mentioned segmentation in particular, I'll talk about that.

## Philosophical thoughts

Many segmentation methods are optimization methods, so it's possible to confirm that the optimization "part" is working correctly: i.e. is a local minimum of the cost found? I would not call methods like this "heuristic" methods, because there is a specific cost function that they optimize, and they can and do successfully optimize that cost.

For example, a heuristic is usually used to choose the number of clusters when using K-means. But given K, I would not call K-means itself "heuristic".
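To make this concrete, here's a minimal sketch (my own illustration in Python/NumPy, not anything from your code): a plain Lloyd's-algorithm K-means. Given K, the cost is guaranteed to be non-increasing across iterations, and that is a property you *can* check automatically, independent of whether the clustering is "good".

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns centers and the cost after each assignment step."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    costs = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        costs.append(d[np.arange(len(X)), labels].sum())
        # Update step: each center moves to the mean of its points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return centers, costs

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
_, costs = kmeans(X, k=2)

# The optimization "part" is verifiable: the cost never increases.
assert all(a >= b - 1e-9 for a, b in zip(costs, costs[1:]))
```

Whether K itself was well chosen is the heuristic part, and that assertion says nothing about it.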

Sometimes those methods perform badly though. Not badly in that they failed to optimize the cost, but badly in that the cost we gave them doesn't reflect the thing we really want: "segment" the object != minimize the functional I made up. Heuristic methods can perform better, because sometimes the functionals we know how to write down using math and optimize are not good at the task we care about.

## Tests

> I cannot find an automated way of assessing an algorithm (e.g. segmentation) without checking the result by the naked eyes.

Related to @masamihaga's suggestion: generate synthetic images for which you

1. know the correct answer, and
2. know they adhere to the assumptions of your algorithm.

That lets you separate algorithmic errors from random noise. On point (2), for example, if you're using or debugging an MRF-based algorithm, you can generate images that actually are Markov.
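A minimal sketch of point (1), again just my own illustration: build a synthetic image with a known ground-truth mask, run a stand-in segmenter (a simple threshold here, where your real method would go), and score it with the Dice coefficient. No eyeballing required.

```python
import numpy as np

def make_synthetic(shape=(64, 64), noise=0.2, seed=0):
    """Ground-truth disk plus Gaussian noise: we know the right answer by construction."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    truth = (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2
    image = truth.astype(float) + rng.normal(0, noise, shape)
    return image, truth

def segment(image):
    """Stand-in segmenter: simple thresholding. Swap in the method under test."""
    return image > 0.5

def dice(a, b):
    """Dice overlap between two binary masks (1.0 = perfect agreement)."""
    return 2 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

image, truth = make_synthetic()
score = dice(segment(image), truth)

# Automated assessment against the known answer, instead of the naked eye.
assert score > 0.9
```

Sweeping the `noise` parameter also tells you how gracefully the method degrades as its assumptions get violated.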

## Guilt

> feel guilty of not being able to *prove* that a method works, which is crucial in research

Maybe it will help your guilt to separate "correctness" from "working", where:

- "correctness": the algorithm does what it claims to do
- "working": the algorithm produces (close to) correct results on real data

A "correct" algorithm may not always "work."