Friday, September 26, 2008

The sea bass and the salmon


It's funny how complex (or so I thought) concepts can be so easily explained using simple real-life examples.
To show what I mean, I'll borrow the sea bass and salmon example that was used so aptly to explain Bayes classification to us. Suppose we need to build an automated process that can tell salmon from sea bass. For this, we might look at features such as size, color, and maybe texture. Of course, certain assumptions need to be in place, such as the camera position: we don't want half a salmon and half a sea bass in the same frame, or the features may become indistinguishable. So, if we take the feature's probability distribution for each fish to be Gaussian (the simplest case) and assume both fish are equally common, then for minimum error in identification the decision boundary needs to lie at the point where the two distributions intersect. (As shown in figure)
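Here is a minimal sketch of that minimum-error boundary in Python. Everything specific in it is my own assumption for illustration: the function names, the made-up means and standard deviations, equal priors for the two fish, and the two curves crossing exactly once between their means (true when the classes are reasonably well separated).

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def min_error_boundary(mean_a, std_a, mean_b, std_b, tol=1e-9):
    """Find the feature value between the two means where the densities are equal.

    Assumes mean_a < mean_b, equal priors, and a single crossing
    between the means. Uses plain bisection.
    """
    lo, hi = mean_a, mean_b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # While class A is still the more likely one, the crossing lies to the right.
        if gaussian_pdf(mid, mean_a, std_a) > gaussian_pdf(mid, mean_b, std_b):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical feature values (say, length in cm); not real fish data.
boundary = min_error_boundary(mean_a=60.0, std_a=5.0, mean_b=80.0, std_b=8.0)
print(f"decide class A below {boundary:.2f}, class B above")
```

When the two standard deviations are equal, the boundary falls halfway between the two means; with unequal spreads it moves toward the narrower curve.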
However, here we introduce a new concept: minimum risk. Often, the cost of identifying a salmon as a sea bass is not the same as the cost of identifying a sea bass as a salmon.
Salmon is the costlier fish, and a customer who pays for salmon and gets a sea bass instead may very well sue and create trouble. So it is better to err on the side of identifying a salmon as a sea bass (a customer ending up with a costlier fish instead of a cheaper one would most likely be happy) than the other way round.
This leads to the minimum-risk rule, which shifts the decision boundary in one direction or the other depending on the relative costs of the two errors.
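To see the shift in numbers, here is a rough sketch in Python. All the specifics are my own assumptions for illustration: the function names, the made-up means, spreads and costs, equal priors, zero cost for a correct call, and the salmon curve sitting to the left of the sea bass curve. The rule is to pick, at each feature value, the label with the smaller expected cost.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def min_risk_boundary(mean_salmon, std_salmon, mean_bass, std_bass,
                      cost_salmon_as_bass, cost_bass_as_salmon, tol=1e-9):
    """Feature value where the expected costs of the two decisions are equal.

    Assumes mean_salmon < mean_bass, equal priors, zero cost for correct
    decisions, and a single crossing between the means.
    """
    lo, hi = mean_salmon, mean_bass
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Expected cost of each decision at this feature value
        # (up to a common factor that does not affect the comparison).
        risk_decide_bass = cost_salmon_as_bass * gaussian_pdf(mid, mean_salmon, std_salmon)
        risk_decide_salmon = cost_bass_as_salmon * gaussian_pdf(mid, mean_bass, std_bass)
        if risk_decide_salmon < risk_decide_bass:
            lo = mid  # calling it salmon is still cheaper, so the boundary is further right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical numbers: handing out a sea bass labeled as salmon costs 5x more.
equal_costs = min_risk_boundary(60.0, 5.0, 80.0, 5.0, 1.0, 1.0)
unequal_costs = min_risk_boundary(60.0, 5.0, 80.0, 5.0, 1.0, 5.0)
print(f"equal costs: {equal_costs:.2f}, unequal costs: {unequal_costs:.2f}")
```

With equal costs this is just the minimum-error boundary again. Making the "sea bass sold as salmon" mistake more expensive pulls the boundary toward the salmon side, so fewer fish get labeled salmon.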
Hence, tougher concepts can be explained in layman's terms... and thanks to Dr S, it was an excellent and highly enjoyable lecture.

PS: By probability distribution, I mean the probability of a salmon/sea bass occurring with respect to a particular feature, i.e. a plot of the feature (x) vs. the probability (y)... hope that makes things clearer.

2 comments:

Sandeep said...

I was expecting you to explain the false maxima and false minima :)

amiodarone said...

ummm... which would make it even more confusing?