[time-nuts] Discarding outliers in two dimensions
Hal Murray
hmurray at megapathdsl.net
Wed Dec 9 10:53:06 UTC 2009
Suppose I want to average a bunch of samples. Sometimes it helps to discard
the outliers. I think that helps when there are two noise mechanisms, say
the typical Gaussian noise plus some other noise that is occasionally added
on. If the other
noise is rare but large, those occasional samples can have a big influence on
the average. So discarding those outliers gives better results, for some
value of "better".
I know how to do it in one dimension. How do I do it in two dimensions?
Say I have a lot of samples from a GPS system and I want to compute the best
position to use when shifting into timing mode.
For one dimension, you sort, compute the average, then compute the distance
of the first and last samples from the average. Discard the one that is
farther from the average.
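A minimal sketch of that one-dimensional procedure, in Python. The function name trim_outliers_1d and the n_discard parameter are made up for illustration; the description above does not say how many samples to discard or when to stop, so this version assumes a fixed count and recomputes the average after each discard.

```python
def trim_outliers_1d(samples, n_discard):
    """Drop n_discard samples, one at a time, from whichever end of the
    sorted list is farther from the current average.

    n_discard is a hypothetical stopping rule, not part of the original
    description.
    """
    data = sorted(samples)
    # Always leave at least one sample behind.
    for _ in range(min(n_discard, len(data) - 1)):
        mean = sum(data) / len(data)
        # After sorting, only the first and last samples can be the farthest.
        if mean - data[0] > data[-1] - mean:
            data.pop(0)   # low end is farther from the average
        else:
            data.pop()    # high end is farther (or it is a tie)
    return data
```

Sorting is what makes this cheap: the farthest sample is always at one of the two ends, so each discard only needs two comparisons plus a new average.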
The problem with two dimensions is I don't know how to sort.
Let's ignore efficiency. I can compute the average without sorting. I can
scan the whole list looking for the one that is farthest (radial distance)
from the average. Does that work (and do what I want)? (I think so, but I'm
not sure.)
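The scan described above might look roughly like this in Python. This is only a sketch under some assumptions: the names centroid and trim_outliers_2d and the n_discard count are hypothetical, the average is recomputed after every discard, and the points are assumed to already be in a flat local frame with equal units on both axes (for example metres east and north) rather than raw latitude and longitude in degrees.

```python
import math

def centroid(points):
    """Average position of a list of (x, y) pairs."""
    n = len(points)
    return (sum(x for x, _ in points) / n,
            sum(y for _, y in points) / n)

def trim_outliers_2d(points, n_discard):
    """Drop n_discard points, one at a time, each time removing the point
    with the largest radial distance from the current average.

    n_discard is a hypothetical stopping rule; no sorting is needed.
    """
    pts = list(points)
    # Always leave at least one point behind.
    for _ in range(min(n_discard, len(pts) - 1)):
        cx, cy = centroid(pts)
        # Linear scan for the index of the farthest point.
        worst = max(range(len(pts)),
                    key=lambda i: math.hypot(pts[i][0] - cx, pts[i][1] - cy))
        pts.pop(worst)
    return pts
```

Each discard costs one pass to compute the average and one pass to find the farthest point, so removing k points from n samples takes on the order of n*k operations. The position to use afterwards would be centroid(trim_outliers_2d(points, k)).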
Is there a way to do that efficiently?
--
These are my opinions, not necessarily my employer's. I hate spam.