[time-nuts] clock-block any need ?

Bob Camp lists at rtty.us
Wed Jan 2 03:02:21 UTC 2013


Hi

The problem with your approach is that you can depart from "normal" for very long periods of time. Consider my home network, running NTP to external sources. Around 4 in the afternoon all the kids get home and start streaming video. At 7 I get home and start doing the same thing. We each keep this up for 5 hours. Past midnight, the BitTorrent client fires up and runs through 5 AM. Midday, there's a video conference that runs from home for an hour or two.

Each of these things creates a non-zero load ahead of the NTP packets at some point. Given network congestion and re-transmission, the load will really pile up at various times. Given the high level of transmit/receive asymmetry in my cable modem, it will be pretty hard for me to figure out what's going on.
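
(To put a rough number on that asymmetry effect -- the delay values below are made-up for illustration, not measurements -- NTP computes offset from four timestamps, and any difference between the outbound and inbound path delays shows up as an error of half that difference:)

    # Sketch of how path asymmetry biases the NTP offset estimate.
    # All delay values here are invented for illustration.
    true_offset = 0.0              # client and server clocks actually agree (s)
    d_out = 0.030                  # client -> server delay: slow upstream (s)
    d_in = 0.005                   # server -> client delay: fast downstream (s)

    t1 = 100.000                   # client transmit (client clock)
    t2 = t1 + true_offset + d_out  # server receive (server clock)
    t3 = t2 + 0.001                # server transmit (server clock)
    t4 = t3 - true_offset + d_in   # client receive (client clock)

    est_offset = ((t2 - t1) + (t3 - t4)) / 2
    print(est_offset)              # 0.0125 s = (d_out - d_in) / 2: pure asymmetry error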

The net result will be that my NTP hops around a bit during the day.

Bob

On Jan 1, 2013, at 8:57 PM, Dennis Ferguson <dennis.c.ferguson at gmail.com> wrote:

> 
> On 27 Dec, 2012, at 15:13 , Magnus Danielson <magnus at rubidium.dyndns.org> wrote:
>> On GE, a full-length packet is about 12 us, so a single packet's head-of-line blocking can be anything up to that amount; multiple packets... well, it keeps adding. Knowing how switches work doesn't really help as packets arrive at a myriad of rates; they interact and cross-modulate and create strange patterns and dance in interesting ways that are ever changing in unpredictable fashion.
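>
> (As a quick back-of-the-envelope check of that 12 us figure --
> assuming a maximum-size frame of 1538 bytes on the wire, i.e. a
> 1500-byte payload plus Ethernet header, FCS, preamble and
> inter-frame gap:)
>
>     # Serialization time of one full-length frame on gigabit Ethernet
>     bits = 1538 * 8       # 12304 bits on the wire
>     rate = 1e9            # GE line rate in bits/s
>     print(bits / rate)    # ~1.23e-05 s, i.e. about 12 us per queued frame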
> 
> I wanted to address this bit because it seems like most
> people base their expectations for NTP on this complexity,
> as does the argument being made above, but the holiday
> intervened.  While I suspect many people are thoroughly
> bored of this topic by now I can't resist completing the
> thought.
> 
> Yes, the delay of a sample packet through an output queue
> will be proportional to the number of untransmitted bits in
> the queue ahead of it; yes, the magnitude of that delay can
> be very large and highly variable; and, even, yes, the
> statistics governing that delay may often be unpredictable and
> non-Gaussian, exhibiting dangerously heavy tails.  The thing is,
> though, that this doesn't necessarily have to matter so much.  A
> better approach might avoid relying on the things you can't know.
> 
> To see how, consider a different question: what is the
> probability that any two samples sent through that queue
> will experience precisely the same delay (i.e. find precisely
> the same number of bits queued in front of them when they
> get there)?  I think it is fairly conservative to predict
> that the probability that two samples will arrive at a non-empty
> output queue with exactly the same number of bits in front of
> them will be fairly small; the number of bits in the queue will
> be continuously changing, so the delay through a non-empty queue
> should have a near-continuous (and unpredictable) probability
> distribution, as you point out, and if the sampling is uncorrelated
> with the competing traffic it is unlikely that any pair of
> samples will find exactly the same point on that distribution.
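>
> (A toy Monte-Carlo sketch of that claim -- the backlog model below,
> with its frame sizes and queue depths, is invented purely for
> illustration:)
>
>     import random
>
>     # Busy queue: 1..100 already-queued frames of random size (bytes).
>     def backlog_bits():
>         return sum(random.randint(64, 1538) * 8
>                    for _ in range(random.randint(1, 100)))
>
>     samples = [backlog_bits() for _ in range(10000)]
>     dupes = len(samples) - len(set(samples))
>     print(dupes)   # typically a small fraction of the 10000 samples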
> 
> The exception to this, of course, is a queue length of
> precisely 0 bits (which is precisely why the behaviour
> of a switch with no competing traffic is interesting).  The
> vast majority of queues in the vast majority of network
> devices in real networks are nowhere near continuously
> occupied for long periods.  The time-averaged fractional load
> on the circuit a queue is feeding is also the probability of
> finding the queue non-empty.  If the average load on the
> output circuit is less than 100% then multiple samples are
> probably going to find that queue precisely empty; if the
> average load on the output circuit is 50% (and that would be
> an unusually high number in a LAN, though maybe less
> unusual in other contexts) then 50% of the samples that pass
> through that queue are going to find it empty.  Since samples
> that found the queue empty will have experienced pretty much
> identical delays, the "results" (for some value of "result")
> from those samples will cluster closely together.  The
> results from samples which experienced a delay will
> differ from that cluster but, as discussed above, will also
> differ from each other and generally won't form a cluster
> somewhere else.  The cluster marks the good spot independent
> of the precise (and precisely unknowable) nature of the statistics
> governing the distribution of samples outside the cluster.  If
> we can find the cluster we have a result which does not depend
> on understanding the precise behaviour of samples outside the
> cluster.
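>
> (To make that concrete, here is a toy simulation of a 50%-loaded
> queue -- the uniform busy-delay model and all the numbers are
> invented for illustration:)
>
>     import random
>
>     WIRE = 12.3e-6                 # ~12 us per full frame on GE (see above)
>
>     def sample_delay(load=0.5):
>         if random.random() >= load:
>             return 0.0             # queue found empty: no queueing delay
>         # queue busy: some unpredictable backlog ahead of us
>         return random.uniform(0.0, 50 * WIRE)
>
>     delays = [sample_delay() for _ in range(1000)]
>     cluster = [d for d in delays if d < WIRE / 2]
>     print(len(cluster))            # ~500: half the samples cluster near zero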
> 
> Given this it is also worthwhile to consider "jitter", which
> intuition based on a normal distribution assumption might suggest
> should be predictive of the quality of the result derived from a
> collection of samples.  In the situation above, however, the
> dominant contributors to "jitter", however measured, are going
> to be the samples outside the cluster since they are the ones
> that are "jittering" (it is that property we are relying on to
> define the cluster).  If jitter mostly measures information
> about the samples the estimate doesn't rely on then it tells you
> little about the samples the estimate does rely on, and hence
> can provide no prediction about the quality of an estimate
> derived from those samples alone.  In fact, in a true perversion
> of normal intuition, high jitter and heavy-tailed probability
> distributions might even make it easier to get a good result
> by making it easier to identify the cluster.  Saying "I see
> a lot of jitter" doesn't necessarily tell you anything about
> what is possible.
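>
> (A small numerical illustration of that point, using the same sort
> of toy 50%-loaded queue as above -- again with invented numbers:)
>
>     import random
>     import statistics
>
>     delays = [0.0 if random.random() < 0.5 else random.uniform(0.0, 6e-4)
>               for _ in range(1000)]
>     cluster = [d for d in delays if d < 1e-5]  # samples near the minimum
>
>     print(statistics.stdev(delays))    # "jitter": large, set by the outliers
>     print(statistics.pstdev(cluster))  # spread within the cluster: tiny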
> 
> While the argument gets a lot more complex in a hurry, and
> too much to attempt here (the above is too much already), I
> believe this general approach can scale to an entire network
> of devices with queues (though even the single-switch case has real
> life relevance too).  That is, I think it is possible to find a
> sample "result" for which there is a strong tendency for "good"
> samples to cluster together while "bad" samples are unlikely to do
> so, with the quality of the result depending on the population and
> nature of variability of the cluster but hardly at all on the
> outliers, and with the lack of a measurable cluster telling you
> when you might be better off relying on your local clock rather
> than the network.  The approach relies on the things we do know
> about networks and networking equipment while avoiding reliance on
> things we can't know: it mostly avoids making Gaussian statistical
> assumptions about distributions that may not be Gaussian.  The field
> of robust statistics provides tools addressing this which might
> be of use.
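>
> (One standard tool from that field, sketched below: reject outliers
> using the median and the median absolute deviation, then average
> what survives.  This is a generic robust-statistics device, not
> anything ntpd actually implements:)
>
>     import statistics
>
>     def cluster_estimate(offsets, k=3.0):
>         # Keep samples within k scaled-MADs of the median; average them.
>         med = statistics.median(offsets)
>         mad = statistics.median(abs(x - med) for x in offsets)
>         scale = 1.4826 * mad or 1e-9   # MAD -> sigma for a Gaussian core
>         kept = [x for x in offsets if abs(x - med) <= k * scale]
>         return sum(kept) / len(kept)
>
>     # Four clustered samples and two wild outliers: returns ~0.0010
>     print(cluster_estimate([0.0010, 0.0011, 0.0009, 0.0010, 0.025, -0.04]))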
> 
> I guess it is worth completing this by mentioning what it
> says about ntpd.  First, ntpd knows all of the above, probably
> much, much better than I do, though it might not put it in
> quite the same terms.  If you make the assumption that the
> stochastic delays experienced by samples are evenly distributed
> between the outbound and inbound paths (this is not a good match
> for the real world, by the way, but there are constraints...) then
> round trip delay becomes a stand-in measure of "cluster", and ntpd
> does what it can with this.  The fundamental constraint that limits
> what ntpd can do, in a couple of ways, is the fact that the final
> stage of its filter is a PLL.  The integrator in a PLL assumes
> that the errors in the samples it is being fed are zero-mean and
> normally distributed, and will fail to arrive at a correct answer if
> this is not the case, so if you want to filter out samples that
> are unlikely to satisfy that assumption you need to do it before
> they get to the PLL.  The problem with doing this well, however,
> is that a PLL is also destabilised by adding delay to its feedback
> path, causing errors of a different nature, so any filtering done
> ahead of the PLL is severely limited in the time it can spend, and
> hence in the number of samples it can look at.
> Doing better probably requires replacing the PLL; the "replace
> it with what?" question is truly interesting.
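>
> (For what it's worth, ntpd's existing pre-PLL defence is its clock
> filter, which keeps the last eight (offset, delay) samples and
> trusts the offset measured with the smallest round-trip delay; the
> real filter adds dispersion weighting and spike suppression on top.
> A minimal sketch of just the minimum-delay idea:)
>
>     from collections import deque
>
>     class MinDelayFilter:
>         # Trust the offset from the lowest-delay sample in the window,
>         # since it was the least likely to have queued behind competing
>         # traffic on either path.
>         def __init__(self, depth=8):
>             self.samples = deque(maxlen=depth)
>
>         def update(self, offset, delay):
>             self.samples.append((offset, delay))
>             return min(self.samples, key=lambda s: s[1])[0]
>
>     f = MinDelayFilter()
>     print(f.update(0.0120, 0.030))   # only sample so far -> 0.0120
>     print(f.update(0.0005, 0.004))   # low-delay sample wins -> 0.0005
>
> Note the window is kept short for exactly the reason above: a deeper
> window would mean better filtering but more delay in the PLL's
> feedback path.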
> 
> I suspect I've gone well off topic for this list, however, and for
> that I apologize.  I just wanted to make sure it was understood that
> there is an argument for the view that we do not yet know of any
> fundamental limits on the precision that NTP, or a network time
> protocol like NTP, might achieve, so any effort to build NTP servers
> and clients which can make their measurements more precisely is not
> a waste of time.  It is instead what is required to make progress
> in understanding how to do this better.
> 
> Dennis Ferguson