[time-nuts] clock-block any need ?

Fri Dec 28 19:54:53 UTC 2012

On 27 Dec, 2012, at 11:28 , Attila Kinali <attila at kinali.ch> wrote:
> On Thu, 27 Dec 2012 10:55:12 -0800
> Dennis Ferguson <dennis.c.ferguson at gmail.com> wrote:
> 
>> I don't think I buy this.  It takes 70 milliseconds for a signal
>> transmitted from a GPS satellite to be received on the ground, but
>> we don't use this fact to argue that sub-70 ms timing from GPS is
>> not possible.  If you have a network of high-bandwidth routers and
>> switches doing forwarding in hardware, and carrying no traffic other
>> than the packets you are timing (I have access to lab setups that
>> can meet this description) you can observe packet delivery times that
>> are stable at well under the microsecond level even though the total
>> time required to deliver a packet is much larger.
> 
> I'm not sure about this. Knowing about how switches work internally,
> i'd guess they have "jitter" of something in the range of 1-10us, but
> i've never done any measurements. Have you any hard numbers?

I've measured it for large routers, but the numbers are not mine.  In
a former life I helped design forwarding path ASICs.

I'm interested in what that guess is based on, however, since I can't
imagine where 1-10us of self-generated jitter from an ethernet switch
would come from, if not from competing traffic.  A well-spec'd piece of
silicon to handle 20 Gbps of full-duplex bandwidth needs to be capable
of processing about 40 million packet arrivals per second, or about
one packet every 25 ns.  That's pretty much what is needed to build
a good ~$200, 24 port gigabit ethernet switch. The cheapest hardware
forwarding path to implement, which is generally what you'll find in
there, is a fixed processing pipeline (or pipelines) that takes packets
in at the required rate and spits out the results at that rate delayed by
N chip clock cycles; N might be large (but not too large; N tells you
how many packets it needs to be able to have in process simultaneously
and it is cheaper in logic if you can minimize that number) but it is a
constant.  Your jitter estimate implies that such a switch, even when
not occupied with other traffic, will either sometimes leave a packet
sitting around for between 40 and 400 packet arrival times before getting
around to doing something with it, or else will sometimes do between 40
and 400 packet arrival times worth of extra work to forward the thing.
My experience with this suggests that it is actually easier to build if
it doesn't work like that.  The switch I recently bought for my house,
this one

  <http://www.netgear.com/business/products/switches/prosafe-plus-switches/JGS524E.aspx#>

specifies the total latency (that's total time, not jitter) through the
switch at 4.1 us for 64 byte packets, a precision I expect they
arrived at by just adding up the store-and-forward and fixed pipeline
delays.  Nearly all of the variation in delay is from competing traffic.
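
As a rough check on the arithmetic above, here is a minimal Python
sketch.  The 40 million packets per second figure and the 1-10 us
jitter guess come from the text; the function names and the 1 Gbps,
64 byte serialization example are illustrative assumptions, and the
serialization figure ignores preamble and inter-frame gap.

```python
# Back-of-envelope check of the packet-rate numbers discussed above.
# All names here are illustrative, not taken from any switch datasheet.

PACKETS_PER_SECOND = 40e6  # arrival rate quoted in the text


def packet_interval_ns(pps):
    """Time between back-to-back packet arrivals, in nanoseconds."""
    return 1e9 / pps


def jitter_in_packet_times(jitter_us, pps):
    """Express a jitter figure (microseconds) in packet arrival times."""
    return jitter_us * 1e3 / packet_interval_ns(pps)


def serialization_delay_us(frame_bytes, link_bps):
    """Time to clock one frame onto the wire, in microseconds.

    Assumption: counts only the frame bytes, not preamble or
    inter-frame gap.
    """
    return frame_bytes * 8 / link_bps * 1e6


if __name__ == "__main__":
    print("packet interval (ns):", packet_interval_ns(PACKETS_PER_SECOND))
    for jitter_us in (1, 10):
        print(jitter_us, "us of jitter is",
              jitter_in_packet_times(jitter_us, PACKETS_PER_SECOND),
              "packet arrival times")
    print("64 byte frame at 1 Gbps (us):", serialization_delay_us(64, 1e9))
```

The store-and-forward part of a 64 byte frame at gigabit speed is
about half a microsecond per hop, so most of a quoted 4.1 us total
would have to be fixed pipeline delay under these assumptions.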

Even if 1-10us were observed for individual samples, however, that
still misses the point.  The interesting number is not the variability
of individual samples; it is the stability of the measure of central
tendency derived from many such samples (e.g. the average, if the
variation were gaussian).
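
To illustrate the distinction between per-sample variability and the
stability of the average, here is a small simulation sketch.  The
50 us fixed delay, the 5 us gaussian noise, the batch size and the
seed are all made-up values chosen only for illustration; real
network delay noise is usually not gaussian.

```python
import random
import statistics


def simulate_delays(n, fixed_us=50.0, noise_us=5.0, seed=1):
    """Generate n delay samples: a fixed delay plus gaussian noise.

    The parameter values are hypothetical, not measurements.
    """
    rng = random.Random(seed)
    return [fixed_us + rng.gauss(0.0, noise_us) for _ in range(n)]


def batch_means(samples, batch_size):
    """Average each consecutive, non-overlapping batch of samples."""
    return [statistics.fmean(samples[i:i + batch_size])
            for i in range(0, len(samples) - batch_size + 1, batch_size)]


if __name__ == "__main__":
    delays = simulate_delays(100_000)
    means = batch_means(delays, 1000)
    print("std dev of individual samples (us):", statistics.stdev(delays))
    print("std dev of 1000-sample averages (us):", statistics.stdev(means))
```

With independent gaussian noise the spread of the averages shrinks
roughly as the square root of the batch size, which is why several
microseconds of per-sample scatter does not by itself rule out a much
more stable estimate.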

>> If you add competing
>> traffic, like real life networks, the packet-to-packet variability
>> becomes much worse, but this is sample noise that can be addressed
>> by taking larger numbers of samples and filtering based on the expected
>> statistics of that noise.
> 
> Here lies the big problem. While with GPS we pretty much know what
> the time is that the signal takes to reach earth, we have no clue
> with network packets in a loaded network. We don't even have an
> idea what the packet transmit distribution is in the moment we are
> doing our measurements. Neither the queue length in the router/switch
> nor anything else. And the loading of a switch changes momentarily
> and this in turn changes the delay of our packets. You can of course
> apply math and try to get rid of quite a bit of noise, but you will
> never get rid of it down to ns levels.

?? NTP is a two-way time transfer.  We directly measure how long the
cumulative queue lengths are for the round trip for each sample, and we
hence directly measure how this changes from sample to sample.  There are
also good statistical models for the average behaviour of such queues when
operating at traffic levels where packet losses are rare and where the
bandwidth is not being significantly consumed by a small number of large,
correlated, flows, which is the usual operating state for both local
networks and Internet backbones (it is usually access circuits that are
the problem) and there are heuristics one can use to determine when the
statistics are not likely to be so nice; these are of use when designing
the thing which has the queues.  What we haven't had is hosts and servers
capable of making precise measurements of packet arrivals and departures
(why is a ping round trip reported to be 200 us or 400 us when the packet
spends less than 50 us in the network between the machines?) or of
external reference time sources like GPS, nor, really, any good way to
measure, and hence improve, the quality of the end result we want, which
is the time on the client's clock.
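
For reference, the two-way measurement can be sketched with the
standard four-timestamp offset and delay calculation.  The formulas
are the usual NTP on-wire ones; the class and function names are
illustrative, and picking the lowest-delay sample is shown only as one
simple filtering heuristic, not as a description of what any
particular implementation does.

```python
from typing import NamedTuple


class Exchange(NamedTuple):
    t1: float  # client transmit time, read from the client clock
    t2: float  # server receive time, read from the server clock
    t3: float  # server transmit time, read from the server clock
    t4: float  # client receive time, read from the client clock


def round_trip_delay(e):
    """Time spent in the network: total elapsed minus server hold time."""
    return (e.t4 - e.t1) - (e.t3 - e.t2)


def clock_offset(e):
    """Server clock minus client clock, assuming symmetric path delay."""
    return ((e.t2 - e.t1) + (e.t3 - e.t4)) / 2


def best_sample(exchanges):
    """Pick the exchange with the least round-trip delay (least queueing)."""
    return min(exchanges, key=round_trip_delay)


if __name__ == "__main__":
    # Hypothetical numbers: server is 0.5 s ahead, 1 ms each way.
    quiet = Exchange(t1=10.0, t2=10.501, t3=10.502, t4=10.003)
    # Same exchange with 4 ms of extra queueing on the return path.
    queued = Exchange(t1=10.0, t2=10.501, t3=10.502, t4=10.007)
    for e in (quiet, queued):
        print(round_trip_delay(e), clock_offset(e))
    print(best_sample([queued, quiet]) == quiet)
```

Extra queueing shows up directly as a larger round-trip delay, and
asymmetric queueing biases the offset by half of the extra delay, which
is why the delay measured on each sample is useful for deciding how
much to trust that sample.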

Since we're now starting to see computers with peripherals which address
some of these measurement problems really well (hardware time stamping
for packets, hardware PPS timestamp capture) at the small 10's of
nanoseconds level, what bothers me is the argument that there is no
use trying to make use of this, other than for timenut bragging purposes,
since NTP can't operate at anywhere near that level.  To me this argument
is near perfect in its circularity.

> If i'm not mistaken, IEEE1588v1 had exactly that problem. They tried to
> use "normal" switches and hoped the jitter would be predictable enough to
> get compensated for. This didn't work out, so v2 now demands that all
> switches act as border clocks.

Yes, NTP will never match a properly implemented PTP, but then again the
claims for what a properly implemented PTP can do still leave a lot of
room between there and a microsecond.

While PTP was originally conceived as a consumer networking thing, note that
the major use of PTP, and one driving its design, has turned out to be in
telecommunications networks where the replacement of traditional, finely-clocked,
carrier circuits with ethernet for backhaul has deprived the thing at the far
end of the backhaul circuit (say, a GSM/UMTS base station) of the frequency
reference it formerly relied on.  The requirements for this application are
stringent enough that the failure of 1588v1 to meet them cannot be construed
as saying anything of practical importance about the ability of something
that works like 1588v1 to set your computer's clock, other than it won't do
as well as a well done 1588v2.

>> As this level of synchronization is
>> usually achieved by the brute force method of measuring transit times
>> across every network device on the path from source to destination I
>> have no doubt that what NTP can do will necessarily be worse than this,
>> but I don't know of a basis that would predict whether NTP's "worse"
>> is necessarily going to be 10,000x worse or can be just 10x worse.
>> Knowing that would require actually trying it to measure what can be
>> done.
> 
> You can guesstimate that getting below 200us is not easy in a normal
> network, but sub-1ms should be possible unless the network is very loaded.

So how did you compute the 200 us guess?  I know of no basis for that
prediction.

If you look in Dr. Mills's NTP book, towards the end, you'll find a
plot of the Allan deviation of several apparently perfectly vanilla
computer clocks against an NTP reference (i.e. across a network).  This
is a quite old result (the better machine is a DEC Alpha) so the NTP
timestamps are certainly being taken in software using 1990's computer
technology.  The minimum Allan deviation is about 10^-8 at about 1000
seconds, not numbers that are going to impress anyone, but numbers that
are still the raw material for an average 10 us clock maintained with an
NTP time reference, with an old system and a nothing-special clock (I think
the machines must have been kept in an air conditioned room to eliminate
systematic oscillator variations well enough to produce such a pretty
plot, though).  And, in fact, the 10 us might well be in part reflecting
the stability of the NTP server clock at the state of the art then,
rather than the network, so the number with a more precise server and
the same network might have been better still.
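
One plausible reading of how 10^-8 at 1000 seconds turns into roughly
10 us is the product of the Allan deviation and the averaging time.
The sketch below computes an overlapping Allan deviation from phase
(time error) samples and that product; the function names are
illustrative and no data from the book is reproduced here.

```python
import math


def allan_deviation(phase_s, tau0_s, m):
    """Overlapping Allan deviation at tau = m * tau0_s.

    phase_s: time-error samples in seconds, spaced tau0_s apart.
    """
    n = len(phase_s)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("need at least 2*m + 1 phase samples")
    tau = m * tau0_s
    # Sum of squared second differences of the phase at spacing m.
    total = sum((phase_s[i + 2 * m] - 2 * phase_s[i + m] + phase_s[i]) ** 2
                for i in range(n - 2 * m))
    return math.sqrt(total / (2 * (n - 2 * m) * tau ** 2))


def time_error_scale_s(adev, tau_s):
    """Rough time-error scale implied by an Allan deviation at tau_s."""
    return adev * tau_s


if __name__ == "__main__":
    # Figures quoted in the text: about 1e-8 at about 1000 seconds.
    print("time error scale (s):", time_error_scale_s(1e-8, 1000.0))
```

A constant frequency offset contributes nothing to this statistic,
since it cancels in the second difference; only the instability of the
frequency does.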

So the logical question might be why these measurements indicate that he
had the raw material for a 10 us, NTP synchronized clock, but one seldom
seems to see anything that good when running ntpd?  I guess I'd just
point out that the difference between ntpd and the Allan deviation
measurements he shows is that ntpd wasn't running when the latter were
made; the difference is ntpd.  What this suggests to me is that either
the things ntpd shows you about what it is doing do not reflect the
actual quality of its end product (the synchronization of the computer's
clock), or that ntpd does not make good use of the raw material available
to it.  In either case, if you are making your prediction by looking
at what ntpd says and trying to extrapolate from that to what is possible
(or even what is currently happening) you may be fooling yourself.

There are still many things to learn here.

Dennis Ferguson