Quantcast
Channel: Johannes Weber – Weberblog.net
Viewing all articles
Browse latest Browse all 311

Monitoring a DCF77 NTP Server

$
0
0

Now that you’re monitoring the Linux operating system as well as the NTP server basics, it’s interesting to have a look at some more details about the DCF77 receiver. Honestly, there is only one more variable that gives a few details, namely the Clock Status Word and its Event Field. At least you have one more graph in your monitoring system. ;)

This article is one of many blogposts within this NTP series. Please have a look!

There is an event field within the clock status word that consists of these values:

  • 0 nominal
  • 1 no reply to poll
  • 2 bad timecode format
  • 3 hardware or software fault
  • 4 signal loss
  • 5 bad date format
  • 6 bad time format

You can query this via ntpq and its “clockvar” request. Have a look at the first output line. The fourth number after “status=” is the event field, in this example a zero:

weberjoh@vm01-mrtg:/etc/mrtg$ ntpq -c clockvar ntp1.weberlab.de
associd=0 status=0040 , 4 events, clk_unspec,
device="RAW DCF77 CODE (Conrad DCF77 receiver module)",
timecode="-#-####--#--###---M-S12--1-4p-2--1-p1----2--412---1--81---p",
poll=1398, noreply=0, badformat=4, baddata=0, fudgetime1=884.000,
stratum=0, refid=DCFa, flags=0,
refclock_time="e03df9af.00000000  Thu, Mar 21 2019 11:53:19.000",
refclock_status="TIME CODE; (LEAP INDICATION; CALLBIT)",
refclock_format="RAW DCF77 Timecode",
refclock_states="*NOMINAL: 1d+00:43:00 (99.54%); BAD FORMAT: 00:06:51 (0.45%); running time: 1d+00:49:51"

Using grep ‘n sed again you can exempt this single event field id from all the others:

weberjoh@vm01-mrtg:/etc/mrtg$ ntpq -c clockvar ntp1.weberlab.de | grep associd | sed s/.*status....// | sed s/,.*//
0

In the end, my MRTG config for this looks like this:

###############################################################
################ Clock Status für DCF77 0-6 ###################
###############################################################
Target[ntp1-dcf77-clockvar]: `ntpq -c clockvar ntp1.weberlab.de | grep associd | sed s/.*status....// | sed s/,.*// && echo 0`
MaxBytes[ntp1-dcf77-clockvar]: 6
Title[ntp1-dcf77-clockvar]: Status Code Clockvar -- ntp1-dcf77
Options[ntp1-dcf77-clockvar]: gauge integer
YLegend[ntp1-dcf77-clockvar]: Status Code Clockvar 0-6
Legend1[ntp1-dcf77-clockvar]: Status Code
Legend3[ntp1-dcf77-clockvar]: Peak Status Code
LegendI[ntp1-dcf77-clockvar]: Status:
ShortLegend[ntp1-dcf77-clockvar]:  
routers.cgi*Options[ntp1-dcf77-clockvar]: fixunit nototal noo nomax
routers.cgi*ShortDesc[ntp1-dcf77-clockvar]: Status DCF77 Clockvar ntp1-dcf77
#routers.cgi*WithPeak[ntp1-dcf77-clockvar]: none
routers.cgi*Icon[ntp1-dcf77-clockvar]: tick-sm.gif
routers.cgi*Comment[ntp1-dcf77-clockvar]: 0=nominal, 1=no reply, 2=bad timecode, 3=hard or software fault, 4=signal loss, 5=bad date, 6=bad time

Note that I am using the “routers.cgi*Comment” option to display a single line with some comments underneath the RRD graphs:

Of course, this clockvar graph correlates with other stats out of the DCF77 NTP server, such as the reach graph or the iostats. For example, I had this “bad timecode” in the above graph for about 1 hour. Similarly, the other graphs show some suspicious values as well, while you know now the actual root cause, namely “bad timecode”.

However, to be honest, I am not looking at this clockvar graph that much, since the reach already shows me whether my server is up and running or not. But, you know, because I can… ;)

Featured image “Dusty” by Thomas Hawk is licensed under CC BY-NC 2.0.


Viewing all articles
Browse latest Browse all 311

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>