All posts by Anders Håål

bischeck integration with Nagios remote data processor, NRDP

Until now, NSCA has been the only way to integrate passive service checks with Nagios. From a functional perspective it has worked great using the jsendnsca package, http://code.google.com/p/jsendnsca. With the next version we will also support NRDP, Nagios Remote Data Processor. NRDP has some nice benefits, such as a pure web interface and batch sending of passive checks.

If anyone would like this immediately please send us an email and we could make a 0.4.1 version.
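
To give an idea of what batch sending of passive checks could look like, here is a minimal sketch that builds one XML document containing several service check results. The element names follow the NRDP XML check result format as we understand it, so verify them against the NRDP documentation before relying on them. The class NrdpPayloadSketch and its CheckResult type are hypothetical names made up for this illustration and are not part of bischeck.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not bischeck code: builds a batch of passive check results as XML. */
class NrdpPayloadSketch {

    /** One passive service check result; the field names are illustrative. */
    static final class CheckResult {
        final String host;
        final String service;
        final int state;      // Nagios state: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        final String output;

        CheckResult(String host, String service, int state, String output) {
            this.host = host;
            this.service = service;
            this.state = state;
            this.output = output;
        }
    }

    /** Escapes the characters that would otherwise break the XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Serializes all results into one document so they can be sent in a single request. */
    static String toXml(List<CheckResult> results) {
        StringBuilder xml = new StringBuilder("<?xml version='1.0'?>\n<checkresults>\n");
        for (CheckResult r : results) {
            // Assumption: type='service' and checktype='1' (passive) as in the NRDP format.
            xml.append("  <checkresult type='service' checktype='1'>\n")
               .append("    <hostname>").append(escape(r.host)).append("</hostname>\n")
               .append("    <servicename>").append(escape(r.service)).append("</servicename>\n")
               .append("    <state>").append(r.state).append("</state>\n")
               .append("    <output>").append(escape(r.output)).append("</output>\n")
               .append("  </checkresult>\n");
        }
        xml.append("</checkresults>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        List<CheckResult> batch = Arrays.asList(
                new CheckResult("host1", "web", 0, "OK"),
                new CheckResult("host1", "db", 2, "CRITICAL: no connection"));
        System.out.println(toXml(batch));
    }
}
```

The sketch only covers building the payload. Posting it over HTTP to the NRDP server, including authentication, is left out.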

bischeck 0.4.0 released

Today we announce bischeck 0.4.0. The release has been tested in a production environment for two months. Upgrading from 0.3.0 and 0.4.0_RC2 is supported. No major changes have been made since RC2. Full documentation and download are available.

New features

  • [FR-197] Support for integration with multiple, different surveillance and monitoring systems. With version 0.4.0 bischeck is no longer limited to sending data to Nagios. It can now send data to multiple Nagios servers and to other servers like OpenTSDB. This is done by moving server formatting and protocol handling to server integration classes that implement the interface com.ingby.socbox.bischeck.servers.Server. The server integration is described in the XML configuration file servers.xml. This also means that some Nagios NSCA-specific properties previously configured in properties.xml have been moved to the NSCA section of servers.xml. The OpenTSDB server class should be regarded as beta.
  • [FR-202] The implementation of running bischeck once, in non-daemon mode, has changed so that the same code is used as when running in daemon mode. The only difference is the initialization of triggers, so that all service items are run directly and just once.
  • [FR-204] The bischeck cache is saved when the bischeck daemon is shut down and reloaded on bischeck startup. Keeping the cache persistent between restarts is important since 0.4.0 supports time-based cache retrieval. The current limitation is that the data will not be saved if the daemon crashes or is killed by a signal that cannot be caught. This will be improved in future versions.
  • [FR-218] The bischeck daemon can now reload the configuration without a process restart. This is supported through the JMX operation “reload”. The feature reduces the need for operating system access and authorization.
  • [FR-219] Bischeck can now retrieve state and performance data from a Nagios server supporting livestatus. The service class LivestatusService sets up a connection over livestatus, and with the serviceitem class LivestatusServiceItem state and/or performance data can be retrieved from a Nagios service. This can be useful when creating virtual services in bischeck or in complex thresholds.
  • [FR-220] Bischeck now supports one additional scheduling method, where a service can be scheduled to run after a different service has executed. This can be useful when a service depends on data from another service for its thresholds or execution statement.
  • [FR-221] Cache retrieval by time offset is now supported, returning the cache element nearest to the time offset.
  • Cache data can be retrieved as a list of elements based both on index and time.
  • Support for additional mathematical functions like average, min and max calculations on lists of elements.
  • Bischeck now supports the use of cached data in the execution statement of a serviceitem. This is typically useful when a serviceitem's execution statement depends on data from another service. For example, in a SQL query string:
    select value from table1 where id = host1-web-state[0] and createdate = '%%yyyy-MM-dd%%'
  • Added support for Linux distributions other than Red Hat-based ones. bischeck should now install on Debian 6 and Ubuntu 10/11.
  • Configuration listing. The configuration listing has been moved from the ConfigurationManager class to the DocManager class. Currently HTML and text listings are supported. The generated configuration data is by default placed in the bischeckdoc directory.
  • A service can be configured not to send its data to the configured monitoring servers, like Nagios. This can be useful if the service is only used to create virtual services or only used in thresholds.
  • The bischeck script now supports JMX authentication. The authentication files are located in the etc directory and named jmxremote.password and jmxremote.access. By default, authentication is disabled by the system property
    “-Dcom.sun.management.jmxremote.authenticate=false”. To enable authentication set the property to true. For more info about JMX see
    http://www.oracle.com/technetwork/java/javase/tech/javamanagement-140525.html.
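
The time-based cache retrieval in FR-221 and the calculations on lists of cached elements can be illustrated with a small sketch. This is not the bischeck cache implementation. It is a simplified model that assumes each cached value is stored with a timestamp in milliseconds, and the class and method names (TimeCacheSketch, nearest, average) are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch, not bischeck code: a cache keyed by timestamp. */
class TimeCacheSketch {

    // Sorted by timestamp so that neighbours of a target time can be found quickly.
    private final TreeMap<Long, Double> byTimestamp = new TreeMap<>();

    void add(long timestampMillis, double value) {
        byTimestamp.put(timestampMillis, value);
    }

    /** Returns the value closest in time to (now - offset), or null if the cache is empty. */
    Double nearest(long nowMillis, long offsetMillis) {
        long target = nowMillis - offsetMillis;
        Map.Entry<Long, Double> before = byTimestamp.floorEntry(target);
        Map.Entry<Long, Double> after = byTimestamp.ceilingEntry(target);
        if (before == null) {
            return after == null ? null : after.getValue();
        }
        if (after == null) {
            return before.getValue();
        }
        // On a tie the older element wins; this is an arbitrary choice in this sketch.
        return (target - before.getKey() <= after.getKey() - target)
                ? before.getValue()
                : after.getValue();
    }

    /** Average of all values with timestamps in [from, to]; NaN if there are none. */
    double average(long fromMillis, long toMillis) {
        return byTimestamp.subMap(fromMillis, true, toMillis, true).values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        TimeCacheSketch cache = new TimeCacheSketch();
        cache.add(1000L, 10.0);
        cache.add(2000L, 20.0);
        cache.add(4000L, 40.0);
        // Target time is 5000 - 2800 = 2200, so the element stored at 2000 is nearest.
        System.out.println(cache.nearest(5000L, 2800L));
        System.out.println(cache.average(1000L, 2000L));
    }
}
```

A sorted map keeps the lookup simple: the nearest element is always one of the two neighbours of the target time. Min and max over a time range could be added in the same way as the average.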

Bugs fixed and important issues

  • In previous versions the Twenty4Thresholds class did not calculate the linear equation correctly if an expression-based threshold was defined. Let us illustrate the errors with this example from the 24thresholds.xml configuration file, which mixes static and expression-based thresholds.
    ....
    <!-- 12:00 -->
    <hour>7000</hour>
    <!-- 13:00 -->
    <hour>testhost-testservice-testitem[1] / 3</hour>
    <!-- 14:00 -->
    <hour>testhost-testservice-testitem[1] / 2</hour>
    <!-- 15:00 -->
    <hour>testhost-testservice-testitem[1] + 1000 </hour>
    <!-- 16:00 -->
    <hour>12000</hour>
    ....

    In the previous version the threshold value between 12:00 and 13:00 would be null since it was a mix of static and expression based thresholds. And between 15:00 and 16:00 the threshold would have been calculated as “testhost-testservice-testitem[1] + 1000” independent of the time between 15:00 and 16:00.

    Now the linear equation will correctly be calculated with any mix of static and expression based definitions. In the above example the calculated threshold for 12:20 will now be:

    20*((testhost-testservice-testitem[1]/3) - 7000)/60 + 7000
    This fix will improve the correctness and also the capability of threshold adaptivity.
  • The Service interface has a number of new methods that should have been there from the beginning. If you have developed your own service class you need to add these, but if you inherit from ServiceAbstract it is handled for you. The new methods are:
    public NAGIOSSTAT getLevel();
    public void setLevel(NAGIOSSTAT level);
    public boolean isConnectionEstablished();
    public void setConnectionEstablished(boolean connected);
    public Boolean isSendServiceData();
    public void setSendServiceData(Boolean sendServiceData);
  • Property cacheclear is renamed to thresholdCacheClear.
  • All NSCA-related properties have been moved from properties.xml to servers.xml when used for the NSCAServer class. The new property names have also gone through some minor changes. When upgrading, servers.xml must be updated manually with the current settings of the NSCA-related properties in properties.xml. We recommend removing them from properties.xml afterwards.
  • All JAXB generated configuration classes now support serialization.
  • Quartz jar is upgraded from 2.0.1 to 2.1.5.
  • [TR-216] “Shutdown is automatic triggered”
  • [TR-217] “Configuration Manager initialization failed with java.lang.NullPointerException”
  • [TR-207] “sudo in bischeckd script cause problem at boot”
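
The linear equation described in the Twenty4Thresholds fix above can be sketched as a plain interpolation between the threshold at the start of the hour and the threshold at the next hour. This is only an illustration of the arithmetic and not the actual Twenty4Thresholds code. The class name HourlyThresholdSketch is hypothetical, and the value 30000 for testhost-testservice-testitem[1] is an assumed example value.

```java
/** Hypothetical sketch, not bischeck code: linear interpolation between two hourly thresholds. */
class HourlyThresholdSketch {

    /**
     * Threshold at the given minute of an hour, where startOfHour and nextHour are the
     * already evaluated thresholds (static values or expression results) for the two hours.
     */
    static double interpolate(double startOfHour, double nextHour, int minute) {
        if (minute < 0 || minute > 59) {
            throw new IllegalArgumentException("minute must be between 0 and 59");
        }
        return minute * (nextHour - startOfHour) / 60.0 + startOfHour;
    }

    public static void main(String[] args) {
        double cachedItem = 30000.0;        // assumed value of testhost-testservice-testitem[1]
        double at12 = 7000.0;               // static threshold for 12:00
        double at13 = cachedItem / 3;       // expression-based threshold for 13:00
        // 20 * (10000 - 7000) / 60 + 7000 = 8000
        System.out.println(interpolate(at12, at13, 20));
    }
}
```

Since both hour values are evaluated to numbers before interpolating, it does not matter whether they come from a static definition or an expression.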

Bischeck on Nagios World Conference 2012

Once again it's time for the Nagios World Conference in St Paul. This time it is 3 days with lots of good stuff, http://www.nagios.com/events/nagiosworldconference/northamerica/2012/. If you would like to know more about dynamic and adaptive thresholds, come and join my presentation on the third day, http://www.nagios.com/events/nagiosworldconference/northamerica/2012/speakers/#ahaal.

Looking forward to meeting you all in St Paul.

Manage bischeck from bisconf web

To manage the bischeck daemon from the bisconf web UI, the following configuration needs to be added to the /etc/sudoers file:

Defaults:username !requiretty
nagios ALL= NOPASSWD: /etc/init.d/bischeckd restart
nagios ALL= NOPASSWD: /etc/init.d/bischeckd start
nagios ALL= NOPASSWD: /etc/init.d/bischeckd stop
nagios ALL= NOPASSWD: /etc/init.d/bischeckd pidstatus

Bisconf 0.1.0 must run on the same server as bischeck. Hopefully this may change in the future.