OpenNMS Release Notes


Published in

Copyright © 2004 Blast Internet Services, Inc.,
www.opennms.org

Preface

OpenNMS is the creation of numerous people and organizations, operating under the umbrella of the OpenNMS project. The original code base was developed and published under the GPL by the Oculan Corporation until 2002, when the project administration was passed on to Tarus Balog.

The current corporate sponsor of OpenNMS is Blast Internet Services, which also owns the OpenNMS trademark.

The OpenNMS Project strives to remain independent, and includes contributions from people outside of Blast. Please visit the OpenNMS website for more information.

OpenNMS is a derivative work, containing original code, included code, and modified code that was published under the GNU General Public License. Please see the source for detailed copyright notices, but some notable copyright owners are listed below:

Please send any omissions or corrections to this document to Tarus Balog.

1. Introduction
About This Release

OpenNMS 1.1.3 is a major milestone on the way to the next stable release, 1.2. It contains a number of improvements, especially under the covers, and is the first release created under the new build system.

OpenNMS 1.1.2 adds several new features and bug fixes. New features include a JDBC poller, a script-based poller and script-based event handler, as well as contributed support for a map.

OpenNMS 1.1.1 is the next step towards the production 1.2 release. It contains a number of new features and bug fixes.

OpenNMS 1.1 extends the work that was begun with 1.0 to make OpenNMS more powerful and easier to use. Almost all of the new functionality was suggested by current OpenNMS users. It is hoped that these improvements will prove useful, and will lead to even more suggestions on how to improve the product.

OpenNMS 1.0.2 is a maintenance release that fixes several code issues.

OpenNMS 1.0.1 is a maintenance release that fixes several code issues.

Please let us know about any problems you encounter by reporting them on the OpenNMS Bugzilla page.

2. What's New?
Changes in This Release of OpenNMS

OpenNMS 1.1.0 represents a refinement of the functionality introduced in 1.0.0. The 1.1 tree is a development, or "unstable" tree, implementing a number of new features but without the testing that went into 1.0. When 1.1 is mature enough, it will become 1.2, the next production or "stable" release. This release, 1.1.3, is a major step toward that next stable release.

Changes in OpenNMS 1.1.3

A tremendous amount of work has been done "under the covers" to OpenNMS, and the following features were added in 1.1.3:

Support for Duplicate IP Addresses

Prior to this release, the algorithm that OpenNMS used to determine if a particular interface belonged to a particular node was simple. An SNMP walk was done on the device, and all of the IP addresses on that device were associated with the node. If that walk discovered a "duplicate" address, say from a private network or some backup link, it would assume that all of the addresses on that device belonged to the device that was discovered with that IP address first.

This could result in "merged" nodes, especially in environments with HSRP.

This release now supports duplicate IP addresses. The nodes will not be merged and an event will be generated.

Note that networks aren't supposed to have duplicate IP addresses, and OpenNMS cannot tell them apart when polling. In other words, if there are two "10.1.1.1" addresses on a network, and OpenNMS sends a "ping" to 10.1.1.1, it will assume that a response means that interface 10.1.1.1 is "up", regardless of which "10.1.1.1" interface responds.

Since this feature was mainly written to support inactive or unreachable interfaces that were discovered by SNMP, this behavior shouldn't present a problem, although it does have the added benefit of being able to monitor highly available IP addresses.

For example, if your website lives at 10.1.1.1, which lives on two devices, as long as an HTTP request to 10.1.1.1 is answered (by either machine) OpenNMS will mark the service as up.

New Asset Configuration "Categories"

The rules used in categories and filters, usually along the lines of

<rule>IPADDR IPLIKE *.*.*.*</rule>
are actually quite flexible, and can be built on almost anything in the database. However, it would be nice to have an easy way to place a particular device into a category for display on the main page, notifications, etc.

There are four categories:

Note that there is no "hard coded" meaning to these categories; you could use "poller" for "threshold", etc. They are just labeled for convenience.

How would you use them? Well, you would need to modify the <filter> or <rule> tags in the configuration files. Suppose you had two types of polling packages, like "Gold" and "Silver". You would then have a filter like <filter>pollerCategory == "Gold"</filter> for that package. By just adding the name "Gold" or "Silver" to the proper category on the asset screen you can place a particular device into that poller package.

Note that once you have sorted all of your devices, you will need to restart OpenNMS for the poller to reload the proper configuration.
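To make the category mechanism a little more concrete, here is a minimal Python sketch of how a filter such as pollerCategory == "Gold" splits devices into packages. It is an illustration only, not OpenNMS code: the asset records, the addresses and the helper name nodes_in_package are hypothetical, and the exact, case-sensitive comparison is an assumption. The real filter is evaluated by OpenNMS against its database.

```python
# Hypothetical asset records: one dict per device (not the real OpenNMS schema).
ASSETS = [
    {"ipaddr": "10.1.1.1", "pollerCategory": "Gold"},
    {"ipaddr": "10.1.1.2", "pollerCategory": "Silver"},
    {"ipaddr": "10.1.1.3", "pollerCategory": "Gold"},
    {"ipaddr": "10.1.1.4", "pollerCategory": ""},  # no category assigned
]


def nodes_in_package(assets, category):
    """Return the addresses whose pollerCategory equals `category`.

    Mirrors the effect of <filter>pollerCategory == "Gold"</filter>:
    a device joins the package only when the asset field matches.
    Exact, case-sensitive matching is an assumption of this sketch.
    """
    return [a["ipaddr"] for a in assets if a.get("pollerCategory") == category]


print(nodes_in_package(ASSETS, "Gold"))    # ['10.1.1.1', '10.1.1.3']
print(nodes_in_package(ASSETS, "Silver"))  # ['10.1.1.2']
```

Moving a device from one package to the other is then only a matter of changing the value typed into its category field.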

Added an XML RPC daemon

One user of OpenNMS has integrated it into their provisioning/billing/support package. They use multiple instances of OpenNMS to poll the services on their network, and all of these instances talk to a single database. By sending events to eventd, they can effect changes in how these devices are polled (without a restart).

In order to alert this system to events from OpenNMS, like "nodeLostService", we send events out via xmlrpcd.

Unfortunately, there isn't time to describe in detail how this system works, but it will be documented as soon as possible (the hope is by 1.2).

Support for Java 1.4.2

People who used previous versions of OpenNMS on Java 1.4.2 found out that it would use up all of the resources on the system and then die. This turned out to be due to a very obscure bug in Java. The code was rewritten to avoid this, and we now recommend that OpenNMS be run on Java 1.4.2.

MIB Compiler for Data Collection

John Rodriguez has created a great MIB Compiler to convert native MIB information into a format that can be used by datacollection-config.xml.

We hope to integrate it into the web UI in the future, but for now it is located in the contrib directory under mibparser.

That directory contains the complete code (in Java) as well as a helpful README. In a nutshell, this is how you would use it.

Change into the dist directory and run the parseMib.sh wrapper script. The format is:

Usage: parseMib.sh <MIB File 1> [<MIB file 2>...]
Example: parseMib.sh RFC-1213.my

Thus:

$ ./parseMib.sh /usr/share/snmp/mibs/RFC1213-MIB.txt 
    Looking for a good java...
    Using java in user's path...
    Checking Java version for 1.4+...
    Version is: java version "1.4.2_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
Java HotSpot(TM) Client VM (build 1.4.2_04-b05, mixed mode)
    Checking for JAVA_HOME...
    JAVA_HOME not set, trying to find it...
    JAVA_HOME set to: .
    Calling parser...

will generate output that is very familiar to people used to modifying datacollection-config.xml. For example:

<mibObj oid=".1.3.6.1.2.1.11.1" instance="0" alias="snmpInPkts" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.2" instance="0" alias="snmpOutPkts" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.3" instance="0" alias="snmpInBadVersions" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.4" instance="0" alias="snmpInBadCommunityNamesTOOLONG" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.5" instance="0" alias="snmpInBadCommunityUsesTOOLONG" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.6" instance="0" alias="snmpInASNParseErrs" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.8" instance="0" alias="snmpInTooBigs" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.9" instance="0" alias="snmpInNoSuchNames" type="Counter" />

This could be put into a new MIB group, snmp-stats or some such, directly without having to explore the MIB by hand.

I love this app.

There are some caveats. In using this I have sometimes seen errors where the MIB compiler could not find a referenced variable because it is defined in another MIB file. Simply put that other MIB file first in the list of MIBs to parse.

I have also come across some MIBs that define custom object types, and the parser doesn't handle them all that well. It is often possible just to delete the offending line from the MIB file (after making a copy, of course) and try again.

OpenNMS can only handle numeric data types, or DisplayStrings that can be converted into numbers, so keep that in mind when choosing which values to collect.

We use RRDTool, and RRDTool has a 19 character limit on filenames (the part before .rrd). Since the "alias" field becomes the file name, you cannot have an alias longer than 19 characters. The parser will append "TOOLONG" to overlength aliases, and you can edit them by hand (it would be possible to truncate the name, but you cannot have duplicate aliases and that might occur).

Finally, OpenNMS can handle a numeric instance (0, 1, 2, ... etc.) or an instance of "ifIndex". So an instance of "tcpConnState" would cause an error.
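The last two caveats are easy to check mechanically before pasting parser output into datacollection-config.xml. The following Python sketch is one way to do it; it is not part of the MIB compiler. The function name check_mib_obj is made up for this example, and the third sample line (with its type value) is a hypothetical entry added only to show the instance check.

```python
import xml.etree.ElementTree as ET

MAX_ALIAS_LENGTH = 19  # alias limit described in the caveats above


def check_mib_obj(line):
    """Return a list of problems found in one <mibObj .../> line."""
    obj = ET.fromstring(line)
    alias = obj.get("alias", "")
    instance = obj.get("instance", "")
    problems = []
    if len(alias) > MAX_ALIAS_LENGTH:
        problems.append(
            f"alias '{alias}' is {len(alias)} characters (limit {MAX_ALIAS_LENGTH})"
        )
    # Only a numeric instance or the literal "ifIndex" is accepted.
    if not (instance.isdigit() or instance == "ifIndex"):
        problems.append(f"instance '{instance}' is neither numeric nor ifIndex")
    return problems


lines = [
    '<mibObj oid=".1.3.6.1.2.1.11.1" instance="0" alias="snmpInPkts" type="Counter" />',
    '<mibObj oid=".1.3.6.1.2.1.11.4" instance="0" alias="snmpInBadCommunityNamesTOOLONG" type="Counter" />',
    # Hypothetical entry, used here only to trigger the instance check.
    '<mibObj oid=".1.3.6.1.2.1.6.13.1.1" instance="tcpConnState" alias="tcpConnState" type="Integer" />',
]
for line in lines:
    for problem in check_mib_obj(line):
        print(problem)
```

Any alias that still carries the "TOOLONG" suffix will be reported, which makes it a handy reminder of which names remain to be shortened by hand.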

New Build System

Our goal is to make OpenNMS as pure Java as possible. However, for a variety of reasons we can't do that yet. When OpenNMS was started, ant (the program used to build other Java programs) had limitations that had to be worked around. This resulted in a workable, but somewhat confusing, build system.

DJ Gregor (building on work started by Edwin Buck) rebuilt the build system, making it almost pure ant. This was great for those doing development, so hats off to Deej and Edwin.

New NTP Poller

Mike Huot has written a new NTP poller. You'll notice it in capsd-configuration.xml and poller-configuration.xml. Hats off to Mike.

Nice Little Things

Small additions that deserve mention:

Bug Fixes

We have lots of bug fixes. I'm really tired. Check out the CHANGELOG.

2.1. Changes in OpenNMS 1.1.2 and Above

The following features were added in 1.1.2:

Poller Improvements

There are three new pollers available:

Ssh: Previously, the SSH service was polled and discovered using the generic TCP class. This worked fine, except that SSH expects a version string to be sent with the query. Omitting it causes numerous log entries, so the TCP class was adapted into an SSH class that sends the correct version string.

JDBC: The database pollers also use the TCP class to connect to well known ports. Jose Nunez Vicente Zuleta created a poller that uses the particular JDBC database driver to make a connection, get the system catalogs, and if successful, mark the database service as "up". Since this requires a valid username and password that can access the database, it is not the default class, but it is pretty simple to set up.

In order to automatically detect and monitor databases, a few changes need to be made to both the network and the OpenNMS configuration. First, be sure that the username and password you plan to use actually work from the OpenNMS server. This will involve changes to pg_hba.conf for PostgreSQL, and I am not sure about others.

Second, you will need to ensure that you have a jar file with the JDBC driver for your particular database. Copy it to $OPENNMS_HOME/lib (the one for PostgreSQL is already included).

Okay, now you need to modify the capsd configuration to discover the service and modify the poller configuration to poll the service.

capsd: Here's an example for Sybase:

<protocol-plugin protocol="Sybase-JDBC" class-name="org.opennms.netmgt.capsd.JDBCPlugin" scan="on">
        <property key="user" value="sa"/>
        <property key="password" value="XXXX"/>
        <property key="retry" value="3"/>
        <property key="timeout" value="5000"/>
        <property key="driver" value="com.sybase.jdbc2.jdbc.SybDriver"/>
        <!-- jdbc:sybase:Tds::/ -->
        <property key="url" value="jdbc:sybase:Tds:OPENNMS_JDBC_HOSTNAME:4100/tempdb"/>
</protocol-plugin>

and one for MySQL:

<protocol-plugin protocol="MySQL-JDBC" class-name="org.opennms.netmgt.capsd.JDBCPlugin" scan="on">
        <property key="user" value="root"/>
        <property key="password" value="XXXX"/>
        <property key="retry" value="3"/>
        <property key="timeout" value="5000"/>
        <property key="driver" value="org.gjt.mm.mysql.Driver"/>
        <!-- jdbc:mysql://[<:3306>]/ -->
        <property key="url" value="jdbc:mysql://OPENNMS_JDBC_HOSTNAME:3306/mysql"/>
</protocol-plugin>

and one for PostgreSQL:

<protocol-plugin protocol="PostgreSQL-JDBC" class-name="org.opennms.netmgt.capsd.JDBCPlugin" scan="on">
        <property key="user" value="opennms"/>
        <property key="password" value="opennms"/>
        <property key="retry" value="3"/>
        <property key="timeout" value="5000"/>
        <property key="driver" value="org.postgresql.Driver"/>
        <!-- jdbc:postgresql:[[:<5432>/]] -->
        <property key="url" value="jdbc:postgresql://OPENNMS_JDBC_HOSTNAME:5432/opennms"/> 
</protocol-plugin>

Note that the service names for all three of these examples have "-JDBC" added to the end of their names. This means you can run them separately from the standard database protocols, or if you like, you can completely replace the standard protocols. In fact, if you wish, you can use the standard port check in capsd, and then use the JDBC poller configuration to do the actual polling.

Here are the poller configuration examples:

    <service name="Sybase-JDBC" user-defined="false" interval="6000" status="on">
        <parameter key="user" value="sa"/>
        <parameter key="password" value="XXXX"/>
        <parameter key="timeout" value="3000"/>
        <parameter key="driver" value="com.sybase.jdbc2.jdbc.SybDriver"/>
        <!-- jdbc:sybase:Tds::/ -->
        <parameter key="url" value="jdbc:sybase:Tds:OPENNMS_JDBC_HOSTNAME:4100/tempdb"/>
    </service>
    <service name="MySQL-JDBC" user-defined="false" interval="6000" status="on">
        <parameter key="user" value="root"/>
        <parameter key="password" value="XXXX"/>
        <parameter key="timeout" value="3000"/>
        <parameter key="driver" value="org.gjt.mm.mysql.Driver"/>
        <!-- jdbc:mysql://[<:3306>]/ -->
        <parameter key="url" value="jdbc:mysql://OPENNMS_JDBC_HOSTNAME:3306/mysql"/>
    </service>
    <service name="PostgreSQL-JDBC" user-defined="false" interval="9000" status="on">
        <parameter key="user" value="opennms"/>
        <parameter key="password" value="opennms"/>
        <parameter key="timeout" value="9000"/>
        <parameter key="driver" value="org.postgresql.Driver"/>
        <!-- jdbc:postgresql:[[:<5432>/]] -->
        <parameter key="url" value="jdbc:postgresql://OPENNMS_JDBC_HOSTNAME:5432/opennms"/>
    </service>

One more thing: in the poller-configuration file, you'll need to add <monitor> tags at the bottom:

  <monitor service="Sybase-JDBC" class-name="org.opennms.netmgt.poller.JDBCMonitor"/>
  <monitor service="MySQL-JDBC" class-name="org.opennms.netmgt.poller.JDBCMonitor"/>
  <monitor service="PostgreSQL-JDBC" class-name="org.opennms.netmgt.poller.JDBCMonitor"/>

Hats off to Jose for this work.

General Purpose Script Poller: Bill Ayres has written the "General Purpose" or "Gp" poller, which executes a script and, based on the response from that script, marks the service as being "up" or "down". He has used it to monitor RADIUS servers, for example.

GpPlugin and GpMonitor work much like TcpPlugin and TcpMonitor in that you can use them to define as many custom services as you need, each with a unique service name.

GpPlugin and GpMonitor call an external script or program to test a particular service. The script will be passed the IP address of the interface OpenNMS is testing (as --hostname [IP Address]), followed by the timeout (as --timeout [timeout]), followed by any optional arguments that may need to be passed.

The script is expected to return a string as standard output which is then compared to the banner property or parameter to determine success or failure of the test.

The timeout is implemented in GpPlugin and GpMonitor. However, some scripts may want to know how long OpenNMS is going to wait for a reply, so the timeout value is passed to the script, and can be ignored by the script if it is not needed.

GpPlugin and GpMonitor also check the exit status of the script or program. If it is not zero, then the test fails. They will also gather and log any standard error output from the script, but the presence of error output does not prevent the test from succeeding if the banner matches the standard output.
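The rules above can be summarized in a short Python sketch. This is not the GpPlugin or GpMonitor code, just an illustration of the decision logic. The function name gp_check is invented for the example, the banner comparison is assumed to be a substring match, and the timeout is simply divided by 1000 and handed to subprocess, with zero treated as "no limit".

```python
import subprocess
import sys


def gp_check(command, ip_address, timeout_ms, banner, args=()):
    """Run an external check and return True (up) or False (down).

    `command` is the script to run, given as a list so that an
    interpreter can be prepended. Matching the banner as a substring
    of standard output is an assumption of this sketch.
    """
    argv = (
        list(command)
        + ["--hostname", ip_address, "--timeout", str(timeout_ms)]
        + list(args)
    )
    # Zero is treated as "no limit"; otherwise the caller enforces the timeout.
    timeout_s = timeout_ms / 1000.0 if timeout_ms > 0 else None
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    if result.stderr:
        # Error output is logged but does not fail the test by itself.
        print("script stderr:", result.stderr.strip(), file=sys.stderr)
    if result.returncode != 0:
        return False  # a non-zero exit status always fails the test
    return banner in result.stdout


# Stand-in for a real script: a Python one-liner that prints "success".
fake_script = [sys.executable, "-c", "print('success')"]
print(gp_check(fake_script, "10.1.1.1", 5000, "success"))
```

The stand-in script ignores the --hostname and --timeout arguments it receives, which is allowed: a script may use them or not, as it sees fit.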

Example poller parameters and plugin properties are shown below. In both cases, all of them are optional except script, which is required; an exception is logged if it is missing.

These programs use the exec method from Java's Runtime class. Exec is known to have pitfalls (see "When Runtime.exec() won't"). Also, exec doesn't have a built-in timeout feature. In deciding what to do about these shortcomings, Bill discovered that Scott McCrory had already addressed them with his ExecRunner class. ExecRunner and StreamGobbler are at Sourceforge as part of Spumoni.

One more word about the timeout. ExecRunner expects the timeout in integer seconds, not milliseconds, and a value of zero means wait indefinitely. To avoid confusion, Bill maintained the OpenNMS practice of specifying the timeout in milliseconds. Before being passed on to ExecRunner, it is converted to seconds as follows: zero remains zero, 1 through 1999 becomes 1 second, 2000 through 2999 becomes 2 seconds, 3000 through 3999 becomes 3 seconds, and so on.
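The conversion rule is small enough to write out. Here is a Python sketch of it; the function name is made up for this example, and rejecting negative values is an assumption, since the text does not say what happens to them.

```python
def timeout_ms_to_seconds(timeout_ms):
    """Convert a millisecond timeout to the whole seconds ExecRunner expects."""
    if timeout_ms < 0:
        # Not covered by the description above; rejected in this sketch.
        raise ValueError("timeout must not be negative")
    if timeout_ms == 0:
        return 0  # zero stays zero, meaning "wait indefinitely"
    # Round down to whole seconds, but never below 1 for a non-zero value.
    return max(1, timeout_ms // 1000)


for ms in (0, 1, 1999, 2000, 2999, 3000):
    print(ms, "->", timeout_ms_to_seconds(ms))
```

The max(1, ...) guard is what keeps a short timeout such as 500 milliseconds from turning into zero, which would otherwise mean "wait forever".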

Included in contrib is a simple perl test script, gptest.pl, that is handy for testing, since it's easy to edit and change its behaviour.

To implement Gp, add the following entries, substituting your information as needed.

For capsd configuration:

<protocol-plugin protocol="GPtest" class-name="org.opennms.netmgt.capsd.GpPlugin" scan="on" user-defined="true">
   <property key="script" value="/opt/OpenNMS/contrib/gptest.pl"/>
   <property key="banner" value="success"/>
   <property key="args" value="caps-arg1 caps-arg2"/>
   <property key="timeout" value="3000"/>
   <property key="retry" value="1"/>
</protocol-plugin>

And for poller configuration:

<service name="GPtest" interval="300000" user-defined="false" status="on">
   <parameter key="script" value="/opt/OpenNMS/contrib/gptest.pl"/>
   <parameter key="banner" value="successful"/>
   <parameter key="args" value="poll-arg1 poll-arg2"/>
   <parameter key="retry" value="1"/>
   <parameter key="timeout" value="2000"/>
   <parameter key="rrd-repository" value="/var/opennms/rrd/response"/>
   <parameter key="ds-name" value="GPtest"/>
</service>

and the monitor service entry:

<monitor service="GPtest" class-name="org.opennms.netmgt.poller.GpMonitor"/>

Hats off to Bill for this work, and to Scott for ExecRunner.

Script Daemon

Speaking of scripts, Jim Doble has written a daemon that will execute scripts based on events received or generated by OpenNMS, called ScriptD. This process, governed as usual from a configuration file, allows one to generally or specifically execute actions based on events in OpenNMS.

The scripting language, as I understand it, is BeanShell. As Jim writes: "You will notice that BeanShell is a lot like Java, but with some relaxed syntax. For example you don't have to define types for your variables, and attributes for which there are simple get methods can be accessed as properties (i.e. you can say event.uei or event.getUei() interchangeably)."

There are 4 types of scripts that can be run: start-script, reload-script, stop-script, and event-script. When ScriptD starts it will run all of the commands in the <start-script> tags. Likewise, when ScriptD stops, it will run all of the commands in the <stop-script> tags. Also, there is a new event, uei.opennms.org/internal/reloadScriptConfig, which when received will run all of the <reload-script> tags.

The final script type, <event-script>, gets run when events are received. Event scripts can have one or more UEI elements, which specify the UEIs for which that script should run. If no UEI element is present, the script will run for all events.
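The UEI rule for event scripts boils down to one test. The Python sketch below shows it; it is not ScriptD code, and the function name should_run is invented for this example.

```python
def should_run(script_ueis, event_uei):
    """Decide whether an event script applies to an event.

    An empty UEI list means the script runs for every event;
    otherwise the event's UEI must be one of those listed.
    """
    return not script_ueis or event_uei in script_ueis


service_ueis = [
    "uei.opennms.org/nodes/nodeLostService",
    "uei.opennms.org/nodes/nodeRegainedService",
]
print(should_run(service_ueis, "uei.opennms.org/nodes/nodeLostService"))  # True
print(should_run(service_ueis, "uei.opennms.org/nodes/nodeDown"))         # False
print(should_run([], "uei.opennms.org/nodes/nodeDown"))                   # True
```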

The scripts can make use of the SnmpTrapHelper class, which is a utility to make it easier to manipulate traps from a script.

There is an example scriptd-configuration.xml file included in the $OPENNMS_HOME/etc directory.

If you want to forward all SNMP traps to another machine as an SNMP trap, you would use the following event script:

        <event-script language="beanshell">

                event = bsf.lookupBean("event");

                if (event.snmp != null) {
                        log.debug("forwarding a trap");
                        snmpTrapHelper.forwardTrap(event, "10.1.1.1", 162);
                }

        </event-script>

This will forward the trap to 10.1.1.1, port 162. Note that the event will have SNMP information if the event is indeed an SNMP trap. Since internal OpenNMS events do not, you could use that to forward OpenNMS events as an SNMP trap to another system:

<event-script language="beanshell">

        event = bsf.lookupBean("event");

        if (event.snmp == null)
        {

                try {

                log.debug("Forwarding an OpenNMS event.");

                SnmpPduTrap trap = snmpTrapHelper.createV1Trap(".1.3.6.1.4.1.5813.1", "10.1.1.16", 6, 1, 0);

                t_dbid = new Integer(event.dbid).toString();
                if (t_dbid != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.1", "OctetString", "text", t_dbid);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.1", "OctetString", "text", "null");
                if (event.distPoller != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.2", "OctetString", "text", event.distPoller);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.2", "OctetString", "text", "null");
                if (event.creationTime != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.3", "OctetString", "text", event.creationTime);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.3", "OctetString", "text", "null");
                if (event.masterStation != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.4", "OctetString", "text", event.masterStation);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.4", "OctetString", "text", "null");
                if (event.uei != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.6", "OctetString", "text", event.uei);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.6", "OctetString", "text", "null");
                if (event.source != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.7", "OctetString", "text", event.source);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.7", "OctetString", "text", "null");
                t_nodeid = new Long(event.nodeid).toString();
                if (t_nodeid != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.8", "OctetString", "text", t_nodeid);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.8", "OctetString", "text", "null");
                if (event.time != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.9", "OctetString", "text", event.time);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.9", "OctetString", "text", "null");
                if (event.host != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.10", "OctetString", "text", event.host);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.10", "OctetString", "text", "null");
                t_interface = event.getInterface();
                if (t_interface != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.11", "OctetString", "text", t_interface);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.11", "OctetString", "text", "null");
                if (event.snmphost != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.12", "OctetString", "text", event.snmphost);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.12", "OctetString", "text", "forge.blast.com");
                if (event.service != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.13", "OctetString", "text", event.service);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.13", "OctetString", "text", "null");
                if (event.descr != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.16", "OctetString", "text", event.descr);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.16", "OctetString", "text", "null");
                if (event.severity != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.18", "OctetString", "text", event.severity);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.18", "OctetString", "text", "null");
                if (event.pathoutage != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.19", "OctetString", "text", event.pathoutage);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.19", "OctetString", "text", "null");
                if (event.operinstruct != null)
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.20", "OctetString", "text", event.operinstruct);
                else
                        snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.20", "OctetString", "text", "null");

                snmpTrapHelper.sendTrap("public", trap, "10.1.1.15", 162);

                }

                catch (e) {
                    sw = new StringWriter();
                    pw = new PrintWriter(sw);
                    e.printStackTrace(pw);
                    log.debug(sw.toString());
                }
        }

        </event-script>

This will send a newly defined OpenNMS trap with the important event information embedded as varbinds.

If you want to limit the forwarded OpenNMS events to nodeLostService and nodeRegainedService, you can add <uei> tags at the beginning of the <event-script> element:

        <event-script language="beanshell">
                <uei name="uei.opennms.org/nodes/nodeLostService"/>
                <uei name="uei.opennms.org/nodes/nodeRegainedService"/>

Hats off to Jim for this work.

Maps

Okay, let's get this out in the open. I don't care for maps in network management. Yes, they are nice looking, but truly useful maps cannot be automated, and the manual process of generating maps takes more time than they are worth.

That said, my opinions don't mean much in this project (grin) and if someone is willing to put in some work and write solid code I am more than willing to accept it. Thus, Derek Glidden decided to go and write a mapping system for OpenNMS.

This will display the nodes as icons, and the current availability is displayed in color underneath it. You can view it in a tree mode or just as a list of icons, and the image will automatically refresh. The parenting relationships have to be manually set.

The image is built and displayed using Scalable Vector Graphics (SVG). I think this is a great decision, but the downside is that the only SVG viewer I was able to get to work was from Adobe for Internet Explorer on Windows. I was not able to get SVG to work with Mozilla or Safari (on Mac). Using the system on IE was very clean and fast.

There is the option to convert the SVG image to a PNG image. This is extremely processor intensive, takes a long time on a network of any size, and often fails. It is not recommended.

Scared yet? (grin)

For these reasons I am treating the current map implementation as contributed code (i.e. not supported). It is hoped, however, that Derek and others will work to make more improvements to the system.

Okay, to get started, read the map.disable file in $OPENNMS_HOME/etc. You will need to copy this file to map.enable. This will add a "Map" menu item in the WebUI.

You will also have to make some changes to the tomcat4 configuration. First, you need to set headless equal to true, and second you should probably increase the memory available to Tomcat (especially if you are trying to use the SVG to PNG transcoder). OutOfMemory exceptions in Tomcat while rendering the map indicate that the memory setting is too small.

I know this whole thing sounds a bit negative, but that is no reflection on Derek's work. He wrote very clean code and I like the architecture (SVG especially) that he came up with. The icons are cool, too. So hats off to Derek.

But I am bracing myself for the onslaught of questions like "Can I add a background?", "Can I change the icons based on systemOID?", and "Can I make submaps?". Patience, please.

Nice Little Things

The following little changes and improvements have been made:

Bugs

As we move toward the next stable release of OpenNMS, a number of bugs have been fixed, including:

2.2. Changes in OpenNMS 1.1.1 and Above

The following features were added in 1.1.1:

Trap Handling

SNMP Traps will now be associated with nodes if the IP address in the trap matches a known IP address in the database.

If the IP Address is not known, OpenNMS will generate a newSuspect event to attempt to discover the device. This behavior can be disabled in the trapd-configuration.xml file.
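The two trap-handling rules can be pictured with the following Python sketch. It is an illustration, not trapd code: the function name, the dictionary standing in for the database, the node id and the tuple used to represent the newSuspect event are all hypothetical.

```python
def handle_trap(source_ip, known_interfaces, new_suspect_enabled=True):
    """Return (node_id, follow_up_event) for a trap sent from `source_ip`.

    `known_interfaces` maps IP address -> node id and stands in for the
    database lookup. The follow-up event is a simple placeholder tuple.
    """
    node_id = known_interfaces.get(source_ip)
    if node_id is not None:
        return node_id, None  # known address: associate the trap with the node
    if new_suspect_enabled:
        return None, ("newSuspect", source_ip)  # unknown: try to discover it
    return None, None  # unknown address and discovery disabled


known = {"10.1.1.1": 42}
print(handle_trap("10.1.1.1", known))
print(handle_trap("10.9.9.9", known))
print(handle_trap("10.9.9.9", known, new_suspect_enabled=False))
```

The new_suspect_enabled flag plays the role of the switch in trapd-configuration.xml mentioned above.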

Added new trap definitions for Dell OpenManage, Foundry Networks and ADIC. Also added an updated mib2opennms program which improves the look of the output.

Reports

Added a new custom reporting module which allows one to create and save custom performance reports. It is called the Key SNMP Custom (KSC) Reporting Tool.

Added buttons on the standard Performance and Response Time pages to allow the range to be changed between the last Day/Week/Month/Year.

Response Time

Added the ability to collect response time on the following pollers: Citrix, FTP, HTTPS, IMAP, POP3, SMTP and TCP.

The RRAs for Response Time data are now part of the poller configuration file.

Web Improvements

There is now a Response Time link on the node and interface pages.

If a node or interface supports HTTP, there is now a link to that service.

Added a two minute refresh to the event listing page.

Other Features

Added non-blocking I/O to the HTTPS service. Now all monitors and plug-ins should be non-blocking.

If you set the IP Address in a poll-outages calendar to "match-any" it will match all addresses in the poller package that uses that calendar.

Increased the size of the contactinfo field in the usersnotified table, and changed create.sql to make this easier.

Fixed Bugs

Fixed numerous bugs, including bug 650, where "down" events could be written to the database after the corresponding "up" event. See the CHANGELOG for a full list.

Tomcat4

For a variety of reasons, OpenNMS 1.1.1 and beyond will require Tomcat4 version 4.1.18 or higher.

2.3. Changes in OpenNMS 1.1.0 and Above

There were many changes to OpenNMS between 1.0 and 1.1. Here are a few listed by functional area.

Events and Event Handling

The events and notifications part of OpenNMS saw the most changes with 1.1.0. First, there was a new tag added to the eventconf.xml file called <event-file>. This allows for external files to be included in the event configuration.

Also, the order in which events appear is now strictly enforced. When trying to match an event with an event definition, OpenNMS takes the first match. The events in the eventconf.xml are read first, followed by the files identified by <event-file> tags (in the order in which they are listed). In the configuration that ships with OpenNMS, the file with the default events is loaded last. Be sure to add any custom files before that one.
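The ordering described above can be sketched as follows. Both file names are hypothetical (the actual name of the default events file depends on your installation); the point is only that a custom file is listed before the file holding the default events.

```xml
<events>
    <!-- Events defined inline in eventconf.xml are read first. -->

    <!-- Hypothetical custom file: listed before the defaults so its events match first. -->
    <event-file>events/MyCompany.events.xml</event-file>

    <!-- Hypothetical name for the default events file: keep it last. -->
    <event-file>events/default.events.xml</event-file>
</events>
```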

Prior to this release, the SNMP generic traps 0-5 (coldStart, warmStart, linkDown, etc.) were hard-coded. Now they must be defined (and that definition is included in the default events file), but this allows for generic traps other than type 6 to be configured differently for, say, different hosts.

Speaking of event files, over 2750 events were added out of the box, including those from vendors such as Cisco, HP and 3Com. Please let us know if anything is misconfigured or if we need to add some events.

The ability to configure events based on parameters (varbinds) was also added. This is best demonstrated with an example. In the new HP event definitions there is an event called hpicfFaultFinderTrap. It is defined as:

<event> 
    <mask>
        <maskelement> 
            <mename>id</mename>
            <mevalue>.1.3.6.1.4.1.11.2.14.12.1</mevalue>
        </maskelement> 
        <maskelement>
            <mename>generic</mename>
            <mevalue>6</mevalue>
        </maskelement>
        <maskelement> 
            <mename>specific</mename>
            <mevalue>5</mevalue> 
        </maskelement>
    </mask>
    <uei>uei.opennms.org/vendor/HP/traps/hpicfFaultFinderTrap</uei>
    <event-label>HP-ICF-FAULT-FINDER-MIB defined trap event: hpicfFaultFinderTrap</event-label> 
    <descr>
      <p>This notification is sent whenever the Fault Finder creates
      an entry in the hpicfFfLogTable.</p> 
      <table>
          <tr> 
          <td><b>hpicfFfLogFaultType</b></td> 
          <td>%parm[#1]%</td> 
          <td><p> badDriver(1) badXcvr(2)
              badCable(3) tooLongCable(4) overBandwidth(5) bcastStorm(6) partition(7)
              misconfiguredSQE(8) polarityReversal(9) networkLoop(10) lossOfLink(11)
              portSecurityViolation(12) backupLinkTransition(13) meshingFault(14)
              fanFault(15) rpsFault(16) stuck10MbFault(17) lossOfStackMember(18)
              hotSwapReboot(19) </p></td> 
          </tr> 
          <tr>
          <td><b> hpicfFfLogAction</b></td>
          <td>%parm[#2]% </td> 
          <td><p> none(1)
              warn(2) warnAndDisable(3) warnAndSpeedReduce(4)
              warnAndSpeedReduceAndDisable(5) </p></td>
          </tr>
          <tr> 
          <td><b>hpicfFfLogSeverity</b></td> 
          <td>%parm[#3]%</td> 
          <td><p> informational(1) medium(2)
              critical(3) </p></td>
          </tr> 
          <tr>
          <td><b> hpicfFfFaultInfoURL</b></td>
          <td>%parm[#4]%</td>
          <td><p></p></td>
          </tr> 
       </table> 
    </descr> 
    <logmsg dest='logndisplay'><p>HP Event: ICF Hub Fault Found.</p></logmsg>
    <severity>Warning</severity> 
</event>
      

Note that the third parameter denotes the severity of the event. By default this event has a severity of Warning, but suppose the "critical" case should have a severity of Major instead. This can be done with the new varbind extension to the mask tag:

<event> 
    <mask> 
        <maskelement>
            <mename>id</mename>
            <mevalue>.1.3.6.1.4.1.11.2.14.12.1</mevalue>
        </maskelement>
        <maskelement>
            <mename>generic</mename>
            <mevalue>6</mevalue> 
        </maskelement>
        <maskelement>
            <mename>specific</mename>
            <mevalue>5</mevalue> 
        </maskelement>
        <varbind>
            <vbnumber>3</vbnumber>
            <vbvalue>3</vbvalue>
        </varbind>
    </mask>
    <uei>uei.opennms.org/vendor/HP/traps/hpicfFaultFinderTrap</uei>
    <event-label>HP-ICF-FAULT-FINDER-MIB defined trap event: hpicfFaultFinderTrap</event-label>
    <descr>
        <p>This notification is sent whenever the Fault
           Finder creates an entry in the
           hpicfFfLogTable.</p>
        <table>
            <tr>
            <td><b>hpicfFfLogFaultType</b></td>
            <td>%parm[#1]%</td>
            <td><p> badDriver(1) badXcvr(2)
                badCable(3) tooLongCable(4) overBandwidth(5) bcastStorm(6) partition(7)
                misconfiguredSQE(8) polarityReversal(9) networkLoop(10) lossOfLink(11)
                portSecurityViolation(12) backupLinkTransition(13) meshingFault(14)
                fanFault(15) rpsFault(16) stuck10MbFault(17) lossOfStackMember(18)
                hotSwapReboot(19)</p></td>
            </tr>
            <tr>
            <td><b>hpicfFfLogAction</b></td>
            <td>%parm[#2]%</td>
            <td><p> none(1) warn(2) warnAndDisable(3)
                warnAndSpeedReduce(4) warnAndSpeedReduceAndDisable(5)</p></td>
            </tr>
            <tr>
            <td><b>hpicfFfLogSeverity</b></td>
            <td>%parm[#3]%</td>
            <td><p> informational(1) medium(2)
                critical(3)</p></td>
            </tr>
            <tr>
            <td><b>hpicfFfFaultInfoURL</b></td>
            <td>%parm[#4]%</td>
            <td><p></p></td>
            </tr>
        </table>
    </descr> 
    <logmsg dest='logndisplay'><p>HP Event: ICF Hub Fault Found.</p></logmsg>
    <severity>Major</severity>
</event>
      

Because this event is more specific, it must be added before the previous one. It will then match on the enterprise ID, the generic trap value of 6, the specific trap value of 5, and a value of 3 in the third parameter (varbind).

Low and high threshold rearm events were also added. When a threshold is exceeded in a number of consecutive polls equal to the trigger number, the threshold event is generated. Another event will not be generated until the polled value drops below the rearm number (for a high threshold; a low threshold rearms when the value rises above it). The rearm event is thus similar to a "cleared" event. Since the first parameter passed with the threshold event is the data source name, each data source can now have its own event by using the "varbind" tag described above.
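As a sketch of a per-data-source threshold event, the definition below matches only when the first parameter names one particular data source. The UEI, the use of a uei mask element, and the data source name cpuPercentBusy are all assumptions made for illustration; substitute the values used in your own configuration.

```xml
<event>
    <mask>
        <maskelement>
            <mename>uei</mename>
            <!-- assumed UEI for the high threshold event -->
            <mevalue>uei.opennms.org/threshold/highThresholdExceeded</mevalue>
        </maskelement>
        <!-- the first parameter of a threshold event is the data source name -->
        <varbind>
            <vbnumber>1</vbnumber>
            <!-- hypothetical data source name -->
            <vbvalue>cpuPercentBusy</vbvalue>
        </varbind>
    </mask>
    <uei>uei.opennms.org/threshold/highThresholdExceeded</uei>
    <event-label>High threshold exceeded: CPU busy</event-label>
    <descr><p>The high threshold for %parm[#1]% was exceeded.</p></descr>
    <logmsg dest='logndisplay'><p>High threshold exceeded for %parm[#1]%.</p></logmsg>
    <severity>Major</severity>
</event>
```

As with the HP example, a definition like this would need to appear before the more general threshold event so that it matches first.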

One of the more noticeable changes is that the Unique Event Identifier no longer contains "http://". The original intent was that the UEI would act something like an XML namespace, but in practice it is just a label, so the "http://" was removed to avoid confusion.

Notifications also received some attention with this release. Due to popular demand, the tags %nodelabel% and %interfaceresolve% are now available. The former will display the label of the nodeid associated with the event, and the latter will attempt to resolve the name associated with the IP Address of the interface of the event.

In notifd-configuration.xml there are now two new attributes. In the global properties, there is "match-all". By default, this is set to false, which means that the first notification that matches an event will be the only notification sent. If it is set to true, then all notifications that match a given event will be sent. (Thanks Nick) In the auto-acknowledge section, there is a new attribute called "clear". By adding "clear=true" to the auto-acknowledge tag, both the event being auto acknowledged and the event that caused the acknowledgement will be acknowledged. Thus the "up" event that clears a "down" will also be cleared.
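A minimal sketch of both attributes follows. Only match-all and clear come from the description above; the remaining attribute names, the UEIs and the match element are assumptions based on typical notifd-configuration.xml files, and other required attributes and elements are omitted.

```xml
<!-- match-all="true": send every notification that matches an event, not just the first -->
<notifd-configuration status="on" match-all="true">
    <!-- clear="true": acknowledge the "up" event as well as the "down" event it clears -->
    <auto-acknowledge uei="uei.opennms.org/nodes/nodeUp"
                      acknowledge="uei.opennms.org/nodes/nodeDown"
                      clear="true">
        <match>nodeid</match>
    </auto-acknowledge>
</notifd-configuration>
```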

In addition to these enhancements, various bugs were fixed. Notification rules now actually work, and you can filter node level events via IP address. Also, threshold events can now generate notifications.

Polling

The biggest change to polling would have to be the addition of response time information for DHCP, DNS, HTTP and ICMP based pollers. Similar to data collection, the response time information can be graphed and it can have threshold alarms placed on it.

Also, all of the plugins and monitors (except HTTPS) have been re-written to use the non-blocking I/O available in the 1.4 JDK.

Discovery

There has been some discussion on how OpenNMS determines node labels. Currently, this is set to the resolved SNMP Primary Interface IP Address. However, it is common practice on routers to have a software-loopback address. OpenNMS will now discover such interfaces (as long as they do not have an address that starts with 127) and mark them as the primary SNMP Interface. Note that no services will be polled on such interfaces.

The Web User Interface

A few changes were made to the WebUI. There is now a webui-colors.xml file that will allow for dynamic changes to the background colors used in the categories list on the main page (more pages to follow). Also under "Admin" the ability to delete nodes was added.

In addition, there is a new Admin page that will allow one to choose which non-IP interfaces will be used in data collection. By setting the snmpStorageFlag in datacollection-config.xml to "select" (now the default), OpenNMS will only store data from those interfaces that could serve as a primary SNMP interface. One can then select which other interfaces to collect on using the GUI. The previous values of snmpStorageFlag ("primary" and "all") still work.
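For illustration, the flag might be set on the collection element roughly as below. The element name and the collection name are assumptions; only the snmpStorageFlag attribute and its values ("select", "primary" and "all") are taken from the description above.

```xml
<!-- datacollection-config.xml (sketch; element layout assumed) -->
<snmp-collection name="default" snmpStorageFlag="select">
    <!-- RRD settings, groups and systems are omitted from this sketch -->
</snmp-collection>
```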

Also, the "Destination Path" interface now has the ability to choose NOT to include a service (thanks Nick), which will create a rule like "match the events where service is NOT FTP". In addition, placing the mouse over a category on the main page should now display the last time that category was updated.

Fixed Service Deletion in Downtime Model

The poller downtime model allows for a service to be deleted if it has been down for a certain amount of time. This did not work correctly and has been fixed.

Reduced the Amount of Data Initially Collected from the ifTable

During discovery, the ifTable is collected from each device that is found to support SNMP. On some HP switches, this would fail due to a limitation on the SNMP maximum packet size. All non-essential ifTable elements were removed from the request, which appears to resolve the problem.

Removed Spaces in Notification Path Names

Spaces in Notification Path names have been known to cause problems. The Web UI was modified to disallow spaces in path names. Bug 657.

Fixed the AM/PM Ordering on Performance Report UI

In the Custom Performance Report Web UI, 11 PM was followed by 12 PM, when it should have been 12 AM. This has been corrected. Bug 515.

Added a "contrib" Directory

The "contrib" directory now contains code, such as nifty utilities, that exists outside of the main OpenNMS source but may prove useful. One such example is Tomas Carlsson's "mib2opennms" program. These programs are not supported.

Removed Duplicate Entries in capsd-configuration.xml

Both LDAP and Citrix protocol plug-ins were listed twice. This would slow down the capabilities scan considerably.

Updated Data Collection and Graphing

Added new entries to datacollection-config.xml and snmp-graph.properties.

Bugfixes

Many bugs were fixed: threshold events can now generate notifications, AdminStatus and OperStatus values no longer cause exceptions, and problems with rescans of certain devices have been addressed.

2.4. Changes in OpenNMS 1.0.0 and Above

The following major changes occurred between 0.9.9 and 1.0.0:

OpenSSH service is now "SSH"

The OpenSSH service has been renamed to "SSH" and changed to detect common versions of SSH servers other than OpenSSH. Upgrades will retain the "OpenSSH" service as well for the sake of reports.

"Service Unresponsive" support

There is now the possibility of having a state between "up" and "down" that flags a service as being unresponsive. This state can be reached when the service's port can be connected to, but it doesn't respond in a reasonable amount of time.

Bugfixes

Many small bugfixes, including the "Calculating..." problem that occurred if RTC had not come up yet when Tomcat started.

3. Known Issues and Caveats
Known Problems And Workarounds In This Release

Here is the list of known issues in this release of OpenNMS.

3.2. New Requirements for Tomcat and PostgreSQL

OpenNMS 1.1.2 and later require, at a minimum, Tomcat 4.1.18 and PostgreSQL 7.2. OpenNMS no longer supplies "onms" versions of these applications and instead uses the main distributions from their maintainers.

Note that upgrading these programs is not simple.

Tomcat4

For Tomcat, the best thing to do is uninstall version 4.0 and then install 4.1. Version 4.1 is not seen as an upgrade to 4.0, but is instead seen as a separate product by rpm and apt. You will need to make the following changes to the tomcat4.conf file (located in /etc/tomcat4 on Red Hat):

# you could also override JAVA_HOME here
# Where your java installation lives
# JAVA_HOME="/usr/java/jdk"
# JAVA_HOME="/opt/IBMJava2-131"
JAVA_HOME="[location of your Java Home dir]"

# What user should run tomcat
TOMCAT_USER="root"

You do not have to run Tomcat as root if you change the permissions on $OPENNMS/logs and $OPENNMS/etc so that the Tomcat user can write to them.

PostgreSQL

Two main changes need to be made to the Postgres configuration in order to allow OpenNMS to access it properly. Postgres needs to have been started at least once to create the "data" directory that will contain the configuration files.

Edit postgresql.conf (located in /var/lib/pgsql/data on Red Hat) and ensure the following values exist:

 tcpip_socket = true 
 max_connections = 256
 shared_buffers = 1024 

Edit pg_hba.conf (host based authentication) to allow all users to access the database from the local host by un-commenting:

# TYPE  DATABASE    USER        IP-ADDRESS        IP-MASK           METHOD

local   all         all                                             trust
host    all         all         127.0.0.1         255.255.255.255   trust

and you may need to uncomment:

# Using sockets credentials for improved security. Not available everywhere,
# but works on Linux, *BSD (and probably some others)

# local  all    all             ident   sameuser

Note that this opens up Postgres to all users on the system (as long as they know the database password). Contact your database administrator if you want to limit this to a specific user, like root.

4. Supported Systems
Supported UNIX-like OSes

OpenNMS is written almost entirely in Java, and should be able to run on any system that supports the Java 1.4 Virtual Machine. There are requirements for other programs such as PostgreSQL, Perl, RRDTool and Tomcat4, but the 1.4 JDK is the key requirement (as most of the other packages can be compiled from source).

The following are the systems that support or are known to run OpenNMS.

4.1. Fully Supported

The following Linux distributions and other unix-like systems are supported out-of-the-box with native installation packages.

Red Hat Linux 7.x, Red Hat Linux 8, Red Hat Linux 9 and Red Hat Enterprise Linux 3

PostgreSQL 7.2 and later has shipped with Red Hat Linux since version 7.3. Be sure to follow the above instructions.

Fedora Core 1 and Fedora Core 2

OpenNMS is known to build and run on Fedora Core 1 and Fedora Core 2.

Debian Woody on Intel

Debian packages should be available on ftp.opennms.org, and at the following apt-repository:

deb http://debian.opennms.org/apt debian/opennms stable

Special Tomcat 4.1 packages were created, since only 4.0 is supported in stable. For instructions see this How-To written by Ian MacDonald.

Solaris 8 and Solaris 9 on x86 and SPARC

Packages are available at ftp.opennms.org for Solaris 8 and Solaris 9 running on SPARC and Solaris 9 running on x86.

Mandrake 8, Mandrake 9 and Mandrake 10

Please note that while we build packages for Mandrake 8.x, we do not do any formal testing on it. Packages are provided as a convenience.

SuSE 8 and SuSE 9

OpenNMS is known to build and run on SuSE.

MacOSX 10.2

On MacOSX, the Fink distribution packages of OpenNMS are supported. See the Fink web site for more information on installing and using Fink.

Also note that on MacOSX, PostgreSQL must be configured in the same manner as above for Linux. However, to do so you will need to update the SHM settings so that the OS allows enough resources for PostgreSQL to run with larger buffers.

To do so, you must edit /System/Library/StartupItems/SystemTuning/SystemTuning so that the sysctl lines look like so (at a minimum):

 sysctl -w kern.sysv.shmmax=16777216
 sysctl -w kern.sysv.shmmin=1
 sysctl -w kern.sysv.shmmni=128
 sysctl -w kern.sysv.shmseg=32
 sysctl -w kern.sysv.shmall=4096

4.2. Unsupported

The following UNIX systems are unsupported, but have been known to work.

Debian Woody on SPARC

OpenNMS is known to run on Debian on SPARC.

Red Hat 6.2

No special cases known.

Solaris 7

There have been reports on the discuss list of OpenNMS building and running on Solaris.