OpenNMS Release Notes
Preface
OpenNMS is the creation of numerous people and organizations, operating under the umbrella of the OpenNMS project. The original code base was developed and published under the GPL by the Oculan Corporation until 2002, when the project administration was passed on to Tarus Balog.
The current corporate sponsor of OpenNMS is Blast Internet Services, which also owns the OpenNMS trademark.
The OpenNMS Project strives to remain independent, and includes contributions from people outside of Blast. Please visit the OpenNMS website for more information.
OpenNMS is a derivative work, containing original code, included code, and modified code that was published under the GNU General Public License. Please see the source for detailed copyright notices; some notable copyright owners are listed below:
Copyright © 2002-2004 Blast Internet Services, Inc.
Original code base for OpenNMS version 1.0.0 © 1999-2001 Oculan Corporation.
Mapping code Copyright © 2003 Networked Knowledge Systems, Inc.
ScriptD code Copyright © 2003 Tavve Software Company.
1. Introduction
About This Release
OpenNMS 1.1.3 is a major milestone on the way to the next stable release, 1.2. It contains a number of improvements, especially under the covers, and is the first release created under the new build system.
OpenNMS 1.1.2 adds several new features and bug fixes. New features include a JDBC poller, a script-based poller and script-based event handler, as well as contributed support for a map.
OpenNMS 1.1.1 is the next step towards the production 1.2 release. It contains a number of new features and bug fixes.
OpenNMS 1.1 extends the work that was begun with 1.0 to make OpenNMS more powerful and easier to use. Almost all of the new functionality was suggested by current OpenNMS users. It is hoped that these improvements will prove useful, and will lead to even more suggestions on how to improve the product.
OpenNMS 1.0.2 is a maintenance release that fixes several code issues.
OpenNMS 1.0.1 is a maintenance release that fixes several code issues.
Please let us know about any problems you run into via the OpenNMS Bugzilla page.
2. What's New?
Changes in This OpenNMS
OpenNMS 1.1.0 represents a refinement of the functionality introduced in 1.0.0. The 1.1 tree is a development, or "unstable" tree, implementing a number of new features but without the testing that went into 1.0. When 1.1 is mature enough, it will become 1.2, the next production or "stable" release. This release, 1.1.3, is a major step toward that next stable release.
Changes in OpenNMS 1.1.3
A tremendous amount of work has been done "under the covers" to OpenNMS, and the following features were added in 1.1.3:
Support for Duplicate IP Addresses
Prior to this release, the algorithm that OpenNMS used to determine if a particular interface belonged to a particular node was simple. An SNMP walk was done on the device, and all of the IP addresses on that device were associated with the node. If that walk discovered a "duplicate" address, say from a private network or some backup link, it would assume that all of the addresses on that device belonged to the device that was discovered with that IP address first.
This could result in "merged" nodes, especially in environments with HSRP.
This release now supports duplicate IP addresses. The nodes will not be merged and an event will be generated.
Note that networks aren't supposed to have duplicate IP addresses. In other words, if there are two "10.1.1.1" addresses on a network, and OpenNMS sends a "ping" to 10.1.1.1, it will assume that a response means that interface 10.1.1.1 is "up", regardless of which "10.1.1.1" interface responds.
Since this feature was mainly written to support inactive or unreachable interfaces that were discovered by SNMP, this behavior shouldn't present a problem, although it does have the added benefit of being able to monitor highly available IP addresses.
For example, if your website lives at 10.1.1.1, which lives on two devices, as long as an HTTP request to 10.1.1.1 is answered (by either machine) OpenNMS will mark the service as up.
New Asset Configuration "Categories"
The rules used in categories and filters, usually along the lines of
<rule>IPADDR IPLIKE *.*.*.*</rule>
are actually quite flexible, and can be built on almost anything in the database. However, it would be nice to be able to place a particular device into a category easily, for display on the main page, notifications, etc.
There are four categories:
Display Category (database field displayCategory): This is to be used for grouping devices into a particular category.
Poller Category (database field pollerCategory): This is to be used to define devices in a particular poller package.
Notification Category (database field notifyCategory): This could be something like "serverAdmin" or "networkAdmin" to be used for directing notifications.
Threshold Category (database field thresholdCategory): This is to be used to define devices in a particular thresholding package.
Note that there is no "hard coded" meaning to these categories; you could use "poller" for "threshold", etc. They are just labeled for convenience.
How would you use them? Well, you would need to modify the <filter> or <rule> tags in the configuration files. Suppose you had two types of polling packages, like "Gold" and "Silver". You would then have a filter like <filter>pollerCategory == "Gold"</filter> for that package. By just adding the name "Gold" or "Silver" to the proper category on the asset screen you can place a particular device into that poller package.
Note that once you have sorted all of your devices, you will need to restart OpenNMS for the poller to reload the proper configuration.
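To make the effect of such a filter concrete, here is a minimal sketch in Java of what <filter>pollerCategory == "Gold"</filter> selects. It is not OpenNMS code: the class name, the method name and the sample node labels are invented for illustration, and exact, case-sensitive string matching is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: mimics the effect of a pollerCategory == "Gold" filter. */
final class CategoryFilterSketch {

    /** Returns the node labels whose pollerCategory equals the wanted value. */
    static List<String> nodesInCategory(Map<String, String> pollerCategoryByNode, String wanted) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : pollerCategoryByNode.entrySet()) {
            // Assumption: exact, case-sensitive comparison; unset categories never match.
            if (wanted.equals(entry.getValue())) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical asset data: node label -> pollerCategory from the asset screen.
        Map<String, String> assets = new LinkedHashMap<>();
        assets.put("core-router", "Gold");
        assets.put("lab-switch", "Silver");
        assets.put("web-01", "Gold");
        assets.put("printer", null); // no category entered
        System.out.println(nodesInCategory(assets, "Gold"));
    }
}
```

Moving a device from one package to the other then only requires changing the value typed into the asset field, not the filter itself.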
Added an XML RPC daemon
One user of OpenNMS has integrated it into their provisioning/billing/support package. They use multiple instances of OpenNMS to poll the services on their network, and all of these instances talk to a single database. By sending events to eventd, they can effect changes in how these devices are polled (without a restart).
In order to alert this system to events from OpenNMS, like "nodeLostService", we send events out via xmlrpcd.
Unfortunately, there isn't time to describe in detail how this system works, but it will be documented as soon as possible (the hope is by 1.2).
Support for Java 1.4.2
People who used previous versions of OpenNMS on Java 1.4.2 found out that it would use up all of the resources on the system and then die. This turned out to be due to a very obscure bug in Java. The code was re-written to avoid this and now we recommend that OpenNMS is run on 1.4.2.
MIB Compiler for Data Collection
John Rodriguez has created a great MIB Compiler to convert native MIB information into a format that can be used by datacollection-config.xml.
We hope to integrate it into the webUI in the future, but for now it is located in the contrib directory under mibparser.
In that directory is the complete code (in Java) as well as a helpful README. In a nutshell this is how you would use it.
Change into the dist directory and run the parseMib.sh wrapper script. The format is:
Usage: parseMib.sh <MIB File 1> [<MIB file 2>...]
Example: parseMib.sh RFC-1213.my
Thus:
$ ./parseMib.sh /usr/share/snmp/mibs/RFC1213-MIB.txt
Looking for a good java...
Using java in user's path...
Checking Java version for 1.4+...
Version is: java version "1.4.2_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
Java HotSpot(TM) Client VM (build 1.4.2_04-b05, mixed mode)
Checking for JAVA_HOME...
JAVA_HOME not set, trying to find it...
JAVA_HOME set to: .
Calling parser...
will generate output that is very familiar to people used to modifying datacollection-config.xml. For example:
<mibObj oid=".1.3.6.1.2.1.11.1" instance="0" alias="snmpInPkts" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.2" instance="0" alias="snmpOutPkts" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.3" instance="0" alias="snmpInBadVersions" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.4" instance="0" alias="snmpInBadCommunityNamesTOOLONG" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.5" instance="0" alias="snmpInBadCommunityUsesTOOLONG" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.6" instance="0" alias="snmpInASNParseErrs" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.8" instance="0" alias="snmpInTooBigs" type="Counter" />
<mibObj oid=".1.3.6.1.2.1.11.9" instance="0" alias="snmpInNoSuchNames" type="Counter" />
This could be put into a new MIB group, snmp-stats or some such, directly without having to explore the MIB by hand.
I love this app.
There are some caveats. In using this I have sometimes seen errors where the MIB compiler could not find a referenced variable because it is defined in another MIB file. Simply list it first in the list of MIBs to parse.
I have also come across some MIBs that define custom object types, and the parser doesn't handle them all that well. It is often possible just to delete the offending line from the MIB file (after making a copy, of course) and try again.
OpenNMS can only handle numeric data types, or DisplayStrings that can be converted into numbers, so keep that in mind when choosing which values to collect.
We use RRDTool, and RRDTool has a 19 character limit on filenames (the part before .rrd). Since the "alias" field becomes the file name, you cannot have an alias longer than 19 characters. The parser will append "TOOLONG" to overlength aliases, and you can edit them by hand (it would be possible to truncate the name, but you cannot have duplicate aliases and that might occur).
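After editing aliases by hand, a quick check for the two constraints just described (at most 19 characters, no duplicates) can save a debugging session. The following is a minimal sketch in Java; the class and method names are made up for illustration and are not part of the MIB parser or OpenNMS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative check of mibObj aliases before pasting them into datacollection-config.xml. */
final class AliasCheckSketch {

    /** The limit stated in these release notes. */
    static final int MAX_ALIAS_LENGTH = 19;

    /** Returns human-readable problems; an empty list means the aliases look usable. */
    static List<String> findProblems(List<String> aliases) {
        List<String> problems = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String alias : aliases) {
            if (alias.length() > MAX_ALIAS_LENGTH) {
                problems.add(alias + ": longer than " + MAX_ALIAS_LENGTH + " characters");
            }
            // Set.add returns false when the alias was already present.
            if (!seen.add(alias)) {
                problems.add(alias + ": duplicate alias");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> aliases = Arrays.asList(
                "snmpInPkts",
                "snmpInBadCommunityNamesTOOLONG", // as emitted by the parser, still too long
                "snmpInPkts");                    // accidental duplicate
        for (String problem : findProblems(aliases)) {
            System.out.println(problem);
        }
    }
}
```

Reporting both problems in one pass matters because shortening an overlength alias is exactly the step that tends to create a duplicate.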
Finally, OpenNMS can handle a numeric instance (0, 1, 2, ... etc.) or an instance of "ifIndex". So an instance of "tcpConnState" would cause an error.
Our goal is to make OpenNMS as pure Java as possible. However, for a variety of reasons we can't do that yet. When OpenNMS was started, ant (the program used to build other Java programs) had limitations that had to be worked around. This resulted in a workable, but somewhat confusing, build system.
DJ Gregor (building on work started by Edwin Buck) rebuilt the build system, making it almost pure ant. This was great for those doing development, so hats off to Deej and Edwin.
Mike Huot has written a new NTP poller. You'll notice it in capsd-configuration.xml and poller-configuration.xml. Hats off to Mike.
Nice Little Things
Small additions that deserve mention:
APC data collection was added.
Added "maxval" and "minval" attributes to the mibObj definition in datacollection-config.xml to help eliminate spikes.
Started improving start up times on large systems.
Added a sort to KSC reports.
Added an initial delay to notification paths.
We have lots of bug fixes. I'm really tired. Check out the CHANGELOG.
2.1. Changes in OpenNMS 1.1.2 and Above
The following features were added in 1.1.2:
Poller Improvements
There are three new pollers available:
Ssh: Previously, the SSH service was polled and discovered using the generic TCP class. This worked fine, except that SSH expects a version string to be sent with the query. Without one, each probe produces log entries, so the TCP class was adapted into an SSH class that sends the correct version string.
JDBC: The database pollers also use the TCP class to connect to well known ports. Jose Nunez Vicente Zuleta created a poller that uses the particular JDBC database driver to make a connection, get the system catalogs, and if successful, mark the database service as "up". Since this requires a valid username and password that can access the database, it is not the default class, but it is pretty simple to set up.
In order to automatically detect and monitor databases, a few changes need to be made to both the network and the OpenNMS configuration. First, be sure that the username and password you plan to use actually work from the OpenNMS server. This will involve changes to pg_hba.conf for PostgreSQL, and I am not sure about others.
Second, you will need to ensure that you have a jar file with the JDBC driver for your particular database. Copy it to $OPENNMS_HOME/lib (the one for PostgreSQL is already included).
Okay, now you need to modify the capsd configuration to discover the service and modify the poller configuration to poll the service.
capsd: Here's an example for Sybase:
<protocol-plugin protocol="Sybase-JDBC" class-name="org.opennms.netmgt.capsd.JDBCPlugin" scan="on">
    <property key="user" value="sa"/>
    <property key="password" value="XXXX"/>
    <property key="retry" value="3"/>
    <property key="timeout" value="5000"/>
    <property key="driver" value="com.sybase.jdbc2.jdbc.SybDriver"/>
    <!-- jdbc:sybase:Tds::/ -->
    <property key="url" value="jdbc:sybase:Tds:OPENNMS_JDBC_HOSTNAME:4100/tempdb"/>
</protocol-plugin>
and one for MySQL:
<protocol-plugin protocol="MySQL-JDBC" class-name="org.opennms.netmgt.capsd.JDBCPlugin" scan="on">
    <property key="user" value="root"/>
    <property key="password" value="XXXX"/>
    <property key="retry" value="3"/>
    <property key="timeout" value="5000"/>
    <property key="driver" value="org.gjt.mm.mysql.Driver"/>
    <!-- jdbc:mysql://[<:3306>]/ -->
    <property key="url" value="jdbc:mysql://OPENNMS_JDBC_HOSTNAME:3306/mysql"/>
</protocol-plugin>
and one for PostgreSQL:
<protocol-plugin protocol="PostgreSQL-JDBC" class-name="org.opennms.netmgt.capsd.JDBCPlugin" scan="on">
    <property key="user" value="opennms"/>
    <property key="password" value="opennms"/>
    <property key="retry" value="3"/>
    <property key="timeout" value="5000"/>
    <property key="driver" value="org.postgresql.Driver"/>
    <!-- jdbc:postgresql:[[:<5432>/]] -->
    <property key="url" value="jdbc:postgresql://OPENNMS_JDBC_HOSTNAME:5432/opennms"/>
</protocol-plugin>
Note that the service names for all three of these examples have "-JDBC" added to the end of their names. This means you can run them separately from the standard database protocols, or if you like, you can completely replace the standard protocols. In fact, if you wish, you can use the standard port check in capsd, and then use the JDBC poller configuration to do the actual polling.
Here are the poller configuration examples:
<service name="Sybase-JDBC" user-defined="false" interval="6000" status="on">
    <parameter key="user" value="sa"/>
    <parameter key="password" value="XXXX"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="driver" value="com.sybase.jdbc2.jdbc.SybDriver"/>
    <!-- jdbc:sybase:Tds::/ -->
    <parameter key="url" value="jdbc:sybase:Tds:OPENNMS_JDBC_HOSTNAME:4100/tempdb"/>
</service>
<service name="MySQL-JDBC" user-defined="false" interval="6000" status="on">
    <parameter key="user" value="root"/>
    <parameter key="password" value="XXXX"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="driver" value="org.gjt.mm.mysql.Driver"/>
    <!-- jdbc:mysql://[<:3306>]/ -->
    <parameter key="url" value="jdbc:mysql://OPENNMS_JDBC_HOSTNAME:3306/mysql"/>
</service>
<service name="PostgreSQL-JDBC" user-defined="false" interval="9000" status="on">
    <parameter key="user" value="opennms"/>
    <parameter key="password" value="opennms"/>
    <parameter key="timeout" value="9000"/>
    <parameter key="driver" value="org.postgresql.Driver"/>
    <!-- jdbc:postgresql:[[:<5432>/]] -->
    <parameter key="url" value="jdbc:postgresql://OPENNMS_JDBC_HOSTNAME:5432/opennms"/>
</service>
One more thing: in the poller configuration file, you'll need to add <monitor> tags at the bottom:
<monitor service="Sybase-JDBC" class-name="org.opennms.netmgt.poller.JDBCMonitor"/>
<monitor service="MySQL-JDBC" class-name="org.opennms.netmgt.poller.JDBCMonitor"/>
<monitor service="PostgreSQL-JDBC" class-name="org.opennms.netmgt.poller.JDBCMonitor"/>
Hats off to Jose for this work.
General Purpose Script Poller: Bill Ayres has written a poller, called the "General Purpose" or "Gp" Poller, that executes a script and marks the service as "up" or "down" based on the response from that script. He has used it to monitor RADIUS servers, for example.
GpPlugin and GpMonitor work much like TcpPlugin and TcpMonitor in that you can use them to define as many custom services as you need, each with a unique service name.
GpPlugin and GpMonitor call an external script or program to test a particular service. The script will be passed the IP address of the interface OpenNMS is testing (as --hostname [IP Address]), followed by the timeout (as --timeout [timeout]), followed by any optional arguments that may need to be passed.
The script is expected to return a string as standard output which is then compared to the banner property or parameter to determine success or failure of the test.
The timeout is implemented in GpPlugin and GpMonitor. However, some scripts may want to know how long OpenNMS is going to wait for a reply, so the timeout value is passed to the script, and can be ignored by the script if it is not needed.
GpPlugin and GpMonitor also check the exit status of the script or program. If it is not zero, then the test fails. They will also gather and log any standard error output from the script, but the presence of error output does not prevent the test from succeeding if the banner matches the standard output.
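The calling convention and the pass/fail rule described above can be summarized in a short Java sketch. This is not the GpPlugin or GpMonitor source: the class and method names are invented, and three details are assumptions rather than documented behavior, namely that the banner is matched as a substring of standard output, that a missing banner leaves the exit status as the only criterion, and that the timeout is handed to the script unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative model of how a Gp-style check invokes a script and judges its result. */
final class GpCheckSketch {

    /** Builds: script --hostname ip --timeout timeout [optional args...] */
    static List<String> buildCommand(String script, String ipAddress, int timeout, String args) {
        List<String> command = new ArrayList<>();
        command.add(script);
        command.add("--hostname");
        command.add(ipAddress);
        command.add("--timeout");
        command.add(Integer.toString(timeout)); // assumption: passed through as configured
        if (args != null && !args.trim().isEmpty()) {
            // Optional arguments come last, split on whitespace.
            command.addAll(Arrays.asList(args.trim().split("\\s+")));
        }
        return command;
    }

    /** Decides "up" or "down" from the script's exit status and output. */
    static boolean isUp(int exitStatus, String stdout, String stderr, String banner) {
        if (exitStatus != 0) {
            return false; // a non-zero exit status always fails the test
        }
        // stderr would be logged, but its presence does not fail the test.
        if (banner == null) {
            return true; // assumption: without a banner only the exit status counts
        }
        return stdout != null && stdout.contains(banner); // assumption: substring match
    }

    public static void main(String[] args) {
        System.out.println(buildCommand("/opt/OpenNMS/contrib/gptest.pl", "10.1.1.1", 3000, "arg1 arg2"));
        System.out.println(isUp(0, "success", "some warning", "success"));
        System.out.println(isUp(1, "success", "", "success"));
    }
}
```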
Example poller parameters are shown below. All of these are optional except script, which is required, and will cause an exception to be logged if it's missing.
Example plugin properties are also shown below. All of these are optional except script, which is required, and will cause an exception to be logged if it's missing.
These programs use the exec method from Java's Runtime class. Exec is known to have pitfalls (see "When Runtime.exec() won't"). Also, exec doesn't have a built-in timeout feature. In deciding what to do about these shortcomings, Bill discovered that Scott McCrory had already solved them with his ExecRunner class. ExecRunner and StreamGobbler are at Sourceforge as part of Spumoni.
One more word about the timeout. ExecRunner expects the timeout in integer seconds, not milliseconds, and a value of zero means wait indefinitely. To avoid confusion, Bill maintained the OpenNMS practice of specifying the timeout in milliseconds. Before passing it on to ExecRunner, it gets converted to seconds in the following manner: Zero remains zero, 1 thru 1999 gets converted to 1 second, 2000 thru 2999 -> 2 seconds, 3000 thru 3999 -> 3 seconds, etc.
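The conversion rule amounts to integer division by 1000 with a floor of one second for any non-zero value. A minimal Java sketch follows; the class and method names are hypothetical, and rejecting negative values is an assumption, since the text does not say how they are treated.

```java
/** Illustrative millisecond-to-second conversion following the rule described above. */
final class GpTimeoutSketch {

    /** Converts an OpenNMS-style timeout in milliseconds to whole seconds for ExecRunner. */
    static int toExecRunnerSeconds(int timeoutMillis) {
        if (timeoutMillis < 0) {
            // Assumption: negative timeouts are treated as a configuration error.
            throw new IllegalArgumentException("timeout must not be negative: " + timeoutMillis);
        }
        if (timeoutMillis == 0) {
            return 0; // zero stays zero, which means "wait indefinitely"
        }
        // 1..1999 -> 1, 2000..2999 -> 2, 3000..3999 -> 3, and so on.
        return Math.max(1, timeoutMillis / 1000);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 1999, 2000, 2999, 3000, 3999};
        for (int millis : samples) {
            System.out.println(millis + " ms -> " + toExecRunnerSeconds(millis) + " s");
        }
    }
}
```

The floor of one second is what keeps a small timeout such as 500 from turning into zero and therefore into an unbounded wait.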
Included in contrib is a simple Perl test script, gptest.pl, that is handy for testing, since it's easy to edit and change its behaviour.
To implement Gp, add the following entries, substituting your information as needed.
For capsd configuration:
<protocol-plugin protocol="GPtest" class-name="org.opennms.netmgt.capsd.GpPlugin" scan="on" user-defined="true">
    <property key="script" value="/opt/OpenNMS/contrib/gptest.pl"/>
    <property key="banner" value="success"/>
    <property key="args" value="caps-arg1 caps-arg2"/>
    <property key="timeout" value="3000"/>
    <property key="retry" value="1"/>
</protocol-plugin>
And for poller configuration:
<service name="GPtest" interval="300000" user-defined="false" status="on">
    <parameter key="script" value="/opt/OpenNMS/contrib/gptest.pl"/>
    <parameter key="banner" value="successful"/>
    <parameter key="args" value="poll-arg1 poll-arg2"/>
    <parameter key="retry" value="1"/>
    <parameter key="timeout" value="2000"/>
    <parameter key="rrd-repository" value="/var/opennms/rrd/response"/>
    <parameter key="ds-name" value="GPtest"/>
</service>
and the monitor service entry:
<monitor service="GPtest" class-name="org.opennms.netmgt.poller.GpMonitor"/>
Hats off to Bill for this work, and to Scott for ExecRunner.
Script Daemon
Speaking of scripts, Jim Doble has written a daemon, called ScriptD, that will execute scripts based on events received or generated by OpenNMS. This process, governed as usual by a configuration file, allows one to generally or specifically execute actions based on events in OpenNMS.
The scripting language, as I understand it, is BeanShell. As Jim writes: You will notice that BeanShell is a lot like Java, but with some relaxed syntax. For example, you don't have to define types for your variables, and attributes for which there are simple get methods can be accessed as properties (i.e. you can say event.uei or event.getUei() interchangeably).
There are four types of scripts that can be run: start-script, reload-script, stop-script, and event-script. When ScriptD starts it will run all of the commands in the <start-script> tags. Likewise, when ScriptD stops, it will run all of the commands in the <stop-script> tags. Also, there is a new event, uei.opennms.org/internal/reloadScriptConfig, which when received will run all of the <reload-script> tags.
The final script type, <event-script>, gets run when events are received. Event scripts can have one or more UEI elements, which specify the UEIs for which that script should run. If no UEI element is present, the script will run for all events.
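The selection rule for event scripts is simple enough to state as a single predicate. The Java sketch below is illustrative only: the class and method names are invented, and comparing UEIs as exact strings is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative model of when an event-script is run for an incoming event. */
final class EventScriptMatchSketch {

    /** Returns true if a script with the given UEI elements should run for this event. */
    static boolean shouldRun(List<String> scriptUeis, String eventUei) {
        if (scriptUeis == null || scriptUeis.isEmpty()) {
            return true; // no UEI element present: the script runs for all events
        }
        return scriptUeis.contains(eventUei); // assumption: exact string comparison
    }

    public static void main(String[] args) {
        List<String> serviceEvents = Arrays.asList(
                "uei.opennms.org/nodes/nodeLostService",
                "uei.opennms.org/nodes/nodeRegainedService");
        System.out.println(shouldRun(serviceEvents, "uei.opennms.org/nodes/nodeLostService"));
        System.out.println(shouldRun(serviceEvents, "uei.opennms.org/internal/reloadScriptConfig"));
        System.out.println(shouldRun(Collections.<String>emptyList(), "uei.opennms.org/nodes/nodeLostService"));
    }
}
```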
The scripts can make use of the SnmpTrapHelper class, which is a utility to make it easier to manipulate traps from a script.
There is an example scriptd-configuration.xml file included in the $OPENNMS_HOME/etc directory.
If you want to forward all SNMP traps to another machine as an SNMP trap, you would use the following event script:
<event-script language="beanshell">
    event = bsf.lookupBean("event");
    if (event.snmp != null) {
        log.debug("forwarding a trap");
        snmpTrapHelper.forwardTrap(event, "10.1.1.1", 162);
    }
</event-script>
This will forward the trap to 10.1.1.1, port 162. Note that the event will have SNMP information if the event is indeed an SNMP trap. Since internal OpenNMS events do not, you could use that to forward OpenNMS events as an SNMP trap to another system:
<event-script language="beanshell">
    event = bsf.lookupBean("event");
    if (event.snmp == null) {
        try {
            log.debug("Forwarding an OpenNMS event.");
            SnmpPduTrap trap = snmpTrapHelper.createV1Trap(".1.3.6.1.4.1.5813.1", "10.1.1.16", 6, 1, 0);
            t_dbid = new Integer(event.dbid).toString();
            if (t_dbid != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.1", "OctetString", "text", t_dbid);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.1", "OctetString", "text", "null");
            if (event.distPoller != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.2", "OctetString", "text", event.distPoller);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.2", "OctetString", "text", "null");
            if (event.creationTime != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.3", "OctetString", "text", event.creationTime);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.3", "OctetString", "text", "null");
            if (event.masterStation != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.4", "OctetString", "text", event.masterStation);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.4", "OctetString", "text", "null");
            if (event.uei != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.6", "OctetString", "text", event.uei);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.6", "OctetString", "text", "null");
            if (event.source != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.7", "OctetString", "text", event.source);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.7", "OctetString", "text", "null");
            t_nodeid = new Long(event.nodeid).toString();
            if (t_nodeid != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.8", "OctetString", "text", t_nodeid);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.8", "OctetString", "text", "null");
            if (event.time != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.9", "OctetString", "text", event.time);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.9", "OctetString", "text", "null");
            if (event.host != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.10", "OctetString", "text", event.host);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.10", "OctetString", "text", "null");
            t_interface = event.getInterface();
            if (t_interface != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.11", "OctetString", "text", t_interface);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.11", "OctetString", "text", "null");
            if (event.snmphost != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.12", "OctetString", "text", event.snmphost);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.12", "OctetString", "text", "forge.blast.com");
            if (event.service != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.13", "OctetString", "text", event.service);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.13", "OctetString", "text", "null");
            if (event.descr != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.16", "OctetString", "text", event.descr);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.16", "OctetString", "text", "null");
            if (event.severity != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.18", "OctetString", "text", event.severity);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.18", "OctetString", "text", "null");
            if (event.pathoutage != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.19", "OctetString", "text", event.pathoutage);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.19", "OctetString", "text", "null");
            if (event.operinstruct != null)
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.20", "OctetString", "text", event.operinstruct);
            else
                snmpTrapHelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.2.20", "OctetString", "text", "null");
            snmpTrapHelper.sendTrap("public", trap, "10.1.1.15", 162);
        } catch (e) {
            sw = new StringWriter();
            pw = new PrintWriter(sw);
            e.printStackTrace(pw);
            log.debug(sw.toString());
        }
    }
</event-script>
This will send a newly defined OpenNMS trap with the important event information embedded as varbinds.
If you wanted to limit the forwarded OpenNMS events to nodeLostService and nodeRegainedService, you can add <uei> tags to the first part of the <event-script> tag:
<event-script language="beanshell">
    <uei name="uei.opennms.org/nodes/nodeLostService"/>
    <uei name="uei.opennms.org/nodes/nodeRegainedService"/>
Hats off to Jim for this work.
Maps
Okay, let's get this out in the open. I don't care for maps in network management. Yes, they are nice looking, but truly useful maps cannot be automated, and the manual process of generating maps takes more time than they are worth.
That said, my opinions don't mean much in this project (grin) and if someone is willing to put in some work and write solid code I am more than willing to accept it. Thus, Derek Glidden decided to go and write a mapping system for OpenNMS.
This will display the nodes as icons, and the current availability is displayed in color underneath it. You can view it in a tree mode or just as a list of icons, and the image will automatically refresh. The parenting relationships have to be manually set.
The image is built and displayed using Scalable Vector Graphics (SVG). I think this is a great decision, but the downside is that the only SVG viewer I was able to get to work was from Adobe for Internet Explorer on Windows. I was not able to get SVG to work with Mozilla or Safari (on Mac). Using the system on IE was very clean and fast.
There is the option to convert the SVG image to a PNG image. This is extremely processor intensive, takes a long time on a network of any size, and often fails. It is not recommended.
Scared yet? (grin)
For these reasons I am treating the current map implementation as contributed code (i.e. not supported). It is hoped, however, that Derek and others will work to make more improvements to the system.
Okay, to get started, read the map.disable file in $OPENNMS_HOME/etc. You will need to copy this file to map.enable. This will add a "Map" menu item in the WebUI.
You will also have to make some changes to the tomcat4 configuration. First, you need to set headless equal to true, and second you should probably increase the memory available to Tomcat (especially if you are trying to use the SVG to PNG transcoder). OutOfMemory exceptions in Tomcat when trying to render the map indicate that the memory setting is too small.
I know this whole thing sounds a bit negative, but that is no reflection on Derek's work. He wrote very clean code and I like the architecture (SVG especially) that he came up with. The icons are cool, too. So hats off to Derek.
But I am bracing myself for the onslaught of questions like "Can I add a background?", "Can I change the icons based on systemOID?", and "Can I make submaps?". Patience, please.
Nice Little Things
The following little changes and improvements have been made:
Added RFC2325 to the data collection configuration
Added a "bits" report (to replace bytes) and made it the default report for KSC reports.
Added the ability to define a "null" filter (can speed up OpenNMS starting)
Added new Cisco and UCD-SNMP reports (Thanks Tony and Stuart)
Added new trap definitions for IBM and Intel
As we move toward the next stable release of OpenNMS, a number of bugs have been fixed, including:
Added a check to handle null terminated strings in traps (Thanks Dave W.)
Corrected issues with day/week/month/year buttons in WebUI on various browsers
Changed the open count in notifications to reflect those for the user instead of the system
Fixed a typo in mail.pl in the contrib directory
Fixed a bug in the HTTP and HTTPS monitors that could cause a ClassCastException (Thanks Jim)
Added an ORDER BY statement to ensure that categories reflect the correct values.
Added code to explicitly close sockets in plugins and monitors (see the Known Issues below for Java 1.4.2)
Bug 708: Fixed issues with viewing events when nodes are deleted
Bug 715: Added security roles to web.xml (Thanks DJ)
Bug 741: Fixed issues with the SNMP admin page and null issnmpprimary values.
Bug 748: Added code to catch rrdUpdate exceptions that could cause false nodeDown events
Bug 752: Fixed a bug that caused certain rules to match all events
2.2. Changes in OpenNMS 1.1.1 and Above
The following features were added in 1.1.1:
Trap Handling
SNMP Traps will now be associated with nodes if the IP address in the trap matches a known IP address in the database.
If the IP Address is not known, OpenNMS will generate a newSuspect event to attempt to discover the device. This behavior can be disabled in the trapd-configuration.xml file.
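The decision made for each incoming trap can be pictured with a small Java sketch. Nothing here is taken from the trapd source: the class, the method, the returned descriptions and the simple address-to-node map are all illustrative assumptions, and the boolean stands in for the setting in trapd-configuration.xml.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model of how a trap is tied to a node or triggers discovery. */
final class TrapAssociationSketch {

    /** Describes what happens to a trap arriving from the given IP address. */
    static String handleTrap(String trapIp, Map<String, Integer> nodeIdByIp, boolean newSuspectEnabled) {
        Integer nodeId = nodeIdByIp.get(trapIp);
        if (nodeId != null) {
            return "associate trap with node " + nodeId; // address is already in the database
        }
        if (newSuspectEnabled) {
            return "send newSuspect event for " + trapIp; // try to discover the unknown device
        }
        return "leave trap unassociated"; // discovery on traps has been disabled
    }

    public static void main(String[] args) {
        // Hypothetical database contents: interface address -> node id.
        Map<String, Integer> known = new HashMap<>();
        known.put("10.1.1.1", 42);
        System.out.println(handleTrap("10.1.1.1", known, true));
        System.out.println(handleTrap("10.9.9.9", known, true));
        System.out.println(handleTrap("10.9.9.9", known, false));
    }
}
```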
Added new trap definitions for Dell OpenManage, Foundry Networks and ADIC. Also added an updated mib2opennms program which improves the look of the output.
Reports
Added a new custom reporting module which allows one to create and save custom performance reports. It is called the Key SNMP Custom (KSC) Reporting Tool.
Added buttons on the standard Performance and Response Time pages to allow the range to be changed between the last Day/Week/Month/Year.
Response Time
Added the ability to collect response time on the following pollers: Citrix, FTP, HTTPS, IMAP, POP3, SMTP and TCP.
The RRAs for Response Time data are now part of the poller configuration file.
Web Improvements
There is now a Response Time link on the node and interface pages.
If a node or interface supports HTTP, there is now a link to that service.
Added a two minute refresh to the event listing page.
Other Features
Added non-blocking I/O to the HTTPS service. Now all monitors and plug-ins should be non-blocking.
If you set the IP Address in a poll-outages calendar to "match-any" it will match all addresses in the poller package that uses that calendar.
Increased the size of the contactinfo field in the usersnotified table, and changed create.sql to make this easier.
Fixed Bugs
Fixed numerous bugs, including bug 650, where "down" events could be written to the database after the corresponding "up" event. See the CHANGELOG for a full list.
Tomcat4
For a variety of reasons, OpenNMS 1.1.1 and beyond will require Tomcat4 version 4.1.18 or higher.
2.3. Changes in OpenNMS 1.1.0 and Above
There were many changes to OpenNMS between 1.0 and 1.1. Here are a few listed by functional area.
Events and Event Handling
The events and notifications part of OpenNMS saw the most changes with 1.1.0. First, a new tag called <event-file> was added to the eventconf.xml file. This allows external files to be included in the event configuration.
Also, the order in which events appear is now strictly enforced.
When trying to match an event with an event definition, OpenNMS takes the first match. The events in eventconf.xml are read first, followed by the files identified by <event-file> tags (in the order in which they are listed). In the configuration that ships with OpenNMS, the file with the default events is loaded last. Be sure to add any custom files before that one.
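To see why the order matters, here is a small Python sketch of first-match lookup over an ordered list of definitions. It is only an illustration under simplifying assumptions: the `EventDef` class, the helper names and the `uei.example.org` identifiers are hypothetical, and a mask is reduced to exact equality on a few fields.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EventDef:
    uei: str
    # Field name -> required value. An empty mask matches any event.
    mask: Dict[str, str] = field(default_factory=dict)


def load_order(main_defs: List[EventDef],
               included_files: List[List[EventDef]]) -> List[EventDef]:
    """Main file first, then each included file in the order it is listed."""
    ordered = list(main_defs)
    for file_defs in included_files:
        ordered.extend(file_defs)
    return ordered


def first_match(event: Dict[str, str],
                ordered: List[EventDef]) -> Optional[EventDef]:
    """Return the first definition whose mask fields all equal the event's."""
    for definition in ordered:
        if all(event.get(name) == value for name, value in definition.mask.items()):
            return definition
    return None


# Hypothetical definitions: a custom, more specific one and a generic default.
custom_file = [EventDef("uei.example.org/custom/linkDown",
                        {"generic": "2", "id": ".1.3.6.1.4.1.9"})]
default_file = [EventDef("uei.example.org/default/linkDown", {"generic": "2"})]
trap = {"id": ".1.3.6.1.4.1.9", "generic": "2"}

custom_first = first_match(trap, load_order([], [custom_file, default_file]))
default_first = first_match(trap, load_order([], [default_file, custom_file]))
print(custom_first.uei)   # the custom definition wins when its file is listed first
print(default_first.uei)  # the default shadows it when the default file comes first
```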
Prior to this release, the SNMP generic traps 0-5 (coldStart, warmStart, linkDown, etc.) were hard-coded. Now they must be defined (and that definition is included in the default events file), but this allows for generic traps other than type 6 to be configured differently for, say, different hosts.
Speaking of event files, over 2750 events were added out of the box, including those from vendors such as Cisco, HP and 3Com. Please let us know if anything is misconfigured or if we need to add some events.
The ability to configure events based on parameters (varbinds) was also added. This is best demonstrated with an example. In the new HP event definitions there is an event called hpicfFaultFinderTrap. It is defined as:
<event>
  <mask>
    <maskelement>
      <mename>id</mename>
      <mevalue>.1.3.6.1.4.1.11.2.14.12.1</mevalue>
    </maskelement>
    <maskelement>
      <mename>generic</mename>
      <mevalue>6</mevalue>
    </maskelement>
    <maskelement>
      <mename>specific</mename>
      <mevalue>5</mevalue>
    </maskelement>
  </mask>
  <uei>uei.opennms.org/vendor/HP/traps/hpicfFaultFinderTrap</uei>
  <event-label>HP-ICF-FAULT-FINDER-MIB defined trap event: hpicfFaultFinderTrap</event-label>
  <descr>
    <p>This notification is sent whenever the Fault Finder creates an entry in the hpicfFfLogTable.</p>
    <table>
      <tr>
        <td><b>hpicfFfLogFaultType</b></td>
        <td>%parm[#1]%</td>
        <td><p> badDriver(1) badXcvr(2) badCable(3) tooLongCable(4) overBandwidth(5) bcastStorm(6) partition(7) misconfiguredSQE(8) polarityReversal(9) networkLoop(10) lossOfLink(11) portSecurityViolation(12) backupLinkTransition(13) meshingFault(14) fanFault(15) rpsFault(16) stuck10MbFault(17) lossOfStackMember(18) hotSwapReboot(19) </p></td>
      </tr>
      <tr>
        <td><b>hpicfFfLogAction</b></td>
        <td>%parm[#2]%</td>
        <td><p> none(1) warn(2) warnAndDisable(3) warnAndSpeedReduce(4) warnAndSpeedReduceAndDisable(5) </p></td>
      </tr>
      <tr>
        <td><b>hpicfFfLogSeverity</b></td>
        <td>%parm[#3]%</td>
        <td><p> informational(1) medium(2) critical(3) </p></td>
      </tr>
      <tr>
        <td><b>hpicfFfFaultInfoURL</b></td>
        <td>%parm[#4]%</td>
        <td><p></p></td>
      </tr>
    </table>
  </descr>
  <logmsg dest='logndisplay'><p>HP Event: ICF Hub Fault Found.</p></logmsg>
  <severity>Warning</severity>
</event>
Note that the third parameter denotes the severity of the event. By default this event has a severity of Warning, but what if the "critical" event should have a severity of Major? This can be done using the new varbind extension to the mask tag:
<event>
  <mask>
    <maskelement>
      <mename>id</mename>
      <mevalue>.1.3.6.1.4.1.11.2.14.12.1</mevalue>
    </maskelement>
    <maskelement>
      <mename>generic</mename>
      <mevalue>6</mevalue>
    </maskelement>
    <maskelement>
      <mename>specific</mename>
      <mevalue>5</mevalue>
    </maskelement>
    <varbind>
      <vbnumber>3</vbnumber>
      <vbvalue>3</vbvalue>
    </varbind>
  </mask>
  <uei>uei.opennms.org/vendor/HP/traps/hpicfFaultFinderTrap</uei>
  <event-label>HP-ICF-FAULT-FINDER-MIB defined trap event: hpicfFaultFinderTrap</event-label>
  <descr>
    <p>This notification is sent whenever the Fault Finder creates an entry in the hpicfFfLogTable.</p>
    <table>
      <tr>
        <td><b>hpicfFfLogFaultType</b></td>
        <td>%parm[#1]%</td>
        <td><p> badDriver(1) badXcvr(2) badCable(3) tooLongCable(4) overBandwidth(5) bcastStorm(6) partition(7) misconfiguredSQE(8) polarityReversal(9) networkLoop(10) lossOfLink(11) portSecurityViolation(12) backupLinkTransition(13) meshingFault(14) fanFault(15) rpsFault(16) stuck10MbFault(17) lossOfStackMember(18) hotSwapReboot(19)</p></td>
      </tr>
      <tr>
        <td><b>hpicfFfLogAction</b></td>
        <td>%parm[#2]%</td>
        <td><p> none(1) warn(2) warnAndDisable(3) warnAndSpeedReduce(4) warnAndSpeedReduceAndDisable(5)</p></td>
      </tr>
      <tr>
        <td><b>hpicfFfLogSeverity</b></td>
        <td>%parm[#3]%</td>
        <td><p> informational(1) medium(2) critical(3)</p></td>
      </tr>
      <tr>
        <td><b>hpicfFfFaultInfoURL</b></td>
        <td>%parm[#4]%</td>
        <td><p></p></td>
      </tr>
    </table>
  </descr>
  <logmsg dest='logndisplay'><p>HP Event: ICF Hub Fault Found.</p></logmsg>
  <severity>Major</severity>
</event>
Because this definition is more specific, it must be added before the previous one. It will then try to match on the enterprise id, the generic trap value of 6, the specific trap value of 5 and a value of 3 in the third parameter, or varbind.
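The interaction between the mask, the varbind test and definition order can be sketched in Python. This is a simplified model rather than the OpenNMS matching code: the dictionary layout, the `matches` and `severity_for` helpers and the sample parameter values are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def matches(definition: dict, trap: dict) -> bool:
    """True if every mask element and every varbind test agrees with the trap."""
    mask_ok = all(trap.get(name) == value
                  for name, value in definition["mask"].items())
    parms: List[str] = trap.get("parms", [])
    # Varbind numbers are 1-based, like %parm[#1]% in the event description.
    varbinds_ok = all(
        number <= len(parms) and parms[number - 1] == value
        for number, value in definition.get("varbinds", {}).items()
    )
    return mask_ok and varbinds_ok


def severity_for(trap: dict, definitions: List[dict]) -> Optional[str]:
    """Severity of the first matching definition, or None if nothing matches."""
    for definition in definitions:
        if matches(definition, trap):
            return definition["severity"]
    return None


HP_MASK: Dict[str, str] = {"id": ".1.3.6.1.4.1.11.2.14.12.1",
                           "generic": "6", "specific": "5"}

# The more specific definition (third varbind equals 3) is listed first.
definitions = [
    {"mask": HP_MASK, "varbinds": {3: "3"}, "severity": "Major"},
    {"mask": HP_MASK, "severity": "Warning"},
]

# Hypothetical traps: the third parameter is hpicfFfLogSeverity.
critical_trap = dict(HP_MASK, parms=["11", "2", "3", ""])
medium_trap = dict(HP_MASK, parms=["11", "2", "2", ""])

print(severity_for(critical_trap, definitions))
print(severity_for(medium_trap, definitions))
# With the order reversed, the general definition shadows the specific one.
print(severity_for(critical_trap, list(reversed(definitions))))
```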
Low and high threshold rearm events were also added. When a threshold is exceeded in a number of consecutive polls equal to the trigger number, the threshold event is generated. Another event will not be generated until the polled value drops below the rearm number (or, for a low threshold, rises above it). The rearm event is thus similar to a "cleared" event. Since the first parameter passed with the threshold event is the data source name, using the "varbind" tag above, each data source can now have its own event.
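The trigger and rearm behavior for a high threshold can be modeled as a small state machine. The Python sketch below is an illustration only: the class name, the strict comparisons and the returned event labels are assumptions, not the names or exact rules used by OpenNMS.

```python
from typing import List, Optional


class HighThreshold:
    """Fires after `trigger` consecutive samples above `value`, then stays
    silent until a sample drops below `rearm`."""

    def __init__(self, value: float, rearm: float, trigger: int) -> None:
        self.value = value
        self.rearm = rearm
        self.trigger = trigger
        self.exceeded_count = 0
        self.armed = True

    def evaluate(self, sample: float) -> Optional[str]:
        if self.armed:
            if sample > self.value:
                self.exceeded_count += 1
                if self.exceeded_count >= self.trigger:
                    self.armed = False
                    self.exceeded_count = 0
                    return "exceeded"   # hypothetical label for the threshold event
            else:
                # A sample at or below the threshold breaks the consecutive run.
                self.exceeded_count = 0
        elif sample < self.rearm:
            self.armed = True
            return "rearmed"            # hypothetical label for the rearm event
        return None


threshold = HighThreshold(value=90, rearm=75, trigger=3)
samples = [95, 96, 97, 98, 80, 70, 95, 95, 95]
events: List[Optional[str]] = [threshold.evaluate(s) for s in samples]
print(events)
```

In this run the third sample fires the threshold event, the samples of 98 and 80 produce nothing because the threshold has not yet been rearmed, and the sample of 70 rearms it so that a later run of three high samples can fire again.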
One of the more noticeable changes is that the Unique Event Identifier no longer contains "http://". The original intent was that the UEI would act something like an XML namespace, but in practice it is just a label, so the "http://" was removed to avoid confusion.
Notifications also received some attention with this release. Due to popular demand, the tags %nodelabel% and %interfaceresolve% are now available. The former will display the label of the nodeid associated with the event, and the latter will attempt to resolve the name associated with the IP Address of the interface of the event.
In notifd-configuration.xml there are now two new attributes. In the global properties, there is "match-all". By default, this is set to false, which means that the first notification that matches an event will be the only notification sent. If it is set to true, then all notifications that match a given event will be sent. (Thanks Nick)
In the auto-acknowledge section, there is a new attribute called "clear". By adding "clear=true" to the auto-acknowledge tag, both the event being auto acknowledged and the event that caused the acknowledgement will be acknowledged. Thus the "up" event that clears a "down" will also be cleared.
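The effect of "match-all" can be illustrated with a short Python sketch. It assumes, for simplicity, that a notification matches an event purely by UEI; the function name, the notification names and the data layout are hypothetical and are not taken from notifd-configuration.xml.

```python
from typing import Dict, List


def notifications_to_send(event_uei: str,
                          notifications: List[Dict[str, str]],
                          match_all: bool = False) -> List[str]:
    """Names of the notifications to send for an event, in configured order."""
    matching = [n["name"] for n in notifications if n["uei"] == event_uei]
    # With match_all false, only the first matching notification is sent.
    return matching if match_all else matching[:1]


# Illustrative configuration, in the order the notifications are defined.
notifications = [
    {"name": "page-oncall", "uei": "uei.opennms.org/nodes/nodeDown"},
    {"name": "email-team", "uei": "uei.opennms.org/nodes/nodeDown"},
    {"name": "email-web", "uei": "uei.opennms.org/nodes/nodeLostService"},
]

print(notifications_to_send("uei.opennms.org/nodes/nodeDown", notifications))
print(notifications_to_send("uei.opennms.org/nodes/nodeDown", notifications,
                            match_all=True))
```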
In addition to these enhancements, various bugs were fixed. Notification rules now actually work, and you can filter node level events via IP address. Also, threshold events can now generate notifications.
Polling
The biggest change to polling would have to be the addition of response time information for DHCP, DNS, HTTP and ICMP based pollers. Similar to data collection, the response time information can be graphed and it can have threshold alarms placed on it.
Also, all of the plugins and monitors (except HTTPS) have been re-written to use the non-blocking I/O available in the 1.4 JDK.
Discovery
There has been some discussion on how OpenNMS determines node labels. Currently, this is set to the resolved SNMP Primary Interface IP Address. However, it is common practice on routers to have a software-loopback address. OpenNMS will now discover such interfaces (as long as they do not have an address that starts with 127) and mark them as the primary SNMP Interface. Note that no services will be polled on such interfaces.
The Web User Interface
A few changes were made to the WebUI. There is now a webui-colors.xml file that will allow for dynamic changes to the background colors used in the categories list on the main page (more pages to follow). Also under "Admin" the ability to delete nodes was added.
In addition, there is a new Admin page that will allow one to choose which non-IP interfaces will be used in data collection. By setting the snmpStorageFlag in datacollection-config.xml to "select" (now the default), OpenNMS will only store data from those interfaces that could serve as a primary SNMP interface. One can then select which other interfaces to collect on using the GUI. The previous values of snmpStorageFlag ("primary" and "all") still work.
Also, the "Destination Path" interface now has the ability to choose NOT to include a service (thanks Nick), which will create a rule like "match the events where service is NOT FTP". Finally, placing the mouse over a category on the main page should display the last time that category was updated.
Fixed Service Deletion in Downtime Model
The poller downtime model allows for a service to be deleted if it has been down for a certain amount of time. This did not work correctly and has been fixed.
Reduced the Amount of Data Initially Collected from the ifTable
During discovery, the ifTable is collected from each device that is found to support SNMP. On some HP switches, this would fail due to a limitation on the SNMP maximum packet size. All non-essential ifTable elements were removed from the request, which appears to resolve the problem.
Removed Spaces in Notification Path Names
Spaces in Notification Path names have been known to cause problems. The Web UI was modified to disallow spaces in path names. Bug 657.
Fixed the AM/PM Ordering on Performance Report UI
In the Custom Performance Report Web UI, 11 PM was followed by 12 PM, when it should have been 12 AM. This has been corrected. Bug 515.
Added a "contrib" Directory
The "contrib" directory now contains code, such as nifty utilities, that exists outside of the main OpenNMS source but may prove useful. One such example is Tomas Carlsson's "mib2opennms" program. These programs are not supported.
Removed Duplicate Entries in capsd-configuration.xml
Both LDAP and Citrix protocol plug-ins were listed twice. This would slow down the capabilities scan considerably.
Updated Data Collection and Graphing
Added new entries to datacollection-config.xml and snmp-graph.properties.
Many bugfixes, including allowing Threshold events to generate notifications, AdminStatus and OperStatus values causing exceptions, and rescans with certain devices.
2.4. Changes in OpenNMS 1.0.0 and Above
The following major changes occurred between 0.9.9 and 1.0.0:
OpenSSH service is now "SSH"
The OpenSSH service has been renamed to "SSH" and changed to detect common versions of SSH servers other than OpenSSH. Upgrades will retain the "OpenSSH" service as well for the sake of reports.
"Service Unresponsive" support
There is now the possibility of having a state between "up" and "down" that flags a service as being unresponsive. This state can be reached when the service's port can be connected to, but it doesn't respond in a reasonable amount of time.
Bugfixes
Many small bugfixes, including the "Calculating..." problem if RTC hasn't come up yet when tomcat starts.
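The three service states introduced with "Service Unresponsive" support can be summarized in a few lines of Python. This is a rough sketch under assumed inputs: the function name, its parameters and the idea of passing in a measured response time are hypothetical and do not describe how the OpenNMS monitors are written.

```python
from typing import Optional


def classify_service(connected: bool,
                     response_seconds: Optional[float],
                     timeout_seconds: float) -> str:
    """Classify a poll result as "up", "unresponsive" or "down".

    response_seconds is None when the port accepted the connection but
    no reply arrived at all.
    """
    if not connected:
        return "down"
    if response_seconds is None or response_seconds > timeout_seconds:
        # The port can be connected to, but the service did not answer in time.
        return "unresponsive"
    return "up"


print(classify_service(False, None, 3.0))
print(classify_service(True, None, 3.0))
print(classify_service(True, 0.2, 3.0))
```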
3. Known Issues and Caveats |
Known Problems And Workarounds In This Release |
Here is the list of known issues in this release of OpenNMS.
3.2. New Requirements for Tomcat and PostgreSQL
Version 1.1.2 and beyond of OpenNMS will require at a minimum Tomcat version 4.1.18 and PostgreSQL 7.2. OpenNMS will no longer supply "onms" versions of these applications, and instead will use main distributions from their maintainers.
Note that upgrading these programs is not simple.
Tomcat4
For Tomcat, the best thing to do is uninstall version 4.0 and then install 4.1. Version 4.1 is not seen as an upgrade to 4.0, but is instead seen as a separate product by rpm and apt. You will need to make the following changes to the tomcat4.conf file (located in /etc/tomcat4 on Red Hat):
# you could also override JAVA_HOME here
# Where your java installation lives
# JAVA_HOME="/usr/java/jdk"
# JAVA_HOME="/opt/IBMJava2-131"
JAVA_HOME="[location of your Java Home dir]"
# What user should run tomcat
TOMCAT_USER="root"
You do not have to run Tomcat as root if you change the permissions on $OPENNMS/logs and $OPENNMS/etc so that the Tomcat user can write to them.
Two main changes need to be made to the Postgres configuration in order to allow OpenNMS to access it properly. Postgres needs to have been started at least once to create the "data" directory that will contain the configuration files.
Edit postgresql.conf (located in /var/pgsql/data on Red Hat) and ensure the following values exist:
tcpip_socket = true
max_connections = 256
shared_buffers = 1024
Edit pg_hba.conf (host based authentication) to allow all users to access the database from the local host by un-commenting:
# TYPE  DATABASE  USER  IP-ADDRESS  IP-MASK          METHOD
local   all       all                                trust
host    all       all   127.0.0.1   255.255.255.255  trust
and you may need to uncomment:
# Using sockets credentials for improved security. Not available everywhere,
# but works on Linux, *BSD (and probably some others)
# local all all ident sameuser
Note that this opens up Postgres to all users on the system (as long as they know the database password). Contact your database administrator if you want to limit this to a specific user, like root.
4. Supported Systems |
Supported UNIX-like OSes |
OpenNMS is written almost entirely in Java, and should be able to run on any system that supports the Java 1.4 Virtual Machine. There are requirements for other programs such as PostgreSQL, Perl, RRDTool and Tomcat4, but the 1.4 JDK is the key requirement (as most of the other packages can be compiled from source).
The following are the systems that support or are known to run OpenNMS.
4.1. Fully Supported
The following Linux distributions and other unix-like systems are supported out-of-the-box with native installation packages.
Red Hat Linux 7.x, Red Hat Linux 8, Red Hat Linux 9 and Red Hat Enterprise Linux 3
PostgreSQL 7.2 and later has shipped with Red Hat Linux since version 7.3. Be sure to follow the above instructions.
Fedora Core 1 and Fedora Core 2
OpenNMS is known to build and run on Fedora Core 1 and Fedora Core 2.
Debian Woody on Intel
Debian packages should be available on ftp.opennms.org, and at the following apt-repository:
deb http://debian.opennms.org/apt debian/opennms stable
Special Tomcat 4.1 packages were created, since only 4.0 is supported in stable. For instructions see this How-To written by Ian MacDonald.
Solaris 8 and Solaris 9 on x86 and SPARC
Packages are available at ftp.opennms.org for Solaris 8 and Solaris 9 running on SPARC and Solaris 9 running on x86.
Mandrake 8, Mandrake 9 and Mandrake 10
Please note that while we build packages for Mandrake 8.x, we do not do any formal testing on it. Packages are provided as a convenience.
SuSE 8 and SuSE 9
OpenNMS is known to build and run on SuSE.
MacOSX 10.2
On MacOSX, the Fink distribution packages of OpenNMS are supported. See the Fink web site for more information on installing and using Fink.
Also note that on MacOSX, PostgreSQL must be configured in the same manner as above for Linux. However, to do so you will need to update the SHM settings so that the OS allows enough resources for PostgreSQL to run with larger buffers.
To do so, you must edit /System/Library/StartupItems/SystemTuning/SystemTuning so that the sysctl lines look like so (at a minimum):
sysctl -w kern.sysv.shmmax=16777216
sysctl -w kern.sysv.shmmin=1
sysctl -w kern.sysv.shmmni=128
sysctl -w kern.sysv.shmseg=32
sysctl -w kern.sysv.shmall=4096
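As a rough sanity check on these numbers, the arithmetic can be worked through in Python. Two values here are assumptions not stated in these notes: that kern.sysv.shmall is counted in 4096-byte pages, and that each PostgreSQL shared buffer is 8192 bytes (the usual default block size). PostgreSQL also needs some shared memory beyond its buffers, so fitting under shmmax is a necessary condition rather than a guarantee.

```python
PAGE_SIZE = 4096        # assumption: bytes per page counted by kern.sysv.shmall
PG_BLOCK_SIZE = 8192    # assumption: bytes per PostgreSQL shared buffer

shmmax = 16777216       # largest single shared memory segment, in bytes
shmall = 4096           # total shared memory allowed, in pages
shared_buffers = 1024   # value from postgresql.conf, in buffers

MIB = 1024 * 1024

buffers_bytes = shared_buffers * PG_BLOCK_SIZE
shmall_bytes = shmall * PAGE_SIZE

print("shmmax:         %d MiB" % (shmmax // MIB))
print("shmall:         %d MiB" % (shmall_bytes // MIB))
print("shared_buffers: %d MiB" % (buffers_bytes // MIB))
print("buffers fit under shmmax:", buffers_bytes < shmmax)
```

Under these assumptions the buffers take 8 MiB, which leaves headroom inside the 16 MiB allowed by both shmmax and shmall.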
4.2. Unsupported
The following UNIX systems are unsupported, but have been known to work.
Debian Woody on SPARC
OpenNMS is known to run on Debian on SPARC.
Red Hat 6.2
No special cases known.
Solaris 7
There have been reports on the discuss list of OpenNMS building and running on Solaris 7.