Halcyon PrimeAlert (R) ScriptRunner Script Pack
Version 1.0.2
For use with Sun Management Center
README
-----------------------------------------------------------------------
Included Components
===================
This distribution includes a single PrimeAlert product which augments
the functionality of PrimeAlert ScriptRunner:
PrimeAlert ScriptRunner Script Pack
Nomenclature
============
- 'Sun Management Center' was formerly called 'Sun Enterprise SyMON'.
This product only supports Sun Management Center 2.1 or higher.
Supported Platforms
===================
The following table summarizes the platform support for the various
components contained within this distribution:
Operating          Base               Agent  Server  Console
System             Product            Layer  Layer   Layer
-----------------  -----------------  -----  ------  -------
Solaris 2.5.1-2.8  SunMC 2.1.x/3.0      x
Informational Text Files
========================
A number of useful informational text files are included with this
distribution:
- README
- README.install
- ReleaseNotes
- LICENSE
- COPYRIGHT
These files are located in the base directory of the distribution
tarball, and after installation, in the directory:
/var/opt/SUNWsymon/install/HALUtilityScripts
Installation and Uninstallation Instructions
============================================
For installation or uninstallation instructions, refer to the file
README.install (see the above section 'Informational Text Files').
Troubleshooting
===============
The release notes file, ReleaseNotes (see the above section
'Informational Text Files') contains information about bug fixes,
enhancements, known problems, and upgrade strategy for the current
release.
HTML Help Documentation
=======================
To view the Help documentation, open the following file in the "doc"
subdirectory of the HALsuxscr_1.0.2 directory in the tar file
HALsuxscr_1.0.2.tar:
./doc/UtilityScripts/HALUtilityScripts-h.html
License
=======
This is FREE software. Please read the LICENSE file located in this
directory for more details.
If you wish to purchase a license to use PrimeAlert ScriptRunner,
please contact us at:
Halcyon Monitoring Solutions, Inc.
2300 Yonge Street
Suite 1801, Box 2419
Toronto, Ontario M4P 1E4
Canada
http://www.HalcyonInc.com
mailto:info@HalcyonInc.com
Tel: 416-932-4647
Fax: 416-932-4711
Overview
========
This software contains several scripts for use with PrimeAlert
ScriptRunner. These are useful scripts that monitor important services
and files, and issue alerts about problems. The scripts also serve
as good examples to demonstrate the integration of scripts using
PrimeAlert ScriptRunner.
The scripts are installed in /var/opt/SUNWsymon/bin.
Some of the scripts require 'expect'. Version 5.26.0 of expect is also
included in this package.
Integrating a Script with PrimeAlert ScriptRunner
=================================================
Each script header contains its usage, instructions and an example of
how to run it using PrimeAlert ScriptRunner.
To run the script using PrimeAlert ScriptRunner, follow these steps:
1. Select "Load Module..." from the Module Menu of the Host Details
window (also available from the host's Pop-up Menu).
2. Select PrimeAlert ScriptRunner from the Module Picklist and click on
the OK button. This will launch the Module Loader which contains the
following fields:
Instance
Enter a short, alphanumeric description that has not been
used previously for other instances of PrimeAlert
ScriptRunner currently loaded on the same agent. This
unique description is used to differentiate the various
instances of PrimeAlert ScriptRunner loaded on the same
agent. The description you enter here is appended to the
internal Module name described above.
For example, if the description "HALHTTPPing" is entered
by the user, this instance is appended to the internal
module name "HALScriptRunner" creating the following
unique instance name:
HALScriptRunner+HALHTTPPing
This instance name is useful in specifying alarm actions
using PrimeAlert EventAction.
Description
Enter a short, formal description of the module. This
description is used to identify the module in the console
window.
The short description you enter in this field will be
appended with the Module Name described above.
Script Name
Enter the name of the script, with a pathname relative to
"/var/opt/SUNWsymon/bin".
For security purposes, this script must be in the
directory mentioned above, and it must be owned by and
executable by user "root".
Script
Parameters
Enter the parameters (if any) to be passed to the script.
Script
Schedule
The script can be executed periodically. An example of
this is:
Cycle(5 Minutes)
To limit the period during which the script is executed,
use a time interval. An example is:
Hour >= 8 && Hour <= 17
You may use the Time Editor to create an advanced
periodic expression by clicking on the "Advanced..."
button. The following value for Script Schedule could be
used to execute the script every hour on weekends:
Day_of_week >= Saturday &&
Day_of_week <= Sunday &&
Cycle(1 hour)
3. After you have entered the required information, click on the OK
button to load the module.
The module does not contain any scan patterns when it is initially
loaded. To load a scan pattern, follow these steps:
1. Right-click on the header of the Pattern Table to access the Pop-up
Field Menu.
2. Select the "Add Row" menu option. This will launch an Add Row Editor
in which you are required to enter the following parameters:
Name: Enter an instance name of the pattern being loaded. This
instance name is used to distinguish one pattern from
another when specifying multiple patterns. Use a short,
unique, alphanumeric name.
Description: Enter a short description to be used to identify the
pattern in the Scan Results section.
Pattern: Enter the pattern to be used to scan the script's
standard output and standard error. This pattern may be
expressed using regular expressions. For a list of
recommended patterns, see the script headers. Run the
script on the command line to verify the output it
produces before adding a pattern to watch.
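One way to check a candidate pattern before adding it is to capture the
script's output on the command line and count the lines that match. The
following is a minimal POSIX shell sketch and is not part of the Script
Pack. The helper name count_matches is hypothetical, the sample output
is taken from the HALFTPPing.sh messages listed later in this file, and
grep -E only approximates whatever regular expression syntax PrimeAlert
ScriptRunner applies.

```shell
#!/bin/sh
# Hypothetical helper: count the lines of captured output that match a
# candidate pattern (an extended regular expression).
count_matches() {
    # $1: pattern  $2: captured output (stdout and stderr combined)
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # helper's exit status at zero while still printing the count.
    printf '%s\n' "$2" | grep -c -E -- "$1" || true
}

# On a real agent host the output could be captured like this, merging
# both streams because ScriptRunner scans stdout and stderr:
#   output=$(/var/opt/SUNWsymon/bin/HALFTPPing.sh ftp 2>&1)
# Here a canned sample is used instead so the sketch runs anywhere.
output="Warning: ftp didn't close connection properly.
Success: ftp FTP is alive."

for pattern in 'Failed.*:' 'Warning:' 'Success:'; do
    echo "$pattern -> $(count_matches "$pattern" "$output") matching line(s)"
done
```

A pattern that reports zero matching lines against output you expected
it to catch is worth correcting before it is added to the Pattern Table.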
You may launch an Attribute Editor to set Alarm Limits on each
individual pattern. To do this, follow these steps:
1. On the Matches Table, right click on the Current or Total column of
the appropriate row.
2. Select Attribute Editor from the pop-up menu.
3. Select the Alarms tab.
4. Enter the threshold value(s) desired. For a list of recommended
alarm thresholds, refer to the script headers.
The following general recommendations for setting alarms apply to all
scripts:
* If an alarm should result when a script emits a message, and
should cease when the script stops emitting the message, then set
an alarm threshold on the CURRENT matches column.
(Even when the alarm ceases, a history of alarms can still be
viewed from the alarms tab by selecting 'Show All'.)
* If an alarm should result when a script emits a message, and
should persist until matches are explicitly reset, then set an
alarm threshold on the TOTAL matches column.
* If an alarm should result when a script emits a message, and
should downgrade, but not cease, when the script stops emitting
the message, then set an alarm threshold on the CURRENT matches
column and a less severe alarm threshold on the TOTAL matches
column (e.g. an 'alert' threshold on CURRENT and a 'caution'
threshold on TOTAL).
* If a message is informational only and should not result in an
alarm, then set no alarm thresholds for the row.
* Often the threshold set will be '>0' (that is, any occurrence of
a message triggers an alarm), but it may sometimes be useful to
set higher thresholds (so that more than one occurrence is
required to trigger an alarm).
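The difference between the two columns can be illustrated with a small
POSIX shell sketch. It assumes that CURRENT counts the matching lines in
the most recent run of the script and that TOTAL accumulates across runs
until it is reset. That reading is inferred from the recommendations
above and is not a description of the module's internals. The function
name record_run and the variables current and total are hypothetical.

```shell
#!/bin/sh
# Assumed model: "current" holds the matches from the latest run only,
# "total" keeps growing until someone resets it.
total=0
current=0

record_run() {
    # $1: pattern  $2: output captured from one run of the script
    current=$(printf '%s\n' "$2" | grep -c -E -- "$1" || true)
    total=$((total + current))
}

# Run 1: the service is down, so the script emits a failure message.
record_run 'Failed:' 'Failed: Unknown host ftp'
echo "after run 1: CURRENT=$current TOTAL=$total"

# Run 2: the service has recovered, so nothing matches this time.
record_run 'Failed:' 'Success: ftp FTP is alive.'
echo "after run 2: CURRENT=$current TOTAL=$total"
```

Under this model a threshold on CURRENT clears after the second run,
while a threshold on TOTAL stays raised until the matches are reset.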
A brief summary of each script follows:
1) HALConfigCheck.sh
-----------------
This script audits for changes to key system configuration files (such
as passwd, hosts etc.) and alerts if there are any changes.
Usage: HALConfigCheck.sh <configfile>
where <configfile> is a configuration file located in
/var/opt/SUNWsymon/cfg.
The parameter must be specified relative to the directory
/var/opt/SUNWsymon/cfg.
The <configfile> must specify the <configroot> work directory as
follows:
CONFIGROOT=<full path to configroot directory>
The <configfile> may optionally specify the backup expiration period as
follows:
EXPIRE=<number of days>
If the expiration period is not specified, the value defaults to 20
days.
The <configfile> contents must be in Bourne shell format since it is
executed by this script. Key-values must be specified as Bourne shell
variables using '=' with no spaces (e.g. KEY=value). Comments can be
included by starting the line with a '#'.
The contents of an example <configfile> are shown below:
# Sample configuration file for HALConfigCheck.sh
CONFIGROOT=/export1/HALConfigCheck
EXPIRE=5
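Because the <configfile> is executed as Bourne shell, a script can read
it with the "." (dot) command and then apply the documented default for
EXPIRE. The following is a minimal POSIX shell sketch of that idea and
is not the code of HALConfigCheck.sh itself. The function name
load_config, the temporary file name, and the wording of the error
messages are all hypothetical.

```shell
#!/bin/sh
# Hypothetical loader for a HALConfigCheck-style configuration file.
load_config() {
    # $1: full path to the configuration file
    CONFIGROOT=
    EXPIRE=
    if [ ! -r "$1" ]; then
        echo "Error: cannot read $1"
        return 1
    fi
    . "$1"                      # executes the KEY=value assignments
    if [ -z "$CONFIGROOT" ]; then
        echo "Error: CONFIGROOT is not set in $1"
        return 1
    fi
    EXPIRE=${EXPIRE:-20}        # documented default of 20 days
}

# Demonstration with a throwaway file that omits EXPIRE.
cfg=${TMPDIR:-/tmp}/HALConfigCheck.example.$$
cat > "$cfg" <<'EOF'
# Sample configuration file for HALConfigCheck.sh
CONFIGROOT=/export1/HALConfigCheck
EOF
load_config "$cfg" && echo "CONFIGROOT=$CONFIGROOT EXPIRE=$EXPIRE"
rm -f "$cfg"
```

Sourcing the file is also why spaces around '=' are not allowed: the
shell would treat "KEY = value" as a command named KEY, not as an
assignment.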
The <configroot> directory named in the <configfile> must exist and
contain a file called 'Files' which lists the files (with full paths)
to be monitored. The filenames can be separated using newlines or
whitespace. The contents of an example <configroot>/Files are shown
below:
# Sample 'Files' for HALConfigCheck.sh
/etc/passwd
/etc/group
/etc/shadow
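A 'Files' list in this form can be turned into one path per line with a
short awk filter. The following is a minimal POSIX shell sketch under
two assumptions: that comment lines begin with '#' (as in the sample
above) and that monitored paths contain no whitespace. The function name
list_monitored_files and the temporary file name are hypothetical, and
the actual script may parse the list differently.

```shell
#!/bin/sh
# Hypothetical parser: print each monitored path on its own line,
# skipping comment lines and splitting on any run of spaces or tabs.
list_monitored_files() {
    # $1: path to the 'Files' list
    awk '!/^[ \t]*#/ { for (i = 1; i <= NF; i++) print $i }' "$1"
}

# Demonstration with a throwaway list that mixes both separators.
files=${TMPDIR:-/tmp}/HALConfigCheck.Files.$$
cat > "$files" <<'EOF'
# Sample 'Files' for HALConfigCheck.sh
/etc/passwd /etc/group
/etc/shadow
EOF
list_monitored_files "$files"
rm -f "$files"
```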
The <configroot> directory must be writable by the superuser of the
host on which the agent is running (note that the superuser typically
has restricted write privileges on NFS-mounted filesystems).
The <configroot> directory must have enough free disk space to
accommodate a copy of each monitored file as well as the compressed
backups. The required disk space depends on the size of the
monitored files, the frequency at which the files are checked, the
frequency at which the files change, and the backup expiration period.
Current copies of the monitored files are stored in the directory
<configroot>/mirror.
Backups are saved into the <configroot>/save directory. The backup
filenames are based on the date and time (i.e. YYYYMMDDHHMMSS.tar.Z)
when the backup was created.
Old backups expire after <expire> days (default is 20 days).
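The naming and expiration rules above can be sketched in a few lines of
POSIX shell. This is an illustration only: the function names
backup_name and list_expired are hypothetical, and selecting old backups
by file modification time with find -mtime is an assumption. The actual
script may instead work from the timestamp embedded in the filename.

```shell
#!/bin/sh
# Hypothetical helper: build a backup filename of the documented form
# YYYYMMDDHHMMSS.tar.Z from the current date and time.
backup_name() {
    date +%Y%m%d%H%M%S.tar.Z
}

# Hypothetical helper: list backups last modified more than N days ago.
list_expired() {
    # $1: save directory  $2: expiration period in days
    find "$1" -type f -name '*.tar.Z' -mtime +"$2" -print
}

echo "next backup would be named: $(backup_name)"

# Example use against a real work directory (path taken from the sample
# configuration file earlier in this section):
#   list_expired /export1/HALConfigCheck/save 20
```

Note that find -mtime +N selects files whose age, counted in whole days,
is greater than N, so a backup is listed only once it is more than N
full days old.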
Script Output:
* If a specified file does not exist:
"Doesn't exist: <file>"
* If a specified file is not readable:
"Not readable: <file>"
* If a specified file is created:
"Created: <file>"
* If a specified file is deleted:
"Deleted: <file>"
* If a specified file is changed:
"Changed: <file>"
* If a specified file is a directory:
"File doesn't exist: <file> is a directory"
* If any file is changed or deleted:
"Backup saved to: <file>"
* If an old backup is expired:
"Expired: <file>"
* If there is a configuration error:
"Error: <error message>"
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALConfigCheck.sh
Script Arguments: HALConfigCheck.dat
Execution schedule: cycle(1 hour)
* Patterns:
"exist:" caution on CURRENT > 0
"Not readable:" caution on CURRENT > 0
"Created:" (no alarm)
"Deleted:" critical on TOTAL > 0
"Changed:" alert on TOTAL > 0
"Expired:" (no alarm)
"saved to:" (no alarm)
"Error:" critical on TOTAL > 0
2) HALSecurityCheck.sh
-------------------
This script checks for basic security exposures such as blank passwords,
'+' in .rhosts, hosts.equiv etc. It checks all local users (i.e. users
in /etc/passwd, including root). Optionally, it also checks users listed
in NIS/NIS+ databases.
Warning messages are issued when a problem is detected. The problem
details are logged to a logfile accessible to the superuser only.
Usage: HALSecurityCheck.sh [ -local ]
The "-local" option suppresses checking of the NIS/NIS+ databases.
Problems are logged to a circular logfile named
/var/opt/SUNWsymon/log/HALSecurityCheck.log
To view the contents of the circular logfile, login as superuser and
run the command:
"/opt/SUNWsymon/sbin/es-run ccat /var/opt/SUNWsymon/log/HALSecurityCheck.log"
A list of hosts which are 'trusted' should be placed in a file named
/var/opt/SUNWsymon/cfg/hosts.trust; any rhost entries naming other than
these hosts are considered suspicious.
Output messages:
"Info: <log file problem>"
"Error: <configuration error message>"
"Security warning: check log file"
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALSecurityCheck.sh
Script Arguments: -local
Execution Schedule: cycle(1 day)
* Patterns:
"Security" critical on CURRENT > 0
"Error" critical on CURRENT > 0
"Info" caution on CURRENT > 0
3) HALDNSPing.sh
-------------
This script verifies that a user-specified DNS server is responding to
requests.
Usage: HALDNSPing.sh <DNS server> <DNS name to resolve>
Output messages:
See also nslookup(1)
* If server host is unknown:
"*** Can't find server address for '<DNS server>': Unknown host"
* If server is not responding:
"*** Can't find server name for address '<DNS IP>': No response from server"
* If name cannot be resolved:
"*** <DNS server> can't find <DNS name to resolve>: Non-existent host/domain"
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALDNSPing.sh
Script Arguments: ns1 www.halcyoninc.com
Execution Schedule: cycle(15 minutes)
* Patterns:
"[Cc]an't find" alert on CURRENT > 0
4) HALFTPPing.sh
-------------
This script monitors the availability of an FTP service. It contacts a
user-specified FTP server and attempts an anonymous login.
Usage: HALFTPPing.sh <host>
Output messages:
* Unable to resolve hostname:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:21 refused" or
"Failed: Unable to connect to <host>:21" or
"Failed: Connection to <host>:21 timed out"
* Unexpected connection close or timeout during session:
"Failed: Didn't receive greeting" or
"Failed: No response to USER anonymous" or
"Failed: No response to PASS command"
* Server-returned error response:
"Failed (<error code>): <error message>"
* Successful session:
"Success: <host> FTP is alive."
possibly preceded by:
"Warning: <host> didn't close connection properly."
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALFTPPing.sh
Script Arguments: ftp
Execution Schedule: cycle(15 minutes)
* Patterns:
"Failed.*:" alert on CURRENT > 0
"Warning:" caution on CURRENT > 0
"Success:" (no alarm)
5) HALHTTPPing.sh
--------------
This script monitors the availability of a Web service. It contacts a
user-specified HTTP server (which can be a proxy server) and
requests a user-specified page. The request is retried a certain number
of times if it times out.
Usage: HALHTTPPing.sh [ -port <port> ] [ -timeout <timeout> ]
[ -retries <retries> ] <host> <page>
To monitor http://host/page, it is recommended to not use a proxy if
possible. In this case, specify the host and page as is, i.e.
HALHTTPPing.sh host /page
To use a proxy host, specify page as the full URL, and host as the proxy
host, i.e.
HALHTTPPing.sh proxy http://host/page
The default timeout is 20 seconds, and the default number of retries is 3.
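The retry behaviour can be pictured as a bounded loop around a single
request attempt. The following is a minimal POSIX shell sketch, not the
code of HALHTTPPing.sh. The function name retry is hypothetical, every
failed attempt is treated as a timeout for simplicity, and the retry
value is interpreted as the total number of attempts, which this file
does not state explicitly.

```shell
#!/bin/sh
# Hypothetical retry wrapper: run a command up to a fixed number of
# times and stop at the first success.
retry() {
    # $1: maximum number of attempts; remaining arguments: the command
    max=$1
    shift
    n=1
    while [ "$n" -le "$max" ]; do
        "$@" && return 0
        n=$((n + 1))
    done
    # Message format taken from the output messages of this script.
    echo "Failed: Timed out after $max attempts."
    return 1
}

# "false" stands in for a request that never succeeds.
retry 3 false || true

# "true" stands in for a request that succeeds on the first attempt.
retry 3 true && echo "request succeeded"
```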
Output messages:
* Unable to resolve host:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:<port> refused" or
"Failed: Unable to connection to <host>:<port>"
* Timeout after maximum number of retries
"Failed: Timed out after <n> attempts."
* Invalid response from server
"Failed: No HTTP response header received."
* Server response received:
"Error (<code>): <message>" or
"Redirection (<code>): <message> <newlocation>" or
"Success (<code>): <message>"
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALHTTPPing.sh
Script Arguments: www.halcyoninc.com /symon/
Execution Schedule: cycle(15 minutes)
* Patterns:
"Failed:" alert on CURRENT > 0
"Error.*:" alert on CURRENT > 0
"Redirection.*:" caution on CURRENT > 0
"Success.*:" (no alarms)
6) HALNNTPPing.sh
--------------
This script monitors the availability of a news service. It contacts a
user-specified NNTP server and attempts to query the control newsgroup.
The control newsgroup is a special newsgroup to which news system
software posts cryptic Usenet control messages for other news system
software.
Usage: HALNNTPPing.sh <host>
Output messages:
* Unable to resolve hostname:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:119 refused" or
"Failed: Unable to connect to <host>:119" or
"Failed: Connection to <host>:119 timed out"
* Unexpected connection close or timeout during session:
"Failed: Didn't receive greeting" or
"Failed: No response to 'group control'"
* Server-returned error response:
"Failed (<error code>): <error message>"
* Successful session:
"Success: <host> NNTP is alive."
possibly preceded by:
"Warning: <host> didn't close connection properly."
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALNNTPPing.sh
Script Arguments: nntp
Execution Schedule: cycle(15 minutes)
* Patterns:
"Failed.*:" alert on CURRENT > 0
"Warning:" caution on CURRENT > 0
"Success:" (no alarm)
7) HALIMAPPing.sh
--------------
This script monitors the availability of an IMAP mail retrieval service.
It contacts a user-specified IMAP server and attempts a login.
Usage: HALIMAPPing.sh <host>
Output messages:
* Unable to resolve hostname:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:143 refused" or
"Failed: Unable to connect to <host>:143" or
"Failed: Connection to <host>:143 timed out"
* Unexpected connection close or timeout:
"Failed: Didn't receive greeting"
* If IMAP server aborts connection:
"Failed: IMAP aborted connection: <message>"
* If bogus login succeeds:
"Warning: bogus login succeeded."
* Successful session:
"Success: IMAP service is alive."
possibly preceded by
"Warning: IMAP didn't close connection properly."
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALIMAPPing.sh
Script Arguments: mail
Execution schedule: cycle(15 minutes)
* Patterns:
"Failed:" alert on CURRENT > 0
"Warning:" caution on CURRENT > 0
"Success:" (no alarm)
8) HALPOPPing.sh
-------------
This script monitors the availability of a mail retrieval service, using
the POP protocol.
Usage: HALPOPPing.sh <host>
Output messages:
* Unable to resolve hostname:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:110 refused" or
"Failed: Unable to connect to <host>:110" or
"Failed: Connection to <host>:110 timed out"
* Unexpected connection close or timeout:
"Failed: Didn't receive greeting" or
"Failed: No response to login"
* If bogus login succeeds:
"Warning: bogus login succeeded."
* Successful session:
"Success: POP service is alive."
possibly preceded by
"Warning: service didn't close connection properly."
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALPOPPing.sh
Script Arguments: mail
Execution schedule: cycle(15 minutes)
* Patterns:
"Failed:" alert on CURRENT > 0
"Warning:" caution on CURRENT > 0
"Success:" (no alarm)
9) HALSMTPPing.sh
--------------
This script monitors the availability of an SMTP mail delivery service.
It contacts a user-specified SMTP server, attempts to identify itself,
and quits without sending a message.
Usage: HALSMTPPing.sh <host>
Output messages:
* Unable to resolve hostname:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:25 refused" or
"Failed: Unable to connect to <host>:25" or
"Failed: Connection to <host>:25 timed out"
* Unexpected connection close or timeout during session:
"Failed: Didn't receive greeting" or
"Failed: No response to HELLO"
* Server-returned error response:
"Failed (<error code>): <error message>"
* Successful session:
"Success: <host> SMTP is alive."
possibly preceded by:
"Warning: <host> didn't close connection properly."
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALSMTPPing.sh
Script Arguments: smtp
Execution Schedule: cycle(15 minutes)
* Patterns:
"Failed.*:" alert on CURRENT > 0
"Warning:" caution on CURRENT > 0
"Success:" (no alarm)
10) HALModemTest.sh
---------------
This script monitors the availability of a dial-up login service. It
uses a local tip-configured modem to dial a user-specified number, and
attempts a UNIX login.
The dial-up retries a certain number of times if there is a busy signal.
Usage: HALModemTest.sh [-r <retries>] [-w <wait>] <modem> <phone number>
<modem> must be a device name ready for use by the tip command e.g.
/dev/cua/a.
The default number of retries is 3 and the default wait time between
successive retries is 3 minutes.
Output messages:
* Unable to dial-up:
"Failed: Couldn't connect to <modem>" or
"Failed: <modem> not responding" or
"Failed: <phone number> modem service not responding" or
"Failed: NO CARRIER for <phone number>"
* Busy signal:
"Busy: <phone number> modem is busy."
( script retries anyway )
"Failed: <phone number> modem is still busy after <n> attempts."
* Login sequence failure:
"Failed: Didn't receive login prompt"
"Failed: Didn't receive password prompt"
"Failed: Didn't receive 'login incorrect'"
* Success:
"Success: <phone number> modem service is alive"
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to execute: HALModemTest.sh
Script Arguments: /dev/cua/a 9,5551212
Execution Schedule: cycle(1 day)
* Patterns:
"Failed:" Alert on CURRENT > 0
"Busy:" Caution on CURRENT > 1
"Success:" (no alarm)
11) HALTelnetPing.sh
----------------
This script monitors the availability of a telnet login service. It
contacts a user-specified telnet server and attempts a UNIX login.
Usage: HALTelnetPing.sh <host>
Output messages:
* Unable to resolve hostname:
"Failed: Unknown host <host>"
* Unable to connect:
"Failed: Connection to <host>:23 refused" or
"Failed: Unable to connect to <host>:23" or
"Failed: Connection to <host>:23 timed out"
* Unexpected connection close or timeout:
"Failed: Didn't receive login prompt" or
"Failed: Didn't receive password prompt" or
"Failed: Didn't receive 'login incorrect'"
* Successful session:
"Success: telnet service is alive."
Possible PrimeAlert ScriptRunner instance:
* Load parameters:
Script to Execute: HALTelnetPing.sh
Script Arguments: localhost
Execution schedule: cycle(15 minutes)
* Patterns:
"Failed:" alert on CURRENT > 0
"Success:" (no alarm)
---//---