Halcyon PrimeAlert (R) ScriptRunner Script Pack
                              Version 1.0.2
                  For use with Sun Management Center

                                 README
-----------------------------------------------------------------------

Included Components
===================

This distribution includes a single PrimeAlert product which augments
the functionality of PrimeAlert ScriptRunner:

PrimeAlert ScriptRunner Script Pack


Nomenclature
============

- 'Sun Management Center' was formerly called 'Sun Enterprise SyMON'.
  This product only supports Sun Management Center 2.1 or higher.


Supported Platforms
===================

The following table summarizes the platform support for the various
components contained within this distribution:

Operating          Base               Agent  Server  Console
System             Product            Layer  Layer   Layer
-----------------  -----------------  -----  ------  -------
Solaris 2.5.1-2.8  SunMC 2.1.x/3.0      x


Informational Text Files
========================

A number of useful informational text files are included with this
distribution:

  - README
  - README.install
  - ReleaseNotes
  - LICENSE
  - COPYRIGHT

These files are located in the base directory of the distribution
tarball, and after installation, in the directory:

  /var/opt/SUNWsymon/install/HALUtilityScripts


Installation and Uninstallation Instructions
============================================

For installation or uninstallation instructions, refer to the file
README.install (see the above section 'Informational Text Files').


Troubleshooting
===============

The release notes file, ReleaseNotes (see the above section
'Informational Text Files') contains information about bug fixes,
enhancements, known problems, and upgrade strategy for the current
release.


HTML Help Documentation
=======================

To view the Help documentation, open the following file in the "doc"
subdirectory of the HALsuxscr_1.0.2 directory in the tar file
HALsuxscr_1.0.2.tar:

./doc/UtilityScripts/HALUtilityScripts-h.html


License
=======

This is FREE software. Please read the LICENSE file located in this
directory for more details.

If you wish to purchase a license to use PrimeAlert ScriptRunner,
please contact us at:

  Halcyon Monitoring Solutions, Inc.
  2300 Yonge Street
  Suite 1801, Box 2419
  Toronto, Ontario  M4P 1E4
  Canada

  http://www.HalcyonInc.com
  mailto:info@HalcyonInc.com
  Tel: 416-932-4647
  Fax: 416-932-4711


Overview
========

This software contains several scripts for use with PrimeAlert
ScriptRunner. The scripts monitor important services and files, and
issue alerts about problems. They also serve as examples of how to
integrate scripts with PrimeAlert ScriptRunner.

The scripts are installed in /var/opt/SUNWsymon/bin.

Some of the scripts require 'expect'. Version 5.26.0 of expect is also
included in this package.

 
Integrating a Script with PrimeAlert ScriptRunner
=================================================

Each script header contains its usage, instructions and an example of
how to run it using PrimeAlert ScriptRunner.

To run the script using PrimeAlert ScriptRunner, follow these steps:

1. Select "Load Module..." from the Module Menu of the Host Details
   window (also available from the host's Pop-up Menu).

2. Select PrimeAlert ScriptRunner from the Module Picklist and click on
   the OK button. This will launch the Module Loader which contains the
   following fields:

Instance
              Enter a short, alphanumeric description that has not been
              used previously for other instances of PrimeAlert
              ScriptRunner currently loaded on the same agent. This
              unique description is used to differentiate the various
              instances of PrimeAlert ScriptRunner loaded on the same
              agent. The description you enter here is appended to the
              internal Module name described above.

              For example, if the description "HALHTTPPing" is entered
              by the user, this instance is appended to the internal
              module name "HALScriptRunner" creating the following
              unique instance name:

              HALScriptRunner+HALHTTPPing

              This instance name is useful in specifying alarm actions
              using PrimeAlert EventAction.

Description
              Enter a short, formal description of the module. This
              description is used to identify the module in the console
              window.

              The short description you enter in this field will be
              appended with the Module Name described above.

Script Name
              Enter the name of the script, with a pathname relative to
              "/var/opt/SUNWsymon/bin". 

              For security purposes, this script must be in the
              directory named above, and it must be owned and
              executable by user "root".

Script
Parameters
              Enter the parameters (if any) to be passed to the script.

Script
Schedule
              The script can be executed periodically. An example for
              this is:

                  Cycle(5 Minutes)

              To limit the period during which the script is executed,
              use a time interval. An example is:

                  Hour >= 8 && Hour <= 17

              You may use the Time Editor to create an advanced
              periodic expression by clicking on the "Advanced..."
              button. The following value for Script Schedule could be
              used to execute the script every hour on weekends:

                  Day_of_week >= Saturday &&
                  Day_of_week <= Sunday &&
                  Cycle(1 hour)

3. After you have entered the required information, click on the OK
   button to load the module.

The module does not contain any scan patterns when it is initially
loaded. To load a scan pattern, follow these steps:

1. Right-click on the header of the Pattern Table to access the Pop-up
   Field Menu.

2. Select the "Add Row" menu option. This will launch an Add Row Editor
   in which you are required to enter the following parameters:

   Name:        Enter an instance name of the pattern being loaded. This
                instance name is used to distinguish one pattern from
                another when specifying multiple patterns. Use a short,
                unique, alphanumeric name.

   Description: Enter a short description to be used to identify the
                pattern in the Scan Results section.

   Pattern:     Enter the pattern to be used to scan the script's
                standard output and standard error. This pattern may be
                expressed using regular expressions. For a list of
                recommended patterns, see the script headers. Run the
                script on the command line to verify its output before
                adding a pattern to watch.
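
Before adding a pattern, it can help to try it against captured script
output. The following is a minimal sketch, assuming that 'grep -E'
approximates the regular expression matching performed by PrimeAlert
ScriptRunner (the exact dialect is not stated in this README).
count_matches is a hypothetical helper, and the sample line follows the
message format documented for HALDNSPing.sh later in this file.

```shell
#!/bin/sh
# count_matches PATTERN: print the number of stdin lines matching PATTERN.
# Hypothetical helper; grep -E is only assumed to approximate the pattern
# matching done by ScriptRunner.
count_matches() {
    grep -E -c -- "$1" || true   # grep exits non-zero when the count is 0
}

# Sample output in the format documented for HALDNSPing.sh.
sample_output="*** Can't find server address for 'ns1': Unknown host
Some unrelated line"

printf '%s\n' "$sample_output" | count_matches "[Cc]an't find"

# To check a real script, pipe its standard output and standard error
# through the helper, for example:
#   /var/opt/SUNWsymon/bin/HALDNSPing.sh ns1 www.halcyoninc.com 2>&1 |
#       count_matches "[Cc]an't find"
```

A count of zero for a pattern that should match suggests the pattern or
the script arguments need to be revisited before the row is added.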

You may launch an Attribute Editor to set Alarm Limits on each
individual pattern. To do this, follow these steps:

1. On the Matches Table, right click on the Current or Total column of
   the appropriate row.

2. Select Attribute Editor from the pop-up menu.

3. Select the Alarms tab.

4. Enter the threshold value(s) desired. For a list of recommended
   alarm thresholds, refer to the script headers.

The following general recommendations for setting alarms apply to all
scripts:

    * If an alarm should result when a script emits a message, and
      should cease when the script stops emitting the message, then set
      an alarm threshold on the CURRENT matches column.
      (Even when the alarm ceases, a history of alarms can still be
      viewed from the alarms tab by selecting 'Show All'.)

    * If an alarm should result when a script emits a message, and
      should persist until matches are explicitly reset, then set an
      alarm threshold on the TOTAL matches column.

    * If an alarm should result when a script emits a message, and
      should downgrade, but not cease, when the script stops emitting
      the message, then set an alarm threshold on the CURRENT matches
      column and a less severe alarm threshold on the TOTAL matches
      column (e.g. an 'alert' threshold on CURRENT and a 'caution'
      threshold on TOTAL)

    * If a message is informational only and should not result in an
      alarm, then set no alarm thresholds for the row.

    * Often the threshold set will be '>0' (that is, any occurrence of
      a message triggers an alarm), but it may sometimes be useful to
      set higher thresholds (so that more than one occurrence is
      required to trigger an alarm).
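
The difference between CURRENT and TOTAL thresholds can be illustrated
with a small simulation. This is a sketch only, assuming that CURRENT
counts the matches in the most recent run and that TOTAL accumulates
matches until it is explicitly reset, as the recommendations above
imply. The per-run counts are hypothetical data.

```shell
#!/bin/sh
# Simulated match counts for one pattern over four script runs
# (hypothetical data): the message appears once, in the second run.
runs="0 1 0 0"

total=0
for current in $runs; do
    total=$((total + current))

    # A '>0' threshold on each column, evaluated after every run.
    current_alarm=no
    total_alarm=no
    [ "$current" -gt 0 ] && current_alarm=yes
    [ "$total" -gt 0 ] && total_alarm=yes

    echo "current=$current total=$total" \
         "alarm_on_current=$current_alarm alarm_on_total=$total_alarm"
done
```

In this simulation the CURRENT alarm clears on the third run, while the
TOTAL alarm stays raised because nothing has reset the running total.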
 
A brief summary of each script follows:

1) HALConfigCheck.sh
   -----------------
This script audits key system configuration files (such as passwd and
hosts) and issues an alert when any of them changes.

Usage:  HALConfigCheck.sh <configfile> 

where <configfile> is a configuration file located in
/var/opt/SUNWsymon/cfg, specified relative to that directory.

The <configfile> must specify the <configroot> work directory as
follows:
    CONFIGROOT=<full path to configroot directory>

The <configfile> may optionally specify the backup expiration period as 
follows:
    EXPIRE=<number of days>

If the expiration period is not specified, the value defaults to 20
days.

The <configfile> contents must be in Bourne shell format since it is 
executed by this script. Key-values must be specified as Bourne shell 
variables using '=' with no spaces (e.g. KEY=value). Comments can be 
included by starting the line with a '#'.

The contents of an example <configfile> are shown below:

    # Sample configuration file for HALConfigCheck.sh
    CONFIGROOT=/export1/HALConfigCheck
    EXPIRE=5
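
Because the <configfile> is plain Bourne shell, it can be read by
sourcing it. The following is a minimal sketch of that idea, not the
code used by HALConfigCheck.sh. It writes a sample file that omits
EXPIRE in order to show the documented default of 20 days being
applied; the temporary file and the wording of the error message are
assumptions made for illustration.

```shell
#!/bin/sh
# Start from a clean slate so values inherited from the environment do
# not mask what the file sets.
unset CONFIGROOT EXPIRE

# Write a sample configuration to a temporary file (illustration only).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# Sample configuration file without an EXPIRE line
CONFIGROOT=/export1/HALConfigCheck
EOF

# Source the file so that its KEY=value lines become shell variables.
. "$cfg"
rm -f "$cfg"

# Apply the documented default when EXPIRE is not set in the file.
EXPIRE=${EXPIRE:-20}

if [ -z "${CONFIGROOT:-}" ]; then
    echo "Error: CONFIGROOT is not set"   # hypothetical message wording
else
    echo "CONFIGROOT=$CONFIGROOT EXPIRE=$EXPIRE"
fi
```

Sourcing also explains the 'no spaces' rule: a line such as
'KEY = value' would be run as a command named KEY rather than being
treated as an assignment.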

The <configroot> directory named in the <configfile> must exist and
contain a file called 'Files' which lists the files (with full paths)
to be monitored. The filenames can be separated by newlines or other
whitespace. The contents of an example <configroot>/Files are shown
below:

    # Sample 'Files' for HALConfigCheck.sh
    /etc/passwd
    /etc/group
    /etc/shadow
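
To preview which paths a 'Files' list names, the list can be split on
whitespace. The following is a minimal sketch, not the parser used by
HALConfigCheck.sh; list_files is a hypothetical helper, and treating
lines that start with '#' as comments is an assumption based on the
sample above.

```shell
#!/bin/sh
# list_files FILE: print one monitored path per line.
# Hypothetical helper; lines starting with '#' are assumed to be comments.
list_files() {
    awk '!/^[ \t]*#/ { for (i = 1; i <= NF; i++) print $i }' "$1"
}

# Demo with a temporary file (illustration only). Two names share a
# line to show that any whitespace separates entries.
files=$(mktemp)
cat > "$files" <<'EOF'
# Sample 'Files' for HALConfigCheck.sh
/etc/passwd /etc/group
/etc/shadow
EOF

list_files "$files"
rm -f "$files"
```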

The <configroot> directory must be writable by the superuser of the
host on which the agent is running (note that the superuser typically
has restricted write privileges on NFS-mounted filesystems).

The <configroot> directory must have enough free disk space to
accommodate a copy of each monitored file as well as the compressed
backups. The required disk space depends on the size of the
monitored files, the frequency at which the files are checked, the
frequency at which the files change, and the backup expiration period.

Current copies of the monitored files are stored in the directory
<configroot>/mirror.

Backups are saved into the <configroot>/save directory. The backup 
filenames are based on the date and time (i.e. YYYYMMDDHHMMSS.tar.Z) 
when the backup was created.

Old backups expire after <expire> days (default is 20 days).
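
The naming and expiration scheme can be sketched with standard tools.
This is an illustration under stated assumptions, not the code used by
HALConfigCheck.sh: backup_name and expired_backups are hypothetical
helpers, and measuring age with 'find -mtime' is an assumption about
how the expiration period is applied.

```shell
#!/bin/sh
# backup_name: print a name of the form YYYYMMDDHHMMSS.tar.Z for the
# current date and time.
backup_name() {
    echo "$(date +%Y%m%d%H%M%S).tar.Z"
}

# expired_backups DIR DAYS: list backups in DIR that were last modified
# more than DAYS days ago (find's '-mtime +N' semantics).
expired_backups() {
    find "$1" -type f -name '*.tar.Z' -mtime +"$2" -print
}

backup_name
```

For example, 'expired_backups <configroot>/save 20' would list the
candidates for removal under the default expiration period, which is
also a way to estimate how much space the save directory retains.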

Script Output:

* If a specified file does not exist:
       "Doesn't exist: <file>"
* If a specified file is not readable:
       "Not readable: <file>"
* If a specified file is created:
       "Created: <file>"
* If a specified file is deleted:
       "Deleted: <file>"
* If a specified file is changed:
       "Changed: <file>"
* If a specified file is a directory:
       "File doesn't exist: <file> is a directory"
* If any file is changed or deleted:
       "Backup saved to: <file>"
* If an old backup is expired:
       "Expired: <file>"
* If there is a configuration error:
       "Error: <error message>"

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
      Script to Execute: HALConfigCheck.sh
      Script Arguments: HALConfigCheck.dat
      Execution schedule: cycle(1 hour)

* Patterns:
      "exist:"        caution on CURRENT > 0
      "Not readable:" caution on CURRENT > 0
      "Created:"      (no alarm)
      "Deleted:"      critical on TOTAL > 0
      "Changed:"      alert on TOTAL > 0
      "Expired:"      (no alarm)  
      "saved to:"     (no alarm)
      "Error:"        critical on TOTAL > 0


2) HALSecurityCheck.sh
   -------------------
This script checks for basic security exposures such as blank passwords,
'+' in .rhosts, hosts.equiv etc. It checks all local users (i.e. users
in /etc/passwd, including root). Optionally, it also checks users listed
in NIS/NIS+ databases.

Warning messages are issued when a problem is detected. The problem
details are logged to a logfile accessible to the superuser only.

Usage:  HALSecurityCheck.sh [ -local ]

The "-local" option suppresses checking of the NIS/NIS+ databases.

Problems are logged to a circular logfile named 
/var/opt/SUNWsymon/log/HALSecurityCheck.log

To view the contents of the circular logfile, log in as superuser and
run the command:

"/opt/SUNWsymon/sbin/es-run ccat /var/opt/SUNWsymon/log/HALSecurityCheck.log"

A list of hosts which are 'trusted' should be placed in a file named
/var/opt/SUNWsymon/cfg/hosts.trust; any rhost entries naming hosts
other than these are considered suspicious.
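
The trusted-host comparison can be sketched as follows. This is an
illustration only, not the logic used by HALSecurityCheck.sh: it
assumes that hosts.trust lists one host name per line and that each
entry being checked begins with a host field, as in a .rhosts file.
suspicious_entries is a hypothetical helper.

```shell
#!/bin/sh
# suspicious_entries RHOSTS TRUST: print the lines of RHOSTS whose first
# field (the host) is '+' or is not listed in TRUST.
# Hypothetical helper; both file formats are assumptions.
suspicious_entries() {
    awk 'FILENAME == ARGV[1] { if (NF > 0) trusted[$1] = 1; next }
         NF > 0 && ($1 == "+" || !($1 in trusted)) { print }' "$2" "$1"
}

# Demo with temporary files (illustration only).
trust=$(mktemp)
rhosts=$(mktemp)
printf 'alpha\nbeta\n' > "$trust"
printf 'alpha root\n+ +\ngamma guest\n' > "$rhosts"

suspicious_entries "$rhosts" "$trust"
rm -f "$trust" "$rhosts"
```

In the demo, the entry for 'alpha' is accepted, while the '+' entry and
the entry for the unlisted host 'gamma' are printed.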

Output messages:
        "Info: <log file problem>"
        "Error: <configuration error message>"
        "Security warning: check log file"

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
       Script to Execute: HALSecurityCheck.sh
       Script Arguments: -local
       Execution Schedule: cycle(1 day)

* Patterns:
       "Security" critical on CURRENT > 0
       "Error" critical on CURRENT > 0
       "Info" caution on CURRENT > 0

3) HALDNSPing.sh
   -------------
This script verifies that a user-specified DNS server is responding to
requests.

Usage:  HALDNSPing.sh <DNS server> <DNS name to resolve>
 
Output messages (see also nslookup(1)):
* If server host is unknown:
       "*** Can't find server address for '<DNS server>': Unknown host"
* If server is not responding:
       "*** Can't find server name for address '<DNS IP>': No response from server"
* If name cannot be resolved:
       "*** <DNS server> can't find <DNS name to resolve>: Non-existent host/domain"

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
       Script to Execute: HALDNSPing.sh
       Script Arguments: ns1 www.halcyoninc.com
       Execution Schedule: cycle(15 minutes)

* Patterns:
       "[Cc]an't find"  alert on CURRENT > 0


4) HALFTPPing.sh
   -------------
This script monitors the availability of an FTP service. It contacts a
user-specified FTP server and attempts an anonymous login.

Usage:  HALFTPPing.sh <host>

Output messages:
* Unable to resolve hostname:
       "Failed: Unknown host <host>"
* Unable to connect:
       "Failed: Connection to <host>:21 refused" or
       "Failed: Unable to connect to <host>:21" or
       "Failed: Connection to <host>:21 timed out"
* Unexpected connection close or timeout during session:
       "Failed: Didn't receive greeting" or
       "Failed: No response to USER anonymous" or
       "Failed: No response to PASS command"
* Server-returned error response:
       "Failed (<error code>): <error message>"
* Successful session:
       "Success: <host> FTP is alive."
  possibly preceded by:
       "Warning: <host> didn't close connection properly."

Possible PrimeAlert ScriptRunner instance: 

* Load parameters:
       Script to Execute: HALFTPPing.sh
       Script Arguments: ftp
       Execution Schedule: cycle(15 minutes)

* Patterns:
       "Failed.*:" alert on CURRENT > 0
       "Warning:" caution on CURRENT > 0
       "Success:" (no alarm)

 
5) HALHTTPPing.sh
   --------------
This script monitors the availability of a Web service. It contacts a
user-specified HTTP server (which can be a proxy server) and
requests a user-specified page. The request is retried a certain number
of times if it times out.

Usage:  HALHTTPPing.sh [ -port <port> ] [ -timeout <timeout> ] 
                  [ -retries <retries> ] <host> <page>

To monitor http://host/page, avoid using a proxy if possible. In this
case, specify the host and page as is, i.e.
    HALHTTPPing.sh host /page

To use a proxy host, specify page as the full URL, and host as the proxy
host, i.e.
    HALHTTPPing.sh proxy http://host/page

The default timeout is 20 seconds, and the default number of retries
is 3.
 
Output messages:
* Unable to resolve host:
       "Failed: Unknown host <host>"
* Unable to connect:
       "Failed: Connection to <host>:<port> refused" or
       "Failed: Unable to connection to <host>:<port>"
* Timeout after maximum number of retries:
       "Failed: Timed out after <n> attempts."
* Invalid response from server:
       "Failed: No HTTP response header received."
* Server response received:
       "Error (<code>): <message>" or
       "Redirection (<code>): <message> <newlocation>" or
       "Success (<code>): <message>"

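The three server-response prefixes above correspond naturally to the
standard HTTP status code classes. The following is a minimal sketch of
such a mapping; how HALHTTPPing.sh itself classifies codes is not
stated in this README, so classify_status is a hypothetical helper and
the grouping (2xx, 3xx, 4xx/5xx) is an assumption based on the HTTP
specification.

```shell
#!/bin/sh
# classify_status CODE: print the message prefix that a three-digit
# HTTP status code would fall under, using the standard code classes.
# Hypothetical helper; the script's own mapping may differ.
classify_status() {
    case "$1" in
        2[0-9][0-9])    echo "Success" ;;
        3[0-9][0-9])    echo "Redirection" ;;
        [45][0-9][0-9]) echo "Error" ;;
        *)              echo "Unknown" ;;
    esac
}

classify_status 200
classify_status 302
classify_status 404
```

Under this assumption, a page that moves to a new location would start
matching the "Redirection.*:" pattern rather than "Success.*:".
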
Possible PrimeAlert ScriptRunner instance:

* Load parameters:
       Script to Execute: HALHTTPPing.sh
       Script Arguments: www.halcyoninc.com /symon/
       Execution Schedule: cycle(15 minutes)

* Patterns:
       "Failed:" alert on CURRENT > 0
       "Error.*:" alert on CURRENT > 0
       "Redirection.*:" caution on CURRENT > 0
       "Success.*:" (no alarms)

6) HALNNTPPing.sh
   --------------
This script monitors the availability of a news service. It contacts a
user-specified NNTP server and attempts to query the control newsgroup.
The control newsgroup is a special newsgroup on which news system
software posts cryptic Usenet control messages for other news system
software.

Usage:  HALNNTPPing.sh <host>
 
Output messages:
* Unable to resolve hostname:
       "Failed: Unknown host <host>"
* Unable to connect:
       "Failed: Connection to <host>:119 refused" or
       "Failed: Unable to connect to <host>:119" or
       "Failed: Connection to <host>:119 timed out"
* Unexpected connection close or timeout during session:
       "Failed: Didn't receive greeting" or
       "Failed: No response to 'group control'" 
* Server-returned error response:
       "Failed (<error code>): <error message>"
* Successful session:
       "Success: <host> NNTP is alive."
  possibly preceded by:
       "Warning: <host> didn't close connection properly."

Possible PrimeAlert ScriptRunner instance: 

* Load parameters:
       Script to Execute: HALNNTPPing.sh
       Script Arguments: nntp
       Execution Schedule: cycle(15 minutes)

* Patterns:
       "Failed.*:" alert on CURRENT > 0
       "Warning:" caution on CURRENT > 0
       "Success:" (no alarm)

7) HALIMAPPing.sh
   --------------
This script monitors the availability of an IMAP mail retrieval service.
It contacts a user-specified IMAP server and attempts a login.

Usage:  HALIMAPPing.sh <host>
 
Output messages:
* Unable to resolve hostname:
      "Failed: Unknown host <host>"
* Unable to connect:
      "Failed: Connection to <host>:143 refused" or
      "Failed: Unable to connect to <host>:143" or
      "Failed: Connection to <host>:143 timed out"
* Unexpected connection close or timeout:
      "Failed: Didn't receive greeting"
* If IMAP server aborts connection:
      "Failed: IMAP aborted connection: <message>"
* If bogus login succeeds:
      "Warning: bogus login succeeded."
* Successful session:
      "Success: IMAP service is alive."
  possibly preceded by
      "Warning: IMAP didn't close connection properly."

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
      Script to Execute: HALIMAPPing.sh
      Script Arguments: mail
      Execution schedule: cycle(15 minutes)

* Patterns:
      "Failed:" alert on CURRENT > 0
      "Warning:" caution on CURRENT > 0
      "Success:" (no alarm)

8) HALPOPPing.sh
   -------------
This script monitors the availability of a mail retrieval service, using
the POP protocol.

Usage:  HALPOPPing.sh <host>
 
Output messages:
* Unable to resolve hostname:
      "Failed: Unknown host <host>"
* Unable to connect:
      "Failed: Connection to <host>:110 refused" or
      "Failed: Unable to connect to <host>:110" or
      "Failed: Connection to <host>:110 timed out"
* Unexpected connection close or timeout:
      "Failed: Didn't receive greeting" or
      "Failed: No response to login"
* If bogus login succeeds:
      "Warning: bogus login succeeded."
* Successful session:
      "Success: POP service is alive."
  possibly preceded by
      "Warning: service didn't close connection properly."

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
      Script to Execute: HALPOPPing.sh
      Script Arguments: mail
      Execution schedule: cycle(15 minutes)

* Patterns:
      "Failed:" alert on CURRENT > 0
      "Warning:" caution on CURRENT > 0
      "Success:" (no alarm)

9) HALSMTPPing.sh
   --------------
This script monitors the availability of an SMTP mail delivery service.
It contacts a user-specified SMTP server, attempts to identify itself,
and quits without sending a message.

Usage:  HALSMTPPing.sh <host>
 
Output messages:
* Unable to resolve hostname:
       "Failed: Unknown host <host>"
* Unable to connect:
       "Failed: Connection to <host>:25 refused" or
       "Failed: Unable to connect to <host>:25" or
       "Failed: Connection to <host>:25 timed out"
* Unexpected connection close or timeout during session:
       "Failed: Didn't receive greeting" or
       "Failed: No response to HELLO" 
* Server-returned error response:
       "Failed (<error code>): <error message>"
* Successful session:
       "Success: <host> SMTP is alive."
  possibly preceded by:
       "Warning: <host> didn't close connection properly."

Possible PrimeAlert ScriptRunner instance: 

* Load parameters:
       Script to Execute: HALSMTPPing.sh
       Script Arguments: smtp
       Execution Schedule: cycle(15 minutes)

* Patterns:
       "Failed.*:" alert on CURRENT > 0
       "Warning:" caution on CURRENT > 0
       "Success:" (no alarm)

10) HALModemTest.sh
    ---------------    
This script monitors the availability of a dial-up login service. It
uses a local tip-configured modem to dial a user-specified number, and
attempts a UNIX login.

The dial-up retries a certain number of times if there is a busy signal.

Usage:  HALModemTest.sh [-r <retries>] [-w <wait>] <modem> <phone number>

<modem> must be a device name ready for use by the tip command e.g.
/dev/cua/a.

The default number of retries is 3 and the default wait time between
successive retries is 3 minutes.
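
The retry behaviour can be sketched as a generic loop. This is an
illustration only, not the code used by HALModemTest.sh: retry is a
hypothetical helper, a non-zero exit status stands in for a busy
signal, and treating the retry count as the maximum number of attempts
is an assumption. The documented defaults would correspond to
'retry 3 180 <command>'.

```shell
#!/bin/sh
# retry ATTEMPTS WAIT_SECONDS COMMAND...: run COMMAND until it succeeds
# or ATTEMPTS tries have been made, sleeping WAIT_SECONDS between tries.
# Hypothetical helper; the message wording is illustrative.
retry() {
    attempts=$1
    wait_seconds=$2
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            echo "Failed: still failing after $n attempts."
            return 1
        fi
        n=$((n + 1))
        sleep "$wait_seconds"
    done
}

# Demo: a stand-in command that always fails, with no wait between
# tries so the example finishes immediately.
retry 3 0 false || true
```

With a daily schedule and a three-minute wait, the worst case adds only
a few minutes to a run, which is why a "Busy:" threshold above 1 is a
reasonable way to ignore a single busy signal.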

Output messages:
* Unable to dial-up:
       "Failed: Couldn't connect to <modem>" or
       "Failed: <modem> not responding" or
       "Failed: <phone number> modem service not responding" or
       "Failed: NO CARRIER for <phone number>"
* Busy signal:
       "Busy: <phone number> modem is busy."
        ( script retries anyway )
       "Failed: <phone number> modem is still busy after <n> attempts."
* Login sequence failure:
       "Failed: Didn't receive login prompt"
       "Failed: Didn't receive password prompt"
       "Failed: Didn't receive 'login incorrect'"
* Success:
       "Success: <phone number> modem service is alive"

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
       Script to execute: HALModemTest.sh
       Script Arguments: /dev/cua/a 9,5551212 
       Execution Schedule: cycle(1 day)

* Patterns:
       "Failed:" Alert on CURRENT > 0
       "Busy:" Caution on CURRENT > 1
       "Success:" (no alarm)

11) HALTelnetPing.sh
    ---------------- 
This script monitors the availability of a telnet login service. It
contacts a user-specified telnet server and attempts a UNIX login.

Usage:  HALTelnetPing.sh <host>

Output messages:
* Unable to resolve hostname:
      "Failed: Unknown host <host>"
* Unable to connect:
      "Failed: Connection to <host>:23 refused" or
      "Failed: Unable to connect to <host>:23" or
      "Failed: Connection to <host>:23 timed out"
* Unexpected connection close or timeout:
      "Failed: Didn't receive login prompt" or
      "Failed: Didn't receive password prompt" or
      "Failed: Didn't receive 'login incorrect'"
* Successful session:
      "Success: telnet service is alive."

Possible PrimeAlert ScriptRunner instance:

* Load parameters:
      Script to Execute: HALTelnetPing.sh
      Script Arguments: localhost
      Execution schedule: cycle(15 minutes)

* Patterns:
      "Failed:" alert on CURRENT > 0
      "Success:" (no alarm)

---//---