NAME

monit - system for monitoring programs


SYNOPSIS

monit [options] {arguments}


DESCRIPTION

monit is a utility for monitoring and managing daemons or similar programs running on a Unix system; monit will start specified programs if they are not running and restart programs not responding.

The monit utility can run in a daemon mode to repeatedly poll one or more programs at a specified interval.


GENERAL OPERATION

The behavior of monit is controlled by command-line options and a run control file, ~/.monitrc, the syntax of which we describe in a later section. Command-line options override .monitrc declarations.

The following options are recognized by monit. It is recommended that you set the log and daemon options in the .monitrc control file.

General Options and Arguments

-c file Use this control file

-l logfile Print log information to this file.

-d n Run as a daemon, polling once every n seconds

-I Run from init (do not run in background)

-g Set group name for start, stop, restart and status

-v Verbose mode, work noisy (diagnostic output)

-V Print version number and patchlevel

-h Print a help text

In addition to the options above, monit can be started with one of the following action arguments; monit will then execute the action and exit without transforming itself into a daemon.

start Start all programs listed in the control file. If the group option is set, only start the programs in the named group.

start name Start the named program. The name must exist in the monitrc file, after a check keyword. See also the MONIT HTTPD section below.

stop Stop all programs listed in the control file. If the group option is set, only stop the programs in the named group.

stop name Stop the named program. The name must exist in the monitrc file, after a check keyword. See also the MONIT HTTPD section below.

restart Stop and start all programs. If the group option is set, only restart the programs in the named group.

restart name Restart the named program. The name must exist in the monitrc file, after a check keyword. See also the MONIT HTTPD section below.

status Print status information for each program. If the group option is set, only print the status for the named group.

quit Kill monit daemon process

validate Check all programs and start the ones not running. Also if a program indicates (in the control file) that it's listening on a port number, although monit cannot connect to the port, then restart the program. This action is also the default behavior when monit runs in daemon mode.


LOGGING

monit will log status and error messages to a log file. If syslog is given as a value for the -l option (or the keyword set logfile syslog is found in the control file), monit will use the syslog system daemon for logging messages. To turn off logging, simply do not set the logfile in the control file (and, of course, do not use the -l switch).


DAEMON MODE

The -d interval option runs monit in daemon mode. You must specify a numeric argument which is a polling interval in seconds.

In daemon mode, monit puts itself in the background and runs continuously, monitoring each specified program and then sleeping for the given polling interval.

Simply invoking

  monit -d 300

will poll all programs described in your ~/.monitrc file every 5 minutes.

It is possible to set a polling interval in your ~/.monitrc file by saying 'set daemon n', where n is an integer number of seconds. If you do this, monit will always start in daemon mode (as long as no action arguments are given).

Only one daemon process is permitted per user; in daemon mode, monit makes a per-user lockfile to guarantee this.

Calling monit with a daemon in the background sends a wakeup signal to the daemon, forcing it to check programs immediately.

The quit argument will kill a running daemon process instead of waking it up.

If you touch or change the .monitrc file while monit is running in daemon mode, this will be detected at the beginning of the next poll cycle. When a changed .monitrc is detected, monit rereads it and reinitializes itself. Note also that if you break the .monitrc file's syntax, the monit daemon will exit after logging the appropriate error message.


INIT SUPPORT

Monit can be run and controlled from init. In the (unlikely) case of a monit crash, init will respawn a new monit process.

You can use either the 'set init' statement in monit's configuration file or use the -I option from the command line. Here's a sample /etc/inittab entry for monit:

  # Run monit in standard runlevels
  mo:2345:respawn:/usr/local/sbin/monit -Ic /etc/monitrc

After you have modified init's configuration file, you can run the following command to re-examine /etc/inittab and start monit:

  telinit q 

For systems without telinit:

  kill -1 1

Make sure that if you run monit from init, you do not also start monit from your startup scripts.


GROUP SUPPORT

Program entries in the control file, monitrc, can be grouped together by the group statement. The syntax is simply (keyword in capitals):

  GROUP groupname

With this statement it is possible to group similar program entries together and manage them as a whole. Monit provides functions to start, stop and restart a group of programs, like so:

To start a group of programs:

  monit -g <groupname> start

To stop a group of programs:

  monit -g <groupname> stop

To restart a group of programs:

  monit -g <groupname> restart

Show the status of a program group:

  monit -g <groupname> status


ALERT MESSAGES

monit will send an email alert if a program timed out, if monit restarted or stopped a program, if a resource statement matched (see also the section RESOURCE TESTING below) or if a checksum error occurred (see also the section MD5 CHECKSUM below). More than one alert statement can be used in a process entry, so you can send different emails to different addresses. The full syntax for the alert statement is as follows (keywords are in capitals):

 ALERT mail-address [{events}] [MAIL-FORMAT {mail-format}]

Simply using:

 alert foo@bar

will send a default email alert to the address foo@bar whenever a timeout, restart, checksum, stop or resource error occurs.

If you only want an alert message sent when a certain event occurs, for example a timeout or a program restart, postfix the alert statement with the event in braces:

 alert foo@bar only on { timeout }

or

 alert foo@bar { timeout }

(only and on are noise keywords, ignored by monit)

or

 alert foo@bar { restart }

The same applies for a checksum error

 alert foo@bar { checksum }

It is also possible to combine events and send mail to different email addresses like:

 alert foo@bar { restart, timeout, resource } 
 alert security@bar on { checksum, stop }
 alert manager@bar

This will send an alert message to foo@bar when a restart, timeout or resource event occurs and a message to security@bar if a checksum error or a stop occurs. Finally, a message is sent to manager@bar whenever any error event occurs.
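
The routing of events to addresses in the three alert statements above can be pictured with a short Python sketch. It is only an illustration of the documented behaviour, not monit's implementation; the names ALL_EVENTS, alerts and recipients are hypothetical.

```python
# The five event names used in the alert statements of this section.
ALL_EVENTS = {"timeout", "restart", "checksum", "resource", "stop"}

# One (address, events) pair per alert statement from the example above.
alerts = [
    ("foo@bar", {"restart", "timeout", "resource"}),
    ("security@bar", {"checksum", "stop"}),
    ("manager@bar", ALL_EVENTS),  # no event list given: every event
]


def recipients(event, alert_statements):
    """Return the addresses whose alert statement lists the given event."""
    return [address for address, events in alert_statements if event in events]


print(recipients("checksum", alerts))
print(recipients("timeout", alerts))
```

An alert statement without an event list is modelled here as a subscription to all five events.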


The following alert-statement:

 alert foo@bar { timeout, restart, checksum, resource, stop }

is equivalent to:

 alert foo@bar

which, as stated above, will send a message when a timeout, restart, checksum, stop or resource error occurs. (If the postfix variant is used, note that the curly braces are mandatory.)

A restart alert is also sent if monit fails to execute the start or the stop program for an entry. It is therefore strongly advised that at least one alert statement registers interest in restart alerts.

monit will provide a default mail message layout that is short and to the point. Here's an example of a standard alert mail sent by monit:

 From: monit@tildeslash.com
 Subject: monit alert -- apache restarted
 To: hauk@tildeslash.com
 Date: Tue, 28 May 2002 20:42:30 +0200
 Program apache restarted
        Date: Tue May 28 20:42:30 2002
        Host: www.tildeslash.com
 Your faithful employee,
 monit

If you want to, you can change the format of this message with the optional mail-format statement. The syntax for this statement is as follows:

 mail-format {
      from: monit@localhost
   subject: apache $EVENT at $DATE
   message: Monit restarted $PROGRAM at $DATE on $HOST. 
     Your joke for today is:
     Things You Do Not Want Your System Administrator to Say:
        * Ooops.
        * Wow!! Look at this ...
        * Hey!! The Suns don't do this.
        * Terminated??!
        * What software license?
        * Well, it's doing something ...
        * Wow! ... That seemed fast ...
        * Where's the DIR command?
        * Why is my "rm" taking so long?
        * System coming down in 0 min ...
 }

Where the keyword from: is the email address monit should pretend it is sending from. It does not have to be a real mail address, but it must be a properly formatted mail address of the form name@domain. The keyword subject: is for the email subject line. The subject must fit on a single line. The message: keyword denotes the mail body. If used, this keyword should always be the last in a mail-format statement. The mail body can be as long as you want, but it must not contain the '}' character.

All of these format keywords are optional but you must provide at least one. Thus if you only want to change the from address monit is using you can do:

 alert foo@bar with mail-format { from: bofh@xyzzy.no }

From the previous example you will notice that four special variables were used. If used, they will be replaced in the text with the following values:

$EVENT A string describing the event that occurred. The values are fixed and are ``restarted'', ``timed out'', ``stopped'' and ``checksum error''.

$PROGRAM The program entry name in monitrc

$DATE The current time and date (C time style).

$HOST The name of the host monit is running on
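
The substitution of these four variables can be imitated in a few lines of Python. This is a sketch for illustration only and says nothing about how monit performs the substitution internally; Python's string.Template happens to use the same $NAME placeholder style, and the helper name render_mail and the sample values are assumptions.

```python
from string import Template


def render_mail(text, event, program, date, host):
    """Substitute the four mail-format variables into text.

    Hypothetical helper that only imitates the documented behaviour.
    """
    values = {"EVENT": event, "PROGRAM": program, "DATE": date, "HOST": host}
    # safe_substitute leaves any unknown $NAME untouched instead of raising.
    return Template(text).safe_substitute(values)


subject = render_mail(
    "apache $EVENT at $DATE",
    event="restarted",
    program="apache",
    date="Tue May 28 20:42:30 2002",
    host="www.tildeslash.com",
)
print(subject)
```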

Setting a global mail format

Finally, it is possible to set a standard mail format with the following global set-statement (keywords are in capital):


 SET MAIL-FORMAT {mail-format}

A format set with this statement will apply to every alert statement that does not have its own mail-format. This statement is most useful for setting a default from address for messages sent by monit, like so:

 set mail-format { from: monit@foo.bar.no }


PROGRAM TIMEOUT

monit provides a program timeout mechanism for situations where a program simply refuses to start or respond over a longer period. In cases like this, and particularly if monit's poll cycle is short, monit would otherwise simply increase the machine load by repeatedly trying to restart the program.

The timeout mechanism monit provides is based on two variables: the number of times the program has been restarted and the number of poll cycles. For example, if a program had x restarts within y poll cycles (where x <= y), then monit will time out and not (re)start the program on the next cycle. It's a good idea to use the alert statement in conjunction with timeout, so that monit sends an alert notification if a timeout occurs. A legal (but verbose) way to write a timeout statement for a program entry in the control file is:

 timeout if 3 restarts within 3 cycles

The shorthand version is:

 timeout(3,3)

Where the first number is the number of program restarts and the second is the number of poll cycles. If the number of cycles is reached without a timeout, the program start counter is reset to zero. This provides enough granularity to catch exceptional cases and time out a program, while letting occasional program restarts happen without accumulating towards a timeout.
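
The counting rule described above can be sketched in Python. This is one possible reading of the description, not monit's actual code; the class name TimeoutTracker and its attributes are hypothetical.

```python
class TimeoutTracker:
    """Count restarts per window of poll cycles, as in timeout(restarts, cycles)."""

    def __init__(self, max_restarts, cycles):
        self.max_restarts = max_restarts
        self.cycles = cycles
        self.restart_count = 0
        self.cycle_count = 0
        self.timed_out = False

    def poll(self, restarted):
        """Record one poll cycle; return True once the program has timed out."""
        if self.timed_out:
            return True
        self.cycle_count += 1
        if restarted:
            self.restart_count += 1
        if self.restart_count >= self.max_restarts:
            self.timed_out = True
        elif self.cycle_count >= self.cycles:
            # The window ended without a timeout: start counting afresh.
            self.restart_count = 0
            self.cycle_count = 0
        return self.timed_out
```

With TimeoutTracker(3, 5), two scattered restarts inside five cycles are forgotten when the window ends, while three restarts inside one window set the timeout.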

If you use timeout (it's optional), then be sure to add an alert statement to notify the responsible administrator. Such as:


 timeout(3, 5) and alert bofh@foo.bar on { timeout }

To have monit check the program again after a timeout, run 'monit start program' from the command line. This will remove the timeout lock in the daemon and make the daemon start and check the program again.


RESOURCE TESTING

Monit can examine how many system resources a service or the system is using.

Depending on these indicators, services can be stopped or restarted and alerts can be generated. Thus it is possible to make use of systems that are idle and to spare systems under high load.

The full syntax for the resource statements used for resource testing is as follows (keywords are in capitals and optional statements in [brackets]):

 resource operator value [cycles] action

resource is a choice of ``CPUUSAGE'', ``MEMUSAGE'', ``MEMKBYTE'', ``LOADAVG([1min|5min|15min])'':

CPUUSAGE is the CPU usage of the process and its children in percent. This resource value is a floating point number, for instance 60.0.

If the system has more than one CPU and the process has child processes, the CPU usage can rise above 100%.

MEMUSAGE is the memory usage of the process in parts of hundred (percent). This resource value is also a floating point number.

MEMKBYTE is the memory amount of the process in KiB (1024 byte). This resource value is an integer number.

LOADAVG([1min|5min|15min]) refers to the system's load average. The load average is the number of processes in the system run queue averaged over the specified time period. This resource value is again a floating point number.

operator is a choice of ``<'',``>'',``!='',``=='' in c notation, ``gt'', ``lt'', ``eq'', ``ne'' in shell sh notation and ``greater'', ``less'', ``equal'', ``notequal'' in human readable form.

cycles is the maximum number of cycles the expression above has to be true in order to start an action. If cycles is omitted then it is set to one.

action is a choice of ``ALERT'', ``RESTART'', ``STOP''.

ALERT sends the user a resource alert in case the maximum number of cycles has been reached.

RESTART restarts the service in case the maximum number of cycles has been reached.

STOP stops the service in case the maximum number of cycles has been reached. If monit stops a service, it will no longer be checked by monit, nor restarted again later. You must explicitly start it again from the web interface or from the console, for example with 'monit start apache', if you want the monit daemon to monitor the service again.

To calculate the cycles, a counter is raised whenever the expression above is true and lowered whenever it is false (but not below 0). All counters are reset in case of a restart.

In order to check that the CPU usage of a service is not going beyond 50% for five cycles before restarting it, the following expression could be used:

 if cpuusage is greater than 50.0 for 5 cycles then restart

Or the short version without noise keywords:

 cpuusage > 50.0 5 restart
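
The cycle counter described in this section (raised when the expression is true, lowered but never below zero when it is false) can be sketched in Python as follows. This is an illustrative reading of the text, not monit's implementation; the class name CycleCounter and the sample CPU values are assumptions.

```python
class CycleCounter:
    """Counter behind one resource statement; the action fires at `cycles`."""

    def __init__(self, cycles=1):
        self.cycles = cycles  # defaults to one, as when cycles is omitted
        self.count = 0

    def update(self, matched):
        """Feed one poll result; return True when the action should start."""
        if matched:
            self.count += 1
        elif self.count > 0:
            self.count -= 1  # lowered when false, but never below 0
        return self.count >= self.cycles

    def reset(self):
        """All counters are reset in case of a restart."""
        self.count = 0


# Hypothetical CPU readings for "cpuusage > 50.0 5 restart".
counter = CycleCounter(cycles=5)
samples = [60.0, 70.0, 40.0, 55.0, 65.0, 80.0, 90.0]
fired = [counter.update(cpu > 50.0) for cpu in samples]
print(fired)
```

Because a false result lowers the counter rather than clearing it, the single 40.0 reading only delays the restart instead of cancelling it.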

See also the example section below.


CONNECTION TESTING

Monit is able to perform connection testing via networked ports and via unix sockets.

If a program listens on one or more sockets, monit can connect to the port (using either tcp or udp) and verify that the program will accept a connection and that it is possible to read from and write to the socket. If a connection is not accepted or if there is a problem with the socket I/O, monit will assume that something is wrong and restart the program. The full syntax for the port statement used for connection testing is as follows (keywords are in capitals and optional statements in [brackets]) for networked ports,

 [HOST hostname] PORT number [TYPE {TCP|UDP}] [PROTO(COL) {name} 
  [REQUEST {"/path"}]]

or for unix sockets,

 UNIX(SOCKET) path [TYPE {TCP|UDP}] [PROTO(COL) {name} 
  [REQUEST {"/path"}]]

To have monit check a port connection use the following statement:

  port 80

In this case the machine in question is assumed to be localhost and monit will open a tcp connection to localhost at port 80. Monit uses tcp by default; if you want to connect with udp, you can specify this after the port statement:

 port 53 use type udp ('use' is a noise keyword)

In case a server is listening to a unix socket called /var/run/mysocket, the following statement can be used:

 unix /var/run/mysocket

If your machine answers for several virtual hosts you can prefix the port statement with a host-statement like so:

 host www.sol.no     port 80
 host shop.sol.no    port 443
 host kvasir.sol.no  port 80
 host 10.2.3.4       port 80

And as mentioned above, if you do not specify a host-statement, localhost is assumed.
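
To make the idea of a connection test concrete, here is a minimal Python sketch of the simplest case, a TCP connect with localhost as the default host. It is not how monit is implemented and it covers only part of what the text describes: monit also verifies socket I/O and supports udp and unix sockets. The function name and the timeout parameter are assumptions.

```python
import socket


def port_accepts_connection(host="localhost", port=80, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or unresolvable host.
        return False
```

Calling port_accepts_connection(port=80) corresponds roughly to the statement 'port 80'; passing a host name corresponds to prefixing it with a host statement.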

Finally, monit also knows how to speak some of the more popular Internet protocols. So, besides testing for connections, monit can also speak with the server in question to verify that the server works. For example, the following is used to test a http server:


 host www.tildeslash.com port 80 protocol http

At the moment monit knows how to speak HTTP, SMTP, FTP, POP, IMAP and NNTP.

Some protocols also support a request statement. This statement can be used to ask the server for a special document entity.

Currently only the HTTP protocol module supports the request statement, such as:

 host www.myhost.com port 80 protocol http 
   request "/data/show.php?a=b&c=d"

The request should contain a URI string specifying a document from the http server. The string will be URL encoded by monit before it sends the request to the http server, so it's okay to use URL-unsafe characters in the request.

If the request statement isn't specified, the default web server page will be requested.

It is of course possible to mix networked ports and unix sockets checks for a service.

See also the example section below.


MONIT HTTPD

If specified in the control file, monit will start a monit daemon with http support. From a browser you can then start and stop programs as well as view the status of each program. Also, if monit logs to its own file, you can view the content of this logfile from a browser.

The control file statement for starting a monit daemon with http support is a global set-statement:

  set httpd port 2812

And you can use this URL, http://localhost:2812/, to access the daemon from a browser.

The port number, in this case 2812, can be any number that you are allowed to bind to.

If you only want the http server to accept connection requests to one host address, you can specify the bind address either as an IP number string or as a hostname. In this example we bind the http server to the loopback device. This means that the http server will only be reachable from localhost:

  set httpd port 2812 and use the address 127.0.0.1

or

  set httpd port 2812 and use the address localhost

If you do not use the ADDRESS statement the http server will accept connections on any/all local addresses.

If you remove the httpd statement from the config file, monit will stop the httpd server on its next cycle. Likewise if you change the port number, monit will restart the http server using the new specified port number.

The status page displayed by monit is automatically refreshed with the same poll time set for the monit daemon.

Note:

You must start a monit daemon with http support if you want to be able to use the following console commands.

 'monit stop'
 'monit start program' 
 'monit stop program' 
 'monit restart program' 
 'monit -g groupname start' 
 'monit -g groupname stop' 
 'monit -g groupname restart'

If a monit daemon is running in the background, monit will ask the daemon (via the HTTP protocol) to execute the above commands. That is, the daemon is requested to start and stop the programs. This ensures that a daemon will not restart a program that you requested to stop and that (any) timeout lock will be removed from a program when you start it.

Monit HTTPD Authentication

monit supports two types of authentication schemas for connecting to the httpd server. The two schemas can be used together or separately. You must choose at least one.

Host allow list

The http server maintains an access-control list of hosts allowed to connect to the server. You can add as many hosts as you want to, but each host must be given as a valid domain name or IP address. If you specify a host that does not resolve, monit will write an error message to the console and not start.

The http server will query a nameserver to check any hosts connecting to the server. If a host (client) is trying to connect to the server, but cannot be found in the access list or cannot be resolved, the server will shutdown the connection to the client promptly.

Control file example:

  set httpd port 2812
      allow localhost
      allow my.other.work.machine.com
      allow 10.1.1.1

Basic Authentication

This authentication schema is HTTP specific and described in more detail in RFC 2617.

In short: a server challenges a client (e.g. a browser) to send authentication information (username and password) and, if it is accepted, the server will allow the client access to the requested document.

The biggest weakness of Basic Authentication is that the username and password are sent effectively in clear text (they are only base64 encoded, which is trivially reversible) over the network. It is therefore recommended that you do not use this authentication method unless you are behind a firewall.
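
The following Python sketch shows why base64 encoding offers no secrecy: anyone who sees the header can recover the credentials. It illustrates the Basic scheme from RFC 2617 in general, not monit's code; the function names and the credentials alice and secret are made up for the example.

```python
import base64


def basic_auth_header(username, password):
    """Build the header a client sends for Basic Authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic_token(token):
    """Recover (username, password) from the base64 token; no key is needed."""
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("alice", "secret")
token = header.split()[-1]
print(decode_basic_token(token))
```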

monit will use Basic Authentication if an allow statement contains a username and password separated by a single ':' character, like so: allow username:password. The username and password must be written in clear text. Only one username and password pair is supported.

If you use this method together with a host list, then only clients from the listed hosts will be allowed to connect to the monit http server and each client will be asked to provide a username and password.

Example:

  set httpd port 2812
      allow localhost
      allow my.other.work.machine.com
      allow 10.1.1.1
      allow hauk:monit

If you only want to use Basic Authentication, then just provide the one line with username and password, like:

  set httpd port 2812
      allow hauk:monit

If you use Basic Authentication it is a good idea to set the access permission for the control file (~/.monitrc) to only readable and writeable for the user running monit, because the password is written in clear-text. (Use this command, /bin/chmod 600 ~/.monitrc). This is actually a good idea anyway.


MD5 CHECKSUM

If specified in the control file, monit will compute an MD5 checksum for programs. The checksum is used to verify that a program does not change. If a program has changed, monit will send an (optional) alert notification, log an alert message and stop checking the process. The web interface will also show a checksum warning.

The rationale for this feature is security: monit should not start a possibly compromised program or script.

The full syntax for the checksum-statement is as follows: (keywords are in capital)

 CHECKSUM [file [EXPECT checksum] ]+

A legal (but verbose) way to write a checksum statement for a process entry in the control file is:

 checksum the /usr/bin/httpd program

The shorthand version is just:

 checksum /usr/bin/httpd

Several files can be used in a checksum statement:

 checksum /usr/apache/bin/httpd /usr/apache-ssl/bin/httpsd

or on a line by itself:

 checksum /usr/apache/bin/httpd
 checksum /usr/apache-ssl/bin/httpsd

You can add as many 'checksum file' statements as you want. As described above, if the checksum for a file changes, monit will log a warning, issue an alert message and stop checking the associated process.

The expect statement is optional and is used to specify an MD5 string monit should expect when testing a file's checksum. If this statement is used, monit will not compute an initial checksum for the file, as in the examples above, but will instead use the string you submit. For example:

 checksum /usr/bin/httpd expect 8f7f419955cefa0b33a2ba316cba3659

or verbose style;

 checksum /usr/bin/httpd and 
   expect the sum 4e5309d1956f003bcdff168748bea647

You can, for example, use the GNU utility md5sum to create a checksum string for a file and then use this string in the expect-statement.
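
If md5sum is not at hand, the same hexadecimal string can be produced with Python's standard hashlib module. The sketch below is an illustration only and is unrelated to monit's own checksum code; the function names md5_of_file and checksum_matches are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 checksum of a file as a lowercase hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large binaries need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected):
    """Compare a file's checksum with the string given after 'expect'."""
    return md5_of_file(path) == expected.strip().lower()
```

The string returned by md5_of_file("/usr/bin/httpd") is the kind of value that would follow the expect keyword.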


THE RUN CONTROL FILE

The preferred way to set up monit is to write a .monitrc file in your home directory. When there is a conflict between the command-line arguments and the arguments in this file, the command-line arguments take precedence. To protect the security of your control file and passwords the control file must have permissions no more than 0700 (u=xrw,g=,o=); monit will complain and exit otherwise.
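
The permission rule can be expressed as a small check: a mode is "no more than 0700" when no group or other bits are set. The Python sketch below only illustrates that rule under this interpretation; it is not monit's own check, and the function name is an assumption.

```python
import os
import stat


def control_file_is_private(path):
    """Return True if the file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or other bit means the mode exceeds 0700.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A control file with mode 0600 or 0700 passes this check, while 0644 does not.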

Run Control Syntax

Comments begin with a '#' and extend through the end of the line. Otherwise the file consists of a series of program entries or global option statements in a free-format, token-oriented syntax.

There are three kinds of tokens: grammar keywords, numbers (i.e. decimal digit sequences) and strings. Strings can be either quoted or unquoted. A quoted string is bounded by double quotes and may contain whitespace (and quoted digits are treated as a string). An unquoted string is any whitespace-delimited token, containing characters and/or numbers.

Each program entry consists of the keyword `check', followed by a unique descriptive name for the program, which is in turn followed by a path to the program's pidfile. A check entry can have a number of optional statements. These statements are described below and in the example section.

You can use noise keywords like 'if', `and', `with(in)', `has', `using', 'use', 'on(ly)' and `program' anywhere in an entry to make it resemble English. They're ignored, but can make entries much easier to read at a glance. The punctuation characters ';' ',' and '=' are also ignored. Keywords are case insensitive.

 Here are the legal global keywords:
 Keyword         Function
 -----------------------------------------------------------
 set daemon      Set a background poll interval in seconds
 set init        Set monit to run from init
 set logfile     Name of a file to dump error- and status-
                 messages to. If syslog is specified as the 
                 file, monit will utilize the syslog daemon
                 to log messages.
 set mailserver  The mailserver used for sending alert
                 notifications. If the mailserver is not 
                 defined, monit will try to use 'localhost' 
                 as the smtp-server for sending mail.
 set mail-format Set a global mail format for all alert
                 messages emitted by monit.
 set httpd port  Activates monit http server at the given 
                 portnumber.
 address         If specified, the http server will only 
                 accept connect requests to this address.
                 This statement is an optional part of the
                 set httpd statement.
 allow           Specifies a host or IP address allowed to
                 connect to the http server. Can also specify
                 a username and password allowed to connect
                 to the server. More than one allow statement
                 is allowed. This statement is also an
                 optional part of the set httpd statement.

 Here are the legal program entry keywords:
 Keyword         Function
 ------------------------------------------------------------
 check           Starts an entry and must be followed by a 
                 descriptive name for the program.
 pidfile         Specify the program's pidfile. Every
                 program must create a pidfile with its 
                 current process id
 group           Specify a groupname for a program entry.
 start           The program for starting the specified 
                 process. Full path is required. This 
                 statement is optional.
 stop            The program for stopping the specified 
                 process -- full path is required. This 
                 statement is optional.
 host            The hostname or IP address to test the port
                 at. This keyword can only be used together
                 with a port statement.
 port            Specify a TCP/IP service port number which 
                 the program is listening on. This statement
                 is also optional. If this statement is not
                 prefixed with a host-statement, localhost is
                 used as the hostname to test the port at.
 type            Specifies the type of socket monit should 
                 use when testing a connection to the port.
                 If the type keyword is omitted, tcp is 
                 used. This keyword must be followed by 
                 either tcp or udp.
 tcp             Specifies that monit should use a TCP 
                 socket type (stream) when testing the port.
 udp             Specifies that monit should use a UDP socket
                 type (datagram) when testing the port.
 proto(col)      This keyword specifies the type of service
                 found at the port. At the moment monit knows
                 how to speak HTTP, SMTP, FTP, POP, IMAP and
                 NNTP. You're welcome to write new protocol
                 test modules. If no protocol is specified,
                 monit will use a default test, which in most
                 cases is good enough.
 request         Specifies a server request and must come
                 after the protocol keyword mentioned above.
                  - for http it can contain an URI and an
                    optional query string.
                  - other protocols do not support this
                    statement yet
 unix(socket)    Specifies a unix socket file and used like 
                 the port statement above to test a Unix 
                 domain network socket connection.
 timeout         Define program timeout.  Must be followed by
                 two digits. The first digit is max number of
                 restarts for  the program.  The second digit
                 is the cycle interval to test restarts. 
                 This statement is optional
 alert           Specifies an email address for notification
                 if checksum, timeout, stop or restart occurs.
                 Alert can also be postfixed, to only send a
                 message for certain events. See the examples
                 above. More than one alert statement is allowed
                 in an entry. This statement is also optional.
 mail-format     Specifies a mail format for an alert message 
                 This statement is an optional part of the
                 alert statement.
 checksum        Specify that monit should verify a checksum
                 for associated files.
                 More than one checksum statement is allowed.
 expect          Specifies a checksum string (md5) monit 
                 should use when testing the checksum. This
                 statement is an optional part of the 
                 checksum statement.
 every           Validate this entry only at every nth poll
                 cycle. Useful in daemon mode when the
                 poll-cycle is short and the program takes
                 some time to start. 
 autostart       Must be followed by the keywords yes or no. 
                 If yes, monit will restart the program if 
                 it is not running (the default behaviour).
 cpuusage        Must be followed by a compare operator, a 
                 floating point number, optionally a maximum
                 number of cycles and an action. This statement
                 is used to check the cpu usage in percent of a
                 process with its children over a number of
                 cycles.  If the compare expression matches then
                 the action restart, alert or stop is activated.
 memusage        The equivalent to cpuusage for memory of a 
                 process (w/o children!). The syntax is the same
                 as above.
 memkbyte        The equivalent to memusage but with amounts 
                 in KiB instead of percentages.
 loadavg         Must be followed by [1min,5min,15min] in (), a 
                 compare operator, a floating point number,
                 optionally a maximum number of cycles and an
                 action.  This statement is used to check the
                 system load average over a number of cycles. If
                 the compare expression matches then the action
                 restart, alert or stop is activated.

Here's the complete list of reserved keywords used by monit:

set, daemon, logfile, syslog, address, httpd, allow, check, init, pidfile, group, start, stop, port(number), unix(socket), type, proto(col), tcp, udp, alert, mail-format, restart, timeout, checksum, resource, expect, mailserver, every, autostart, yes, no, host, default, http, ftp, smtp, pop, nntp, imap, request, cpuusage, memusage, memkbyte and loadavg.

And here is a complete list of noise keywords ignored by monit:

if, is, are, on(ly), with(in), and, has, using, use, the, sum, restarts, program(s), cycle(s), than, then, for.

Note: If the start or stop programs are shell scripts, then the script must begin with #! and the remainder of the first line must specify an interpreter for the program. E.g. #!/bin/sh
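
As a minimal sketch of such a script, a start/stop wrapper might look like the following. The pidfile path, the function names and the "sleep 600" placeholder command are assumptions made for illustration; they are not provided by monit.

```shell
#!/bin/sh
# Hypothetical wrapper script; PIDFILE and the "sleep 600" placeholder
# command are illustrative assumptions, not something monit provides.
PIDFILE="${PIDFILE:-/tmp/example-daemon.pid}"

start_daemon() {
    # Run the given command in the background and record its pid,
    # so that a "check ... with pidfile" entry can find the process.
    "$@" &
    echo $! > "$PIDFILE"
}

stop_daemon() {
    # Send SIGTERM to the recorded pid (if any) and remove the pidfile.
    [ -f "$PIDFILE" ] || return 0
    kill -TERM "$(cat "$PIDFILE")" 2>/dev/null || true
    rm -f "$PIDFILE"
}

# Dispatch on the first argument; do nothing when called without one.
case "${1:-}" in
    start) start_daemon sleep 600 ;;
    stop)  stop_daemon ;;
esac
```

Because the first line names the interpreter, the script can be used directly as the start or stop program of an entry.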

It's possible to write scripts directly into the start and stop entries by using a string of shell commands, for example:

 start: "/bin/sh -c { echo $$ > pidfile; exec program }"
 stop:  "/bin/sh -c { kill -s SIGTERM `cat pidfile` }"

CONFIGURATION EXAMPLES

The simplest form is just the check statement. In this example we check to see if the server is running and log a message if not:

 check resin with pidfile /usr/local/resin/srun.pid

To have monit start the server if it's not running, add a start statement:

 check resin with pidfile /usr/local/resin/srun.pid
   start program = "/usr/local/resin/bin/srun.sh start"

Here's a more advanced example for monitoring an apache web server listening on the default port numbers for HTTP and HTTPS. In this example monit will restart apache if it's not accepting connections on those ports. To restart a process, monit first executes the stop program, waits 10 seconds (to give the program time to terminate) and then executes the start program.

 check apache with pidfile /var/run/httpd.pid
   start program = "/etc/init.d/httpd start"
   stop program  = "/etc/init.d/httpd stop"
   port 80   
   port 443
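
The restart method described above can be sketched in shell as follows. The function name and the optional wait argument are assumptions made for illustration; monit performs these steps internally and does not expose such a helper.

```shell
# Sketch of the restart sequence: stop, wait, then start.
# restart_sequence is a hypothetical helper, not a monit command.
restart_sequence() {
    stop_cmd=$1
    start_cmd=$2
    wait_secs=${3:-10}      # 10 seconds, as described in the text

    sh -c "$stop_cmd"       # 1. execute the stop program
    sleep "$wait_secs"      # 2. give the program time to terminate
    sh -c "$start_cmd"      # 3. execute the start program
}
```

For the apache entry above, the equivalent call would pass "/etc/init.d/httpd stop" as the first argument and "/etc/init.d/httpd start" as the second.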

In this example we use udp for connection testing to check if the 'named' service is running and also use timeout and alert:

 check named with pidfile /var/run/named.pid
   start program = "/etc/init.d/named start"
   stop program  = "/etc/init.d/named stop"
   port 53 use type udp
   timeout (3,5) 
   alert bofh@norid.no

The following example illustrates how to check if the service 'sophie' is answering connections on its unix domain socket:

 check sophie with pidfile /var/run/sophie.pid
   start program = "/etc/init.d/sophie start"
   stop  program = "/etc/init.d/sophie stop"
   unix /var/run/sophie

In this example we check an apache web-server running on localhost that answers for several IP-based virtual hosts or vhosts, hence the host statement before port:

 check apache with pidfile /var/run/httpd.pid
   start program = "/etc/init.d/httpd start"
   stop program  = "/etc/init.d/httpd stop"
   host www.sol.no          port 80
   host shop.sol.no         port 443
   host chat.sol.no         port 80
   host www.tildeslash.com  port 80

In the following example we ask monit to compute and verify the checksum for the underlying apache binary used by the start and stop programs.

 check apache with pidfile /var/run/httpd.pid
   start program = "/etc/init.d/httpd start"
   stop program  = "/etc/init.d/httpd stop"
   host www.tildeslash.com  port 80
   checksum /usr/local/apache/bin/httpd
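
If you want to pin the checksum with an expect clause (as in the final example of this section), the md5 string has to be computed beforehand. The following is a minimal sketch, assuming md5sum(1) from the GNU text utilities is installed; the helper names file_md5 and checksum_statement are hypothetical.

```shell
# Print only the md5 digest of a file (md5sum prints "digest  filename").
file_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Print a checksum statement with an expect clause for the given file,
# ready to be pasted into an entry of the control file.
checksum_statement() {
    printf '   checksum %s\n    and expect the sum %s\n' "$1" "$(file_md5 "$1")"
}
```

Calling checksum_statement with the path of the binary prints the two lines to add to the entry.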

Some servers are slow starters, for example Java-based application servers. If we want to keep the poll cycle short (i.e. < 60 seconds) but allow some programs to take their time to start, the every statement is handy:

 check dynamo with pidfile /etc/dynamo.pid
   start program = "/etc/init.d/dynamo start"
   stop program  = "/etc/init.d/dynamo stop"
   port 8840
   every 2 cycle
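
The effect of the every statement can be pictured with a small shell sketch. It assumes that poll cycles are counted from 1 and that the entry is validated whenever the cycle number is a multiple of n; the exact cycle on which monit starts counting is an assumption here, and is_check_cycle is a hypothetical helper.

```shell
# Hypothetical helper: succeed when the entry would be validated in the
# given poll cycle. Assumes cycles are counted from 1 and that the entry
# is validated whenever the cycle number is a multiple of n.
is_check_cycle() {
    cycle=$1
    n=$2
    [ $((cycle % n)) -eq 0 ]
}

# Show which of the first six cycles would validate an "every 2" entry.
for cycle in 1 2 3 4 5 6; do
    if is_check_cycle "$cycle" 2; then
        echo "cycle $cycle: validate"
    else
        echo "cycle $cycle: skip"
    fi
done
```

Under these assumptions, a daemon started with -d 30 would validate the dynamo entry once every 60 seconds.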

Here is an example where we group together two database entries. The autostart statement is also illustrated in the first entry and has the effect that monit will not try to (re)start this program if it is not running:

 check sybase with pidfile /var/run/sybase.pid
   start program = "/etc/init.d/sybase start"
   stop program  = "/etc/init.d/sybase stop"
   autostart no
   group database
 check oracle with pidfile /var/run/oracle.pid
   start program = "/etc/init.d/oracle start"
   stop program  = "/etc/init.d/oracle stop"
   autostart yes # Not necessary really, since this is the default
   port 9001
   alert bofh@foo.bar
   group database

Here is an example showing the use of the resource checks. monit sends an alert when the CPU usage of the http daemon and its child processes rises above 60% for two cycles. The daemon is restarted when the CPU usage is over 80% for five cycles, and stopped when the memory usage is over 100000 kilobytes (roughly 100 MiB) for five cycles or the 5-minute load average is above 10 for 8 cycles:

 check apache with pidfile /var/run/httpd.pid
   start program = "/etc/init.d/httpd start"
   stop program  = "/etc/init.d/httpd stop"
   if cpuusage > 60.0 for 2 cycles then alert
   if cpuusage > 80.0 for 5 cycles then restart
   if memkbyte > 100000 for 5 cycles then stop
   if loadavg(5min) greater than 10.0 for 8 cycles then stop
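
One way to read the "for N cycles" condition is that the measured value must exceed the limit in N polls in a row. The following sketch implements that reading; the helper name exceeded_for and the consecutive-cycle interpretation are assumptions, not a description of monit's internals.

```shell
# Usage: exceeded_for LIMIT CYCLES SAMPLE...
# Succeeds as soon as CYCLES consecutive samples are greater than LIMIT.
# Hypothetical helper; awk is used because sh has no floating point.
exceeded_for() {
    limit=$1
    cycles=$2
    shift 2
    streak=0
    for sample in "$@"; do
        if awk -v s="$sample" -v l="$limit" 'BEGIN { exit !((s + 0) > (l + 0)) }'; then
            streak=$((streak + 1))      # one more cycle above the limit
        else
            streak=0                    # the run is broken, start over
        fi
        if [ "$streak" -ge "$cycles" ]; then
            return 0
        fi
    done
    return 1
}
```

With the first cpuusage rule above, the samples 55.0, 61.2, 62.8 would trigger the alert on the third poll, while 61.2, 55.0, 62.8 would not.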

In this example we demonstrate usage of the extended alert statement:

 check apache with pidfile /var/run/httpd.pid
   start program = "/etc/init.d/httpd start"
   stop program  = "/etc/init.d/httpd stop"
   host www.tildeslash.com  port 80
   checksum /usr/local/apache/bin/httpd
   alert security@bar on {checksum}
   alert admin@bar on {restart, timeout} 
     with mail-format { 
              from:     bofh@$HOST
              subject:  apache $EVENT
              message:  This event occurred on $HOST at $DATE.
              Your faithful employee,
              monit
     }
   timeout (3, 5) 
   group server

Finally, an example with all statements:

 check apache with pidfile /var/run/httpd.pid
   group server
   start program = "/etc/init.d/httpd start"
   stop program  = "/etc/init.d/httpd stop"
   checksum /usr/local/apache/bin/httpd
    and expect the sum 8f7f419955cefa0b33a2ba316cba3659
   host www.sol.no     port 80  type tcp protocol http
   host kvasir.sol.no  port 80  type tcp protocol http 
      and use the request "/login.cgi"
   host shop.sol.no    port 443 type tcp # default protocol test 
   timeout (2,3) 
   if cpuusage is greater than 60.0 for 2 cycles then alert
   if cpuusage > 80.0 for 5 cycles then restart
   if memkbyte > 100000 then stop
   alert foo@bar on { checksum } 
   alert bofh@bar on { restart, timeout } with
     mail-format { from: monit@foo.bar.no }
   every 2 cycles
   autostart yes

Note: only the check and pidfile statements are mandatory; the other statements are optional, and the order of the optional statements is not important.


FILES

~/.monitrc Default run control file

./monitrc If the control file is not found in the default location, and the current working directory contains a monitrc file, this file is used instead.

~/.monitrc.pid Lock file to help prevent concurrent runs (non-root mode).

/var/run/monit.pid Lock file to help prevent concurrent runs (root mode, Linux systems).

/etc/monit.pid Lock file to help prevent concurrent runs (root mode, systems without /var/run).
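
The control file lookup described above can be summarized with a short shell sketch. It assumes that a file given explicitly (as with the -c option) takes precedence over both default locations; find_control_file is a hypothetical helper and not part of monit.

```shell
# Hypothetical helper mirroring the lookup order described above:
# an explicit file (as given with -c) wins, then ~/.monitrc, then ./monitrc.
find_control_file() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    elif [ -f "$HOME/.monitrc" ]; then
        echo "$HOME/.monitrc"
    elif [ -f ./monitrc ]; then
        echo "./monitrc"
    else
        return 1        # no control file found
    fi
}
```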


SIGNALS

If a monit daemon is running, SIGUSR1 wakes it up from its sleep phase and forces a poll of all processes. SIGTERM will gracefully terminate a monit daemon. This signal is sent to a monit daemon if monit is started with the quit action argument.
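
These signals can also be sent by hand using the pid stored in the lock file. The following is a minimal sketch; signal_monit is a hypothetical helper, and the lock file path you pass depends on how monit is run (see FILES above).

```shell
# Hypothetical helper: send a signal to the monit daemon whose pid is
# stored in the given lock file (for example /var/run/monit.pid).
signal_monit() {
    pidfile=$1
    signal=$2
    [ -r "$pidfile" ] || return 1      # no lock file, no daemon to signal
    kill -s "$signal" "$(cat "$pidfile")"
}
```

Calling signal_monit with /var/run/monit.pid and USR1 would force a poll of all processes; passing TERM instead would ask the daemon to terminate gracefully.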

Running monit in the foreground while a background monit daemon is running will wake up the daemon.


NOTES

This is a very silent program. Use the -v switch if you want to see what monit is doing, and tail -f the logfile.

The syntax (and parser) of the control file is inspired by Eric S. Raymond et al.'s excellent fetchmail program. Some portions of this man page are also inspired by the same authors.


AUTHORS

Jan-Henrik Haukeland <hauk@tildeslash.com>, Martin Pala <martin.pala@hq.iol.cz>, Christian Hopp <chopp@iei.tu-clausthal.de>, Rory Toma <rory@digeo.com>, Thomas Oppel <oppel@kbis.de>

See also http://www.tildeslash.com/monit/who.html


COPYRIGHT

Copyright (C) 2000-2002 by Contributors to the monit codebase. All Rights Reserved. This product is distributed in the hope that it will be useful, but WITHOUT any warranty; without even the implied warranty of MERCHANTABILITY or FITNESS for a particular purpose.


SEE ALSO

GNU text utilities; md5sum(1)