DSH(1)                     BSD General Commands Manual                    DSH(1)

NAME
     dsh — run a command on a cluster of machines

SYNOPSIS
     dsh [-eiqtv] [-f fanout] [-g rungroup1,...,rungroupN] [-l username]
         [-o porttimeout] [-p portnum] [-w node1,...,nodeN] [-x node1,...,nodeN]
         [command ...]
     dsh [-eiqtv] [-f fanout] [-g rungroup1,...,rungroupN] [-l username]
         [-o porttimeout] [-p portnum] [-w node1,...,nodeN] [-x node1,...,nodeN]
         -s scriptname [arguments ...]

DESCRIPTION
     The dsh utility runs a command, or a group of commands, on a cluster of
     machines.  All commands are run in parallel across the cluster.
     Interrupt signals are sent to the remote host whose output is currently
     being displayed to the user.  The following options are available:

     -e   Unless the -e option is specified, stderr from remote commands will
          not be reported to the user.

     -f   If the -f option is specified, followed by a number, it sets the
          fanout size of the cluster.  The fanout size is the number of nodes a
          command will run on in parallel at one time.  Thus an 80-node cluster
          with a fanout size of 64 would run the command on 64 nodes in
          parallel and then, when all have finished, on the remaining 16 nodes.
          The fanout size defaults to 64.  This option overrides the FANOUT
          environment variable.

     -g   If the -g option is specified, followed by a comma-separated list of
          group names, the command will be run only on the nodes in those
          groups.  A node may belong to more than one group; however, running
          without the -g option will run the command on such a node as many
          times as it appears in the file specified by the CLUSTER environment
          variable.  This option is silently ignored if used with the -w option.

     -i   The -i option lists information about the current cluster and
          command groupings.  It prints the current value of the fanout and
          the number of groups of machines within the cluster.  It also shows
          which command you are about to run, and your username if specified
          with the -l option.

     -l   If the -l option is specified, followed by a username, the commands
          will be run under that user ID on the remote machines.  For this to
          work, authentication must be set up properly for that user.

     -o   The -o option is used to set the timeout in seconds to be used when
          testing remote connections.  The default is five seconds.

     -p   The -p option can be used to set the port number that testing should
          occur on when testing remote connections.  The default behavior is to
          guess based on the remote command name.

     -q   The -q option does not issue any commands; it only displays
          information about the cluster and the fanout groupings.

     -s   The -s option causes dsh to copy a script to the remote machine,
          execute it once, and delete it, all in a single operation.  The -s
          option requires a script name, which will be copied to all remote
          machines and executed.  You may also optionally specify any number of
          additional arguments to the script on the command line.  The script
          will be placed in a temporary directory under /tmp on the remote node,
          executed, and then the directory will be recursively deleted.  Any
          executable can be used as the script, regardless of programming
          language.  The script is copied with the tar command, preserving
          permissions of the original.  The -s option cannot be used with the
          standard mode of dsh to run other commands, nor can it be used in
          interactive mode.

     -t   The -t option causes dsh to attempt a connection test to each node
          prior to attempting to run the remote command.  If the test fails for
          any reason, the remote command will not be attempted.  This can be
          useful when clusterfiles have suffered bitrot and some nodes no longer
          exist, or might be down for maintenance.  The default timeout is 5
          seconds.  The timeout can be changed with the -o option.  dsh will
          attempt to guess the port number of the remote service based on your
          RCMD_CMD setting.  It knows about ssh and rsh.  If dsh fails to guess
          your port correctly, you may use the -p argument to set the remote
          port number.  If the RCMD_TEST environment variable exists, the
          testing will automatically take place.

     -v   Prints the version of ClusterIt to stdout and exits.

     -w   If the -w option is specified, followed by a comma delimited list of
          machine names, the command will be run on each node in the list.
          Without this option, dsh runs on the nodes listed in the file pointed
          to by the CLUSTER environment variable.

     -x   The -x option can be used to exclude specific nodes from the cluster.
          The format is the same as the -w option, a comma delimited list of
          machine names.  This option is silently ignored if used with the -w
          option.
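
     The connection test described under -t, -o and -p can be approximated
     with a plain TCP connect.  The Python sketch below is illustrative only
     and is not taken from dsh itself: the names guess_port and
     node_is_reachable are hypothetical, and the port table assumes the
     conventional service ports (22 for ssh, 514 for rsh).

```python
import os
import socket

# Assumed conventional ports for the two remote commands the -t text mentions.
DEFAULT_PORTS = {"ssh": 22, "rsh": 514}


def guess_port(rcmd_cmd):
    """Guess a test port from the remote command name, or None if unknown."""
    return DEFAULT_PORTS.get(os.path.basename(rcmd_cmd))


def node_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable or unknown hosts, and timeouts.
        return False
```

     A caller would skip any node for which node_is_reachable() returns
     False, which mirrors the documented behavior of not attempting the
     remote command when the test fails.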

ENVIRONMENT
     dsh utilizes the following environment variables.

     CLUSTER            Contains the name of a file holding a newline-separated
                        list of nodes in the cluster.

     RCMD_CMD           Command to use to connect to remote machines.  The
                        command chosen must be able to connect with no password
                        to the remote host.  Defaults to rsh.

     RCMD_CMD_ARGS      Arguments to pass to the remote shell command.  Defaults
                        to none.

     RCMD_PORT          The port number used to test remote connections.  See
                        the -p flag.

     RCMD_TEST          When set, dsh will automatically test all hosts before
                        launching the remote command. See the -t option for more
                        information.

     RCMD_TEST_TIMEOUT  The timeout in seconds to use when testing for remote
                        connections.

     RCMD_USER          The default username used to connect to remote
                        machines.

     FANOUT             When set, limits the number of commands sent
                        concurrently.  This can be used to keep from
                        overloading a small host when sending out commands in
                        parallel.  Defaults to 64.  This environment setting can
                        be overridden by the -f option.
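
     The fanout rules (the -f option overrides FANOUT, which otherwise
     defaults to 64) and the resulting batching can be sketched in Python as
     follows.  This is an illustration of the documented behavior, not dsh's
     own code; resolve_fanout and fanout_batches are hypothetical names.

```python
import os


def resolve_fanout(option_value=None, environ=None):
    """Pick the fanout: the -f value if given, else FANOUT, else 64."""
    environ = os.environ if environ is None else environ
    if option_value is not None:
        return int(option_value)
    return int(environ.get("FANOUT", 64))


def fanout_batches(nodes, fanout=64):
    """Split nodes into consecutive batches of at most `fanout` nodes."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    return [nodes[i:i + fanout] for i in range(0, len(nodes), fanout)]


# An 80-node cluster with a fanout of 64 yields one batch of 64 and one of 16.
nodes = [f"node{n:02d}" for n in range(80)]
batches = fanout_batches(nodes, resolve_fanout(option_value=64))
print([len(batch) for batch in batches])  # [64, 16]
```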

FILES
     The file pointed to by the CLUSTER environment variable has the following
     format:

           pollux
           castor
           GROUP:alpha
           rigel
           kent
           GROUP:sparc
           alshain
           altair
           LUMP:alphasparc
           alpha
           sparc

     In this example, pollux and castor are members of no group, rigel and
     kent are members of group ‘alpha’, and alshain and altair are members of
     group ‘sparc’.  Note the format of the GROUP command: it is in all capital
     letters, followed by a colon and the group name.  There can be no spaces
     following the GROUP command, or in the name of the group.

     There is also a LUMP command, which is identical in syntax to the GROUP
     command.  This command allows you to create a named group of groups.  Each
     member of the lump is the name of a group.  The LUMP command is terminated
     by another LUMP or GROUP command, or the EOF marker.

     Any line beginning with a ‘#’ symbol is a comment, and the entire line
     will be ignored.  Note that a hash mark placed anywhere other than the
     first character of a line will be considered part of a valid hostname or
     command.
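
     The file format above can be read with a small line-oriented parser.
     The Python sketch below follows the rules as described here and is not
     dsh's actual parser; parse_cluster is a hypothetical name, and skipping
     blank lines and trimming surrounding whitespace are assumptions.

```python
EXAMPLE = """\
pollux
castor
GROUP:alpha
rigel
kent
GROUP:sparc
alshain
altair
LUMP:alphasparc
alpha
sparc
"""


def parse_cluster(text):
    """Return (nodes, lumps) parsed from cluster-file text.

    nodes is a list of (hostname, group-or-None) pairs in file order.
    lumps maps each lump name to the list of group names it contains.
    """
    nodes, lumps = [], {}
    group = lump = None
    for raw in text.splitlines():
        if raw.startswith("#"):       # comment only when '#' is the first character
            continue
        line = raw.strip()            # assumption: surrounding whitespace is trimmed
        if not line:                  # assumption: blank lines are skipped
            continue
        if line.startswith("GROUP:"):
            group, lump = line[len("GROUP:"):], None   # a GROUP ends any open LUMP
        elif line.startswith("LUMP:"):
            lump, group = line[len("LUMP:"):], None
            lumps[lump] = []
        elif lump is not None:
            lumps[lump].append(line)  # inside a LUMP, each entry is a group name
        else:
            nodes.append((line, group))
    return nodes, lumps


nodes, lumps = parse_cluster(EXAMPLE)
print(nodes)
print(lumps)
```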

EXAMPLES
     The command:

           dsh hostname

     will display:

           pollux: pollux
           castor: castor

     if the file pointed to by CLUSTER contains:

           pollux
           castor

     The command:

           dsh -w hadar,rigel hostname

     will display:

           hadar:  hadar
           rigel:  rigel

     The command:

           dsh -w hadar,rigel -s /bin/date

     will copy /bin/date to /tmp/dsh.$$ on hadar and rigel and execute it on
     each node, displaying the date and time on each remote machine, assuming
     that the /bin/date you copied is a valid binary for the remote end.

DIAGNOSTICS
     Exit status is 0 on success, 1 if an error occurs.

SEE ALSO
     dshbak(1), pcp(1), pdf(1), prm(1), rsh(1), tar(1), kerberos(3),
     hosts.equiv(5), rhosts(5)

HISTORY
     The dsh command appeared in clusterit 1.0. It is based on the dsh command
     in IBM PSSP.

AUTHOR
     Dsh was written by Tim Rightnour.

BUGS
     Solaris 2.5.1 has a maximum of 256 open file descriptors.  This means that
     dsh will fail with a fanout size greater than about 32 to 40 nodes.

                                 January 9, 2007