DESCRIPTION
The Cobalt configuration file is an "ini-style" configuration file. This configuration file has sections for all Cobalt components and clients in a given instance. The general format of a section is:
[section]
Values that are lists are ":"-delimited. If a key is defined multiple times in a section, the value of the last definition in the section is used. Comments may be made in a file by beginning a line with a '#'. Comments must not be inline with key-value pairs. The sections that follow describe the various configuration sections and their options.
If a configuration value definition is mandatory, that will be noted.
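The parsing rules above can be tried out with a short sketch. This is not Cobalt's own parser: it assumes Python 3's standard configparser module in non-strict mode, which follows the same last-definition-wins rule. The "key = value" layout, the use of the filters key as the sample, and the script paths are assumptions made only for illustration.

```python
import configparser

# Hypothetical sample text; the paths do not refer to real scripts.
SAMPLE = """\
# A comment occupies its own line and starts with '#'.
[cqm]
filters = /opt/site/check_a
filters = /opt/site/check_a:/opt/site/check_b
"""


def load(text):
    # strict=False lets a key repeat within a section; the last value wins.
    # Only '=' is treated as the key/value delimiter so ':' stays in the value.
    parser = configparser.ConfigParser(
        strict=False,
        delimiters=("=",),
        comment_prefixes=("#",),
        inline_comment_prefixes=None,
    )
    parser.read_string(text)
    return parser


def get_list(parser, section, key):
    # List values are ':'-delimited.
    raw = parser.get(section, key)
    return [item for item in raw.split(":") if item]


parser = load(SAMPLE)
print(get_list(parser, "cqm", "filters"))
```

Because inline comment prefixes are disabled in this sketch, a '#' that follows a value on the same line would become part of the value, which is one reason to keep comments on their own lines.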
[components]
- service-location
-
The url:port of the service-locator component (slp). The default port is 8256. This must be specified for a given Cobalt instance.
- python
-
The path to the python interpreter to use. If omitted, the default is /usr/bin/python
[communication]
SSL configuration for Cobalt. These must be specified per-install.
- key
-
Key to use for SSL communication. May be generated via openssl(1)
- cert
-
This is a locally stored certificate for authenticating the key.
- ca
-
Certification authority for the
entry. This is typically the same as
- password
-
This must be set in the configuration file. It is a shared secret used by all Cobalt daemons and clients, and is the password required for Cobalt’s internal communication.
[statefiles]
Options for Cobalt’s statefile persistence.
- location
-
Path to where the statefiles are stored.
[system]
Common system configuration settings. These apply to all types of systems.
- elogin_hosts
-
A ':'-delimited list of hostnames of hosts from which users can submit interactive jobs with qsub that must then be run on another node. This is usually due to restrictions in the authentication and authorization mechanisms for the mpirun equivalent on a given system. This is most commonly required for Cray systems using eLogin nodes.
- resource_name
-
This is the resource name for this Cobalt instance for accounting purposes. This is typically the name of the cluster being run. If unspecified the resource name for accounting logs is "NOTSET".
- size
-
Maximum size of a given system in nodes.
[forker]
Common options for Cobalt forker components.
- ignore_setgroup_errors
-
Default false. If set to true, then setuid/setgid failures will not kill jobs. This is necessary for running local instances or subinstances of Cobalt, as well as for running any non-root simulation mode. If this is set and the forker components are not running as root, any job will be run as the user that the forker component is running as.
- pipe_buffsize
-
The size in bytes of the buffer to use for reading from a pipe connected to a child process standard out. Increase this if you are getting particularly large messages from standard out. This is most likely to occur with large Cray systems using the BASIL interface to ALPS. Default 16777216 bytes.
- save_me_interval
-
The minimum interval that Cobalt will wait between saving statefiles for this component, in seconds. By default the interval is 10.0 seconds. Under periods of high load on the component, the interval between statefiles may be longer.
- use_cgroups
-
The default is false. If set to true, cgroup support is enabled, and Cobalt will use cgclassify during script process startup to place Cobalt-initiated scripts into an administrator-specified cgroup. This is generally used if proc connector is disabled on a given system. Cgroup-related options may be set on a per-forker-type basis or on a per-instance basis. This is currently supported for Cray systems, and for site-specified system scripts, such as job and resource pre and postscripts.
- cgroup_failure_fatal
-
The default is false. If cgclassify fails to set a cgroup and cgroup_failure_fatal is set to true, then script startup will fail and the process will exit with a nonzero status.
- cgclassify_path
-
This is the path to the cgclassify executable. By default this is /usr/bin/cgclassify. Use this if cgclassify is in a different location.
- cgclassify_args
-
Arguments to pass to cgclassify. No default arguments are provided by Cobalt. See cgclassify(1) for information on options to cgclassify.
[forker.system]
This applies cgroup options to the system_script_forker. Any options not specified here will default to the values set by the general forker section. These options will affect any auxiliary scripts that Cobalt runs from the system or queue-manager components. These options will not be applied to any user-provided scripts.
- use_cgroups
-
If true, cgclassify support is enabled.
- cgroup_failure_fatal
-
If cgclassify fails to set a cgroup and cgroup_failure_fatal is set to true, then script startup will fail and the process will exit with a nonzero status.
- cgclassify_path
-
This is the path to the cgclassify executable. By default this is /usr/bin/cgclassify. Use this if cgclassify is in a different location.
- cgclassify_args
-
Arguments to pass to cgclassify. No default arguments are provided by Cobalt. See cgclassify(1) for information on options to cgclassify.
[forker.alps]
This applies cgroup options to all alps_script_forker instances. Any options not specified here will default to the values set by the general forker section. These options will be applied to all alps_script_forkers that are not individually overridden. These options will affect all user-run jobs on a Cray system.
- use_cgroups
-
If true, cgclassify support is enabled.
- cgroup_failure_fatal
-
If cgclassify fails to set a cgroup and cgroup_failure_fatal is set to true, then script startup will fail and the process will exit with a nonzero status.
- cgclassify_path
-
This is the path to the cgclassify executable. By default this is /usr/bin/cgclassify. Use this if cgclassify is in a different location.
- cgclassify_args
-
Arguments to pass to cgclassify. No default arguments are provided by Cobalt. See cgclassify(1) for information on options to cgclassify.
[forker.<alps_script_forker_instance_name>]
Applies these configuration options to an individual forker instance. If these are not defined then the values used or passed along by the "[forker.alps]" section will be used.
- use_cgroups
-
If true, cgclassify support is enabled.
- cgroup_failure_fatal
-
If true and cgclassify fails to set a cgroup, then script startup will fail and the process will exit with a nonzero status.
- cgclassify_path
-
This is the path to the cgclassify executable. By default this is /usr/bin/cgclassify. Use this if cgclassify is in a different location.
- cgclassify_args
-
Arguments to pass to cgclassify. No default arguments are provided by Cobalt. See cgclassify(1) for information on options to cgclassify.
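The fallback order described for the forker sections (an individual instance section, then "[forker.alps]", then the general "[forker]" section) can be sketched as a lookup chain. This is a minimal illustration under stated assumptions, not Cobalt's implementation: the function name, the dictionary representation of the configuration, and the instance name alps_script_forker_example_0 are all hypothetical.

```python
def resolve_forker_option(config, instance, key, default=None):
    """Return the value of key for one alps_script_forker instance.

    Sections are searched from most specific to least specific, and the
    first section that defines the key supplies the value.
    """
    for section in ("forker." + instance, "forker.alps", "forker"):
        options = config.get(section, {})
        if key in options:
            return options[key]
    return default


# Hypothetical configuration, already parsed into nested dictionaries.
config = {
    "forker": {
        "use_cgroups": "false",
        "cgclassify_path": "/usr/bin/cgclassify",
    },
    "forker.alps": {"use_cgroups": "true"},
    "forker.alps_script_forker_example_0": {"cgroup_failure_fatal": "true"},
}

name = "alps_script_forker_example_0"
print(resolve_forker_option(config, name, "cgroup_failure_fatal"))
print(resolve_forker_option(config, name, "use_cgroups"))
print(resolve_forker_option(config, name, "cgclassify_path"))
```

In this sketch the instance section supplies cgroup_failure_fatal, "[forker.alps]" supplies use_cgroups, and the general section supplies cgclassify_path.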
[logger]
This section handles cobalt component logging and default levels. Valid logging levels in this section are
and
- to_syslog
-
If true, send logging data to the syslog daemon.
- syslog_level
-
Only send messages to syslog at this level or higher. The default level is INFO
- syslog_location
-
Location of logfile
- syslog_facility
-
Logger facility to send logs to. The default is local0
- to_console
-
Send logging data to console or stdout/stderr as appropriate. This defaults to true.
- console_level
-
Only send messages to the console at this level or higher. The default level is INFO
[bgsched]
- default_reservation_policy
-
If set, this is the score accrual policy that will be used on reservation queues. The default policy is "default" (fifo).
- db_flush_interval
-
The minimum interval, in seconds, between sending messages to the database component. This only takes effect if use_db_logging is set to true. The default interval is 10 seconds.
- log_dir
-
The directory in which to place reservation accounting logs.
- overflow_file
-
This is a file location to use for holding database messages should use_db_logging be set to true, but the CobaltDB writer component is unavailable for an extended period of time. If this file is present, then on cdbwriter startup, messages from this file will be pushed to the component and added to the database, followed by in-memory pending messages.
- max_queued_messages
-
This is the number of messages to keep in memory before flushing to the overflow file. If set to -1, the component will never flush to the overflow file. If this is not set, then the overflow file will not be used.
- save_me_interval
-
The minimum interval that Cobalt will wait between saving statefiles for this component, in seconds. By default the interval is 10.0 seconds. Under periods of high load on the component, the interval between statefiles may be longer.
- schedule_jobs_interval
-
This is the minimum interval between iterations of the scheduling loop. The default time is 10 seconds.
- utility_file
-
Location of file for site-defined utility functions.
- use_db_logging
-
If true, send messages to CobaltDB, or cache the messages that would be sent if the CobaltDB writer is currently unavailable for later writing. The default is false
[cqm]
These are options for the queue-manager component, cqm. Cqm handles queueing and overall job tracking operations.
- filters
-
A colon-delimited list of paths to scripts to run. These are run by the clients that work with cqm(8), specifically qsub(1), qalter(1), and qmove(1). The scripts are invoked from the clients and must return an exit status of 0 before the job or job modification is passed into cqm. These are intended as site-specific validation scripts. Scripts receive job parameters as key=value pairs as arguments, and any key=value pairs written to stdout will modify job parameters accordingly. For instance, a non-default initial score of 500 may be written to stdout as score=500. If a job fails to pass the filter entirely, the script should return a nonzero exit status. A note as to which filter failed should be presented to the user. Note that cqadm(1), as an admin-level command, does not run these filters. Since the filters are invoked as a part of client invocation, any change to this parameter on a running Cobalt instance takes effect immediately, without signaling or restart.
- job_prescripts
-
A colon-delimited list of scripts to run when the job is scheduled, but prior to job invocation. These are run once per job, whether or not it is preempted. Nonzero exit statuses in these scripts are fatal to a job starting up.
- job_postscripts
-
A colon-delimited list of scripts to run after the job has ended. These are run once per job, whether or not it is preempted. Nonzero exit statuses in these scripts have no effect on a job.
- resource_prescripts
-
A colon-delimited list of scripts to run when the job is scheduled, but prior to job invocation. These are run once per task, prior to resuming from preemption. Nonzero exit statuses in these scripts are fatal to a job starting up.
- resource_postscripts
-
A colon-delimited list of scripts to run after the job has ended. These are run after each preemption step. Nonzero exit statuses at the end of a job in these scripts have no effect on a job.
- dep_frac
-
The floating-point fraction of a job’s score that a dependent job inherits. This sets a default value and may be overridden on a per-job basis by the schedctl(1) command. The default is 0.5.
- scale_dep_frac
-
If set to true, the dependency fraction inherited by a job is scaled by the ratio of the size of the dependent job's resources to the size of the resources of the job it is inheriting score from. This only applies to dependent jobs that are smaller than the job they are inheriting from. For instance, a 4-node job depending on an 8-node job would inherit half the score fraction that an 8-node job depending on an 8-node job would inherit. With the default dep_frac of 0.5, that is 0.25 rather than 0.5.
- mailserver
-
The address of the mailserver to use for sending admin emails and requested user emails for startup and termination notification.
- force_kill_delay
-
The length of time, in seconds, to wait between sending a SIGTERM and a SIGKILL to a job. The default is 300 seconds.
- log_dir
-
The directory to place job accounting logs.
- overflow_file
-
This is a file location to use for holding database messages should use_db_logging be set to true, but the CobaltDB writer component is unavailable for an extended period of time. If this file is present, then on cdbwriter startup, messages from this file will be pushed to the component and added to the database, followed by in-memory pending messages.
- max_queued_messages
-
This is the number of messages to keep in memory before flushing to the overflow file. If set to -1, the component will never flush to the overflow file. If this is not set, then the overflow file will not be used.
- save_me_interval
-
The minimum interval that Cobalt will wait between saving statefiles for this component, in seconds. By default the interval is 10.0 seconds. Under periods of high load on the component, the interval between statefiles may be longer.
- utility_file
-
Location of file for site-defined utility functions.
- use_db_logging
-
If true, send messages to CobaltDB, or cache the messages that would be sent if the CobaltDB writer is currently unavailable for later writing. The default is false
- poll_process_groups_interval
-
The interval in seconds between queries to the system component for process group status.
- use_db_jobid_generator
-
If true, use CobaltDB to generate a unique jobid. This may be used to ensure unique jobids across multiple Cobalt instances on related resources. Default false.
- progress_interval
-
The minimum time in seconds between job statemachine steps. Default 10 seconds.
- max_walltime
-
If set, defines a general maximum requested walltime for all queues. May be overridden by setting the MaxWalltime property on a given queue. If this is not set, then there is no default limit on the length of time a user job may request, unless explicitly set as a part of a given queue.
- compute_utility_interval
-
The minimum time in seconds to wait between score calculation iterations. The default is 10 seconds.
- cqstat_header
-
A colon-delimited list of display headers to use in qstat(1)'s default display. A default set of headers will be used if this is not set.
- cqstat_header_full
-
A colon-delimited list of display headers to use with qstat(1)'s -f flag. If not set, a default set of display headers is used. This does not change the -f -l combination for display.
- starttime_estimate_shadow
-
A floating point time to add to the current time for a minimum start time estimate. This will force a minimum start time in the future to handle situations where there is an ongoing cleanup or other system issue where a job may be running long. This only affects display of Est_Start_Time in qstat(1)'s display. The default is 300.0 seconds.
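The contract described for the filters option (job parameters arrive as key=value arguments, key=value pairs written to stdout modify the job, and a nonzero exit status rejects it) can be sketched as a small site filter. This is a minimal sketch under stated assumptions: the parameter names nodecount and queue, the validation rule, and the choice to write rejection messages to stderr are hypothetical and are not taken from Cobalt.

```python
import sys


def apply_filter(args):
    """Return (exit_status, lines) for one filter invocation.

    args holds the key=value strings the client passes as arguments.
    On success, lines are key=value overrides meant for stdout.
    On failure, lines are messages for the user.
    """
    params = dict(arg.split("=", 1) for arg in args if "=" in arg)
    # "nodecount" and "queue" are hypothetical parameter names.
    try:
        nodes = int(params.get("nodecount", "0"))
    except ValueError:
        return 1, ["site filter: nodecount is not an integer"]
    if nodes <= 0:
        return 1, ["site filter: a job must request at least one node"]
    overrides = []
    if params.get("queue") == "debug":
        # Give jobs in the (hypothetical) debug queue a non-default score.
        overrides.append("score=500")
    return 0, overrides


def main(argv):
    status, lines = apply_filter(argv[1:])
    # Assumption: overrides go to stdout, rejection messages to stderr.
    stream = sys.stdout if status == 0 else sys.stderr
    for line in lines:
        print(line, file=stream)
    return status
```

A script installed as a filter would finish by calling sys.exit(main(sys.argv)) so that the exit status reaches the invoking client.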
[cdbwriter]
- log_dir
-
The directory to place cdbwriter message overflow files.
- user
-
The user to connect to DB2 as. It is recommended to use a user identity that only has access to the Cobalt database. This user requires read, write, and update permissions on the Cobalt database.
- pwd
-
This is the password that the user will use to connect to the Cobalt database.
- database
-
The name of the database in DB2 to connect to that contains the Cobalt database.
- schema
-
The name of the DB2 schema where the Cobalt database resides. Multiple schemas may exist in the same database, which is useful for handling multiple, related, Cobalt instances.
- save_me_interval
-
The minimum interval that Cobalt will wait between saving statefiles for this component, in seconds. By default the interval is 10.0 seconds. Under periods of high load on the component, the interval between statefiles may be longer.
[cluster_system]
- simulation_mode
-
Set the cluster_system component to run in a simulation mode. In this mode, the cluster system will not actually run jobs on target nodes in its configuration, but it will instead run the
which will provide statistics on what would have run. Otherwise the system component will track and allocate resources as though it were actually running on a multi-node cluster, with a configuration specified in the
entry if true. This defaults to false.
- simulation_executable
-
Instead of running pre and postscripts, run the specified executable. This must be specified if running in simulation_mode. Output from this script is logged to the cluster_system component’s logs.
- run_remote
-
If set to false, do not attempt to run pre/postscripts on remote resources. The default is true.
- hostfile
-
This is a list of hostnames for nodes that the cluster system component can schedule. Nodes may be added or removed, and the list of available nodes is updated at restart.
- epilogue
-
This is a colon-delimited set of scripts to run on a per-node basis on task termination on a resource. If any script returns a non-zero exit status, the node will be marked down, and no new jobs will be scheduled on that resource.
- epilogue_timeout
-
The amount of time in seconds to wait for each script to complete. If the script has not completed and exited with a status of 0 before this timeout is reached, that node will be marked down.
- prologue
-
Not currently used. Per-node scripts are currently launched as a part of the cqm(8) resource_prologue
- prologue_timeout
-
This is not currently used within the cluster system component
- allocation_timeout
-
This is the time in seconds to wait when resources are allocated, but have not had a job started on them. This usually occurs when a user deletes a job while it is starting up. After this timeout has elapsed the resources will be returned to the pool of available nodes, and a new job may be scheduled on the resources. The default timeout is 300 seconds.
- drain_mode
-
This sets the backfill mode to use and may be one of backfill, drain_only, or first_fit. The first_fit mode will run the highest scored job that can immediately run on resources available. The drain_only mode will run the highest scored job, if sufficient resources are available or it will start draining nodes and then run the job once sufficient resources are available. The backfill mode will run and drain resources as the drain_only mode, but will also attempt to run jobs on the empty, but draining nodes in a score-order first-fit manner. It is recommended that backfill be used if draining is permitted for improved utilization of cluster resources.
- minimum_backfill_window
-
This is the minimum amount of backfill time to set for a set of resources that are being cleaned by post-job epilogue scripts. The default is 300 seconds.
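The first_fit and drain_only selection rules described for drain_mode can be sketched as follows. This illustrates the documented behavior only and is not Cobalt's scheduler code: the function name and the job dictionaries with "name", "score", and "nodes" keys are hypothetical. The backfill mode is omitted because it also depends on drain-time estimates.

```python
def pick_job(jobs, free_nodes, mode):
    """Return the job to start now, or None if nothing should start."""
    ranked = sorted(jobs, key=lambda job: job["score"], reverse=True)
    if not ranked:
        return None
    if mode == "first_fit":
        # Highest-scored job that can run immediately on the free nodes.
        for job in ranked:
            if job["nodes"] <= free_nodes:
                return job
        return None
    if mode == "drain_only":
        # Only the highest-scored job is considered. If it does not fit,
        # nothing starts and nodes drain until it does.
        top = ranked[0]
        return top if top["nodes"] <= free_nodes else None
    raise ValueError("unsupported mode in this sketch: " + mode)


jobs = [
    {"name": "big", "score": 100.0, "nodes": 8},
    {"name": "small", "score": 50.0, "nodes": 2},
]
print(pick_job(jobs, 4, "first_fit"))
print(pick_job(jobs, 4, "drain_only"))
```

With four free nodes, first_fit starts the smaller job even though the larger one has the higher score, while drain_only starts nothing and waits for eight nodes to become free.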
[bgpm]
- mmcs_server_ip
-
The IP address of the BlueGene mmcs_server.
- mpirun
-
The location of the BlueGene mpirun binary. This is typically /bgsys/drivers/ppcfloor/bin/mpirun
[bgsystem]
- kernel
-
If true, allow the use of alternative kernels
- bootprofiles
-
This is a path to the directory that holds the alternate kernel subdirectories. If alternate kernel support is being used, then this must be set. This is the location where symlinks to the current profiles of partitions should be made. Cobalt will autogenerate these symlinks as a part of the boot process on an as-needed basis.
- bgtype
-
The type of BlueGene being run on. For BlueGene/P this should be set to 'bgp'.
- stress_comm_code
-
Enables an extra function to place the system component under high-communication stress for race-condition debugging and fault-handling testing if True. This is False by default. This applies only to the brooklyn system simulation environment.
[bgpm]
- runjob
-
The location of the BlueGene runjob binary. This is typically /bgsys/drivers/ppcfloor/bin/runjob
[bgsystem]
- allow_alternate_kernels
-
If set to true, allow alternate kernels to be run by users using the --kernel or --io_kernel flags to qsub(1). This defaults to false.
- bootprofiles
-
This is a path to the directory that holds the alternate kernel subdirectories. If alternate kernel support is being used, then this must be set. This is the location where symlinks to the current profiles of partitions should be made. Cobalt will autogenerate these symlinks as a part of the boot process on an as-needed basis.
- default_kernel
-
The default compute-node kernel image to use. This name should be a directory found at the path indicated by
This value is set to 'default' by default.
- default_kernel_options
-
A list of options to pass to the default kernel image.
- ion_default_kernel
-
The default IO-node kernel image to use. This name should be a directory found at the path indicated by
This value is set to 'default' by default.
- ion_default_kernel_options
-
A list of options to pass to the default kernel image.
- subblock_prefix
-
This is a location prefix to attach to subblock names. Usually this is the resource’s prefix for the Cobalt instance. The default for subblock use is "COBALT".
- subblock_config
-
Sets a configuration for subblock use. This is a key-value list of the form: "[blockname1:min_size1],[blockname2:min_size2],…". Blocks must be specified in the BlueGene control system. Pseudoblocks will be generated down to the specified minimum size. Valid minimum sizes are 64, 32, 16, 8, 4, 2, 1. Subblock geometries are per IBM’s recommendations in runjob(1) where appropriate. If + is specified then + may also be specified.
- ignore_subblock_sizes
-
A colon-delimited list of sizes to skip when generating pseudoblocks for automatic subblock use.
- terminal_boot_timeout
-
Sets an automatic timeout in seconds for block boots initiated by Cobalt’s boot_block(1) command. The default is 300 seconds.
- bgtype
-
The type of BlueGene being run on. For BlueGene/Q this should be set to 'bgq'.
[alps]
- basil
-
The path to Cray’s apbasil command. The default path is /opt/cray/default/alps/bin/apbasil
- apkill
-
The path to Cray’s apkill command. The default path is /opt/cray/alps/default/bin/apkill
- cray_mom_qsub
-
The path to qsub on the mom (or other alps_script_forker) nodes to use when using interactive qsub from the eLogin hosts on Cray systems. This must be a fully qualified path. The default is /usr/bin/qsub
- default_depth
-
The default processors per node. This should be set to the number of KNL cores on each node for XC40 systems. The default value is 72.
[alpssystem]
- min_ssd_size
-
The size of the smallest SSD available on the system in GB.
- pgroup_startup_timeout
-
The time to allow for process group startup in seconds. The default is 120 seconds.
- save_me_interval
-
The minimum interval that Cobalt will wait between saving statefiles for this component, in seconds. By default the interval is 10.0 seconds. Under periods of high load on the component, the interval between statefiles may be longer.
- temp_reservation_time
-
The default time for the temporary allocation reservation for starting jobs in seconds. The default is 300 seconds.
- update_thread_timeout
-
The polling interval for state updates from ALPS in seconds. The default is 10 seconds.
[capmc]
- path
-
Path to CAPMC command front-end. If unset, the default is /opt/cray/capmc/default/bin/capmc
[system]
- backfill_epsillon
-
Set the amount of time to subtract from the remaining drain window, in seconds, when placing backfill jobs. This allows time for backfill jobs to be cleaned up prior to the exit time of the job causing the drain to occur. The default is 120 seconds.
- cleanup_drain_window
-
Set the draining time, in seconds, for nodes in cleanup statuses. The default time is 300 seconds.
- drain_mode
-
Set the draining algorithm to use. This may be backfill or first-fit. The default is first-fit.
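The effect of backfill_epsillon on backfill placement can be illustrated with a little arithmetic. This is a sketch of the rule as described above, not Cobalt's code: the function names are hypothetical, and treating a job whose walltime exactly equals the shortened window as fitting is an assumption.

```python
def backfill_window(drain_remaining, backfill_epsillon=120):
    """Seconds usable for backfill once the cleanup margin is removed."""
    return max(0, drain_remaining - backfill_epsillon)


def fits_backfill(walltime, drain_remaining, backfill_epsillon=120):
    """True if a job's requested walltime fits in the shortened window."""
    return walltime <= backfill_window(drain_remaining, backfill_epsillon)


# Nodes drain for another 1800 seconds; with the default margin of 120
# seconds, 1680 seconds remain for backfill jobs.
print(backfill_window(1800))
print(fits_backfill(1600, 1800))
print(fits_backfill(1700, 1800))
```

A job requesting 1700 seconds would fit in the raw 1800-second drain window but is rejected here, because it would leave less than the configured margin for its own cleanup.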
FILES
- /etc/cobalt.conf
-
This is the default location for the configuration file used by all Cobalt daemons and clients. Due to the potential for abuse of the
interfaces, access to this file should be carefully controlled. This file does not need to be writable under normal conditions, and needs to be readable only by the user used by Cobalt’s setgid wrappers. By default, this is the
user.