Commit 06cf5395 authored by Paul Rich

Merge branch 'develop' into 178-lock-split-uns

parents d658914d b7c39487
......@@ -23,13 +23,19 @@ If a configuration value definition is mandatory, that will be noted.
.SS General Sections
.SS "[components]"
.TP
.B python
The path to the python interpreter to use. If omitted, the default is
.I /usr/bin/python
.TP
.B service-location
The url:port of the service-locator component (slp). The default port is 8256.
This must be specified for a given Cobalt instance
.TP
.B python
The path to the python interpreter to use. If omitted, the default is
.I /usr/bin/python
.B sleeptime
This sets the base sleeptime for all components for automatic methods. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is 0.01 sec.
.SS "[communication]"
SSL configuration for Cobalt. These must be specified per-install
......@@ -261,6 +267,12 @@ Location of file for site-defined utility functions.
If true, send messages to CobaltDB, or cache the messages that would be sent
if the CobaltDB writer is currently unavailable for later writing. The default
is false
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS "[cqm]"
These are options for the queue-manager component, cqm. Cqm handles queueing
......@@ -404,6 +416,12 @@ situations where there is an ongoing cleanup or other system issue where
a job may be running long. This only affects display of Est_Start_Time in
.BR qstat (1)'s
display. The default is 300.0 seconds.
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS "[cdbwriter]"
.TP
......@@ -430,6 +448,12 @@ Cobalt instances.
The minimum interval that Cobalt will wait between saving statefiles for this
component, in seconds. By default the interval is 10.0 seconds. Under periods
of high load on the component, the interval between statefiles may be longer.
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS Cluster System Sections
.SS "[cluster_system]"
......@@ -509,6 +533,12 @@ draining is permitted for improved utilization of cluster resources.
.B minimum_backfill_window
This is the minimum amount of backfill time to set for a set of resources that
being cleaned by post-job epilogue scripts. The default is 300 seconds.
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS BlueGene/P Sections
......@@ -542,6 +572,12 @@ The type of BlueGene being run on. For BlueGene/Q this should be set to 'bgp'.
Enables an extra function to place the system component under high-communication
stress for race-condition debugging and fault-handling testing if True. This is
False by default. This applies only to the brooklyn system simulation environment.
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS BlueGene/Q Sections
.SS "[bgpm]"
......@@ -622,6 +658,12 @@ command. The default is 300 seconds.
.TP
.B bgtype
The type of BlueGene being run on. For BlueGene/Q this should be set to 'bgq'.
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS "CRAY SECTIONS"
.SS "[alps]"
......@@ -664,6 +706,12 @@ seconds. The default is 300 seconds.
.B update_thread_timeout
The polling interval for state updates from ALPS in seconds. The default is
10 seconds.
.TP
.B sleeptime
This sets the base sleeptime for automatic methods in this component. This
is the floor for all components. This may be overridden with the same option
in each component section. The time is a floating point value in seconds.
The default interval is the [components] section sleeptime.
.SS [capmc]
.TP
.B path
......
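The man page hunks above describe a two-level lookup: the `[components]` sleeptime is the base value, and a `sleeptime` option in a component's own section replaces it for that component. A minimal sketch of that precedence follows. It uses Python's standard `configparser` instead of Cobalt's `get_config_option`, and the `resolve_sleeptime` helper and the sample values are illustrative assumptions, not part of this commit.

```python
import configparser

# Hypothetical configuration: the section names come from the man page,
# the values are made up for illustration.
SAMPLE_CONFIG = """
[components]
sleeptime = 0.05

[cqm]
sleeptime = 0.5

[cdbwriter]
"""

DEFAULT_SLEEPTIME = 0.01  # default documented for the [components] section


def resolve_sleeptime(parser, component):
    """Return the component's own sleeptime, else the [components] value,
    else the documented default."""
    base = parser.getfloat("components", "sleeptime", fallback=DEFAULT_SLEEPTIME)
    # fallback also covers a component section that is missing entirely
    return parser.getfloat(component, "sleeptime", fallback=base)


parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)

cqm_sleeptime = resolve_sleeptime(parser, "cqm")              # overridden in [cqm]
cdbwriter_sleeptime = resolve_sleeptime(parser, "cdbwriter")  # inherits [components]
alps_sleeptime = resolve_sleeptime(parser, "alps")            # no [alps] section here
```

In this sketch only `cqm` gets its own interval. The other two components fall back to the `[components]` value, which matches the "default interval is the [components] section sleeptime" wording in the per-component entries.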
......@@ -52,10 +52,29 @@ def state_file_location():
    '''
    return os.path.expandvars(get_config_option('statefiles', "location", "/var/spool/cobalt"))
def get_sleeptime(my_name):
    '''Return what sleeptime should be for the base level automatic threads loop.
    Checks the components section first, then the named component section.
    Args:
        my_name: the string identifier for this component
    Returns:
        A floating point time interval in seconds. The default is 0.01 sec.
    Note:
        Any automatic timing value lower than this value will be effectively overridden by this value.
    '''
    sleeptime = float(get_config_option("components", "sleeptime", 0.01))
    comp_sleeptime = get_config_option(my_name, "sleeptime", None)
    if comp_sleeptime is not None:
        return float(comp_sleeptime)
    return sleeptime
def run_component (component_cls, argv=None, register=True, state_name=False,
cls_kwargs={}, extra_getopt='', time_out=10.0,
single_threaded=False, seq_num=0, aug_comp_name=False,
- state_name_match_component=False, sleeptime=10.0):
+ state_name_match_component=False):
'''Run the specified Cobalt component until recieving signal to terminate.
Args::
......@@ -126,6 +145,7 @@ def run_component (component_cls, argv=None, register=True, state_name=False,
logging.getLogger().setLevel(level)
Cobalt.Logging.setup_logging(component_cls.implementation, console_timestamp=True)
if daemon:
child_pid = os.fork()
if child_pid != 0:
......@@ -170,10 +190,13 @@ def run_component (component_cls, argv=None, register=True, state_name=False,
certpath = None
capath = None
sleeptime = get_sleeptime(component_cls.implementation)
component.logger.info("Component sleep interval set to %s", sleeptime)
if single_threaded:
# sleeptime is not used due to differences in api.
- server = BaseXMLRPCServer(location, keyfile=keypath, certfile=certpath,
- cafile=capath, register=register, timeout=time_out)
+ server = BaseXMLRPCServer(location, keyfile=keypath, certfile=certpath,
+ cafile=capath, register=register, timeout=time_out, sleeptime=sleeptime)
else:
server = XMLRPCServer(location, keyfile=keypath, certfile=certpath,
cafile=capath, register=register, timeout=time_out, sleeptime=sleeptime)
......
......@@ -314,7 +314,8 @@ class BaseXMLRPCServer (SSLServer, CobaltXMLRPCDispatcher, object):
keyfile=None, certfile=None,
timeout=10,
logRequests=False,
- register=True, allow_none=True, encoding=None, cafile=None):
+ register=True, allow_none=True, encoding=None, cafile=None,
+ sleeptime=0.01):
"""Initialize the XML-RPC server.
......@@ -347,6 +348,7 @@ class BaseXMLRPCServer (SSLServer, CobaltXMLRPCDispatcher, object):
self.register_function(self.ping)
self.logger.info("service available at %s" % self.url)
self.timeout = timeout
self.sleeptime=sleeptime
def register_instance (self, instance, *args, **kwargs):
......@@ -410,7 +412,7 @@ class BaseXMLRPCServer (SSLServer, CobaltXMLRPCDispatcher, object):
except:
self.logger.error("Unexpected task failure", exc_info=1)
# this causes delays such as in control-c
- Cobalt.Util.sleep(self.timeout)
+ Cobalt.Util.sleep(self.sleeptime)
except:
self.logger.error("tasks_thread failed", exc_info=1)
......@@ -497,8 +499,7 @@ class XMLRPCServer (SocketServer.ThreadingMixIn, BaseXMLRPCServer):
BaseXMLRPCServer.__init__(self, server_address, RequestHandlerClass, keyfile,
certfile, timeout, logRequests, register, allow_none, encoding, cafile=cafile,
- )
- self.sleeptime=sleeptime
+ sleeptime=sleeptime)
self.task_thread = threading.Thread(target=self._tasks_thread)
#FIXME: this will fail if a get is called before self._register is defined
......
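The `get_sleeptime` docstring notes that any automatic timing value lower than the sleeptime is effectively overridden by it, because the task loop only wakes once per sleep interval. A small simulation of that floor effect is sketched below. It is not Cobalt's `_tasks_thread`: the function name, the millisecond units, and the simulated clock (used in place of a real sleep) are all assumptions made for illustration.

```python
def count_task_runs(period_ms, sleeptime_ms, duration_ms):
    """Simulate a polling loop and count how often a periodic task fires.

    The loop wakes every sleeptime_ms. On each wake-up it runs the task if at
    least period_ms has elapsed since the task last ran (or if it never ran).
    """
    runs = 0
    last_run = None
    now = 0
    while now < duration_ms:
        if last_run is None or now - last_run >= period_ms:
            runs += 1
            last_run = now
        now += sleeptime_ms  # stands in for the sleep at the bottom of the loop
    return runs


# A task that asks to run every 10 ms cannot run more often than the loop wakes.
fast_task_runs = count_task_runs(period_ms=10, sleeptime_ms=100, duration_ms=1000)

# A task with a period longer than the sleeptime keeps its own period.
slow_task_runs = count_task_runs(period_ms=300, sleeptime_ms=100, duration_ms=1000)
```

With a 100 ms sleeptime over one simulated second, the 10 ms task runs once per wake-up (10 times) rather than 100 times, while the 300 ms task runs 4 times. This is why the man page calls the sleeptime a floor.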