- 09 Jan, 2019 1 commit
-
-
Valentin Reis authored
the gitlab-ci.yml file now points to argotest/gitlab/basic.yml on master.
-
- 04 Jan, 2019 2 commits
-
-
Valentin Reis authored
-
Valentin Reis authored
This includes renaming "progress" to "performance" in argo_perf_wrapper. There are two distinct keywords in the messaging layer: "performance" for all things related to hardware, and "progress" for all things related to the application.
-
- 21 Dec, 2018 3 commits
-
-
Swann Perarnau authored
Doesn't work with the new downstream API.
-
Swann Perarnau authored
Replace the downstream API handling with the new messaging layer. Note that we don't have a clean way to deal with dynamic concurrency control using this API, so we disable the handling of it for now.
-
Swann Perarnau authored
Add downstream RPC client/server classes that are the same as the upstream ones. This is part of a series of changes to downstream to allow for more reliable communications between the daemon and applications. At this time, the daemon never replies, so the RPC_REQ is basically used as a way to publish events to the daemon.
-
- 18 Dec, 2018 1 commit
-
-
Swann Perarnau authored
Add a config option to specify the location of the PMPI LD_PRELOAD library available in libnrm. This should make it easier to use this library.
-
- 12 Dec, 2018 2 commits
-
-
Valentin Reis authored
Fix a bug introduced by the 'progress-report' branch in a recent commit. The process object is the result of a tornado spawn, so the call has to be slightly different from what was there.
-
Valentin Reis authored
This commit adds a command-line interface to `daemon`:

```
usage: daemon [-h] [-c FILE] [-d] [--nrm_log NRM_LOG] [--hwloc HWLOC]
              [--argo_nodeos_config ARGO_NODEOS_CONFIG] [--perf PERF]
              [--argo_perf_wrapper ARGO_PERF_WRAPPER]

optional arguments:
  -h, --help            show this help message and exit
  -c FILE, --configuration FILE
                        Specify a config json-formatted config file to
                        override any of the available CLI options. If an
                        option is actually provided on the command-line, it
                        overrides its corresponding value from the
                        configuration file.
  -d, --print_defaults  Print the default configuration file.
  --nrm_log NRM_LOG     Main log file. Override default with the NRM_LOG.
                        environment variable
  --hwloc HWLOC         Path to the hwloc to use. This path can be relative
                        and makes uses of the $PATH if necessary. Override
                        default with the HWLOC environment variable.
  --argo_nodeos_config ARGO_NODEOS_CONFIG
                        Path to the argo_nodeos_config to use. This path can
                        be relative and makes uses of the $PATH if necessary.
                        Override default with the ARGO_NODEOS_CONFIG
                        environment variable.
  --perf PERF           Path to the linux perf tool to use. This path can be
                        relative and makes uses of the $PATH if necessary.
                        Override default with the PERF environment variable.
  --argo_perf_wrapper ARGO_PERF_WRAPPER
                        Path to the linux perf tool to use. This path can be
                        relative and makes uses of the $PATH if necessary.
                        Override default with the PERFWRAPPER environment
                        variable.
```
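The precedence described in the help text can be sketched with `argparse` as follows. This is a minimal illustration, not the daemon's actual code: the function name `resolve_options`, the placeholder default values, and the relative order of configuration file versus environment variable (file wins here) are all assumptions. Only the rule that a command-line option overrides the configuration file comes from the help text.

```python
import argparse
import json
import os

# Placeholder defaults for illustration only; not the daemon's real values.
BUILTIN_DEFAULTS = {"nrm_log": "/tmp/nrm.log", "hwloc": "hwloc", "perf": "perf"}
# Environment variable names taken from the help text.
ENV_NAMES = {"nrm_log": "NRM_LOG", "hwloc": "HWLOC", "perf": "PERF"}


def resolve_options(argv, environ=None):
    """Hypothetical helper: merge defaults, environment, config file, and CLI."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="daemon")
    parser.add_argument("-c", "--configuration", metavar="FILE")
    for name in BUILTIN_DEFAULTS:
        # default=None lets us tell "not given on the CLI" apart from a value.
        parser.add_argument("--" + name, default=None)
    args = parser.parse_args(argv)

    # Lowest priority: built-in defaults, overridden by environment variables.
    options = dict(BUILTIN_DEFAULTS)
    for name, env_name in ENV_NAMES.items():
        if env_name in environ:
            options[name] = environ[env_name]

    # Next (assumed order): values from the JSON configuration file, if given.
    if args.configuration:
        with open(args.configuration) as f:
            options.update(json.load(f))

    # Highest priority: options actually provided on the command line.
    for name in BUILTIN_DEFAULTS:
        value = getattr(args, name)
        if value is not None:
            options[name] = value
    return options
```

Parsing with `default=None` and merging afterwards is what makes it possible to distinguish an option the user typed from one that merely has a default.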
-
- 10 Dec, 2018 2 commits
-
-
Valentin Reis authored
  - added correct SIGINT/process ending handling to cmd
  - fixed kill/list containers
  - added ZMQ_LINGER 0 to the socket options.
-
Valentin Reis authored
Related to #22
-
- 28 Nov, 2018 5 commits
-
-
Swann Perarnau authored
Make the daemon delete a container when all the commands it is aware of have finished, instead of relying on a single owner that needs to be tracked. This simplifies the handling of multiple commands in the same container, and should not impact the rest.
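A minimal sketch of this bookkeeping, assuming the daemon only needs to know which commands are still running in each container. The class and method names are hypothetical and do not come from the NRM code.

```python
class ContainerTracker:
    """Hypothetical tracker: a container lives as long as it has running commands."""

    def __init__(self):
        # Maps a container id to the set of command ids still running in it.
        self._commands = {}

    def command_started(self, container_id, command_id):
        self._commands.setdefault(container_id, set()).add(command_id)

    def command_finished(self, container_id, command_id):
        """Record the exit; return True when the container should be deleted."""
        running = self._commands.get(container_id)
        if running is None:
            return False  # unknown container, nothing to delete
        running.discard(command_id)
        if running:
            return False  # other commands are still using the container
        del self._commands[container_id]
        return True
```

With this shape there is no owner to track: whichever command finishes last triggers the deletion.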
-
Swann Perarnau authored
Move the container start/exit events to the upstream pub/sub event stream. As these are more of a global event now that we support multiple commands in the same container, it makes sense to move them to the more general event stream. This patch also removes the code in cmd that waits for container start or exit, temporarily making the cmd unable to report power metrics. We will fix that in a later commit. This patch fixes complicated issues we had where a job running multiple commands in the container might not have all of them wait for the end of the container: now none of them do.
-
Swann Perarnau authored
Add an upstream pub client, to be able to listen to messages coming from the daemon on the upstream pub/sub channel. Doesn't support any fancy filter, as that's not used by the daemon so far.
-
Swann Perarnau authored
Ensure that the client that created the container is considered as the one owning it, with the consequence that if its command exits, the container is destroyed. Also deals with the race issue we had on the cmd side.
-
Swann Perarnau authored
Current code sends start/exit events when a container is created and process_start/process_exit when it's already there. Instead, have the container start/exit only care about container stuff, and always send the process start/exit events around. That makes the cmd run fsm easier to work out. Changes the message format a tiny bit. Fixes some missing stdout/stderr issues we had before.
-
- 23 Oct, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Handles containers with no power profiling enabled in the manifest file. In such cases, the 'exit' response on process termination would generate a TypeError.
-
- 21 Oct, 2018 2 commits
-
-
Swann Perarnau authored
Replace the fragile upstream communications with the new messaging layer, improving the stability and performance of this API.
NOTE: this breaks previous clients.
NOTE: this patch is missing client tracking, needed to handle child signals.
-
Swann Perarnau authored
Abstracts away the exact wire format and client/server details, while changing the RPC side to work over ROUTER/DEALER sockets, so as to avoid the lost-message issues we've been having with PUB/SUB for RPC.
-
- 17 Oct, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Added multi-node and multi-process support that allows launching multiple processes within a container. This is important for enabling use of NRM with MPI applications that run multiple processes in a container, and thus for enabling multi-node executions. See Issue #17.
-
- 15 Aug, 2018 2 commits
-
-
Sridutt Bhalachandra authored
Added power profile data to the exit event response at the end of an application run. The profile data is generated using functions from `SensorManager` and is obtained using the sensor update in `Daemon`. See Issue #12.
-
Sridutt Bhalachandra authored
Made changes in `SensorManager` to allow calculation of the difference in measured (stored) values using `rapl_reader` functions. See Issue #12.
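The idea of differencing two stored measurements can be sketched as below. This is an illustration under stated assumptions, not the `SensorManager` or `rapl_reader` code: the snapshot layout (a `time` in seconds and a per-domain `energy` dictionary in joules), the function name `profile_between`, and the derived average power are all hypothetical.

```python
def profile_between(start, end):
    """Hypothetical helper: per-domain energy difference between two snapshots.

    Each snapshot is assumed to look like
    {"time": seconds, "energy": {domain_name: joules}}.
    """
    elapsed = end["time"] - start["time"]
    # Difference of the stored values, domain by domain.
    energy = {
        domain: end["energy"][domain] - start["energy"][domain]
        for domain in start["energy"]
        if domain in end["energy"]
    }
    # Average power over the interval; guarded against a zero-length interval.
    avg_power = {
        domain: (joules / elapsed if elapsed > 0 else 0.0)
        for domain, joules in energy.items()
    }
    return {"elapsed": elapsed, "energy": energy, "avg_power": avg_power}
```

A real reader would also need to account for hardware counter wraparound, which this sketch ignores.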
-
- 14 Aug, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Added a profiling parameter to the manifest file that can be used to specify whether a power profile is required for a container/application. Also changed the namedtuple "Container" key from "powerpolicy" to "power" and made changes to reflect this. See Issue #12.
-
- 10 Aug, 2018 5 commits
-
-
Sridutt Bhalachandra authored
Made changes in NRM to respond to the phase_context event [previously called the power_policy event (874a6a4d)] from the application. The NRM can now store the information received with the event and call the DDCM power policy in the control loop through the interfaces developed in Issue #11. See Issue #10.
-
Sridutt Bhalachandra authored
Fixed the C downstream_api messages sent by the application to NRM to be more generic. These messages can be used to invoke power policies or any other policies relying on contextual information from an application phase. See Issue #10.
-
Sridutt Bhalachandra authored
Added initialization of the power policy using the manifest file and fixed the relevant TODOs in the `PowerPolicyManager` (PPM) class using container information (except adding new power policies). Also fixed PPM to use None instead of NONE. See Issue #10.
-
Sridutt Bhalachandra authored
The container creation information (resources and powerpolicy) added to the `ContainerManager` class can be used to initialize power policy parameters now, and more in the future. See Issue #10.
-
Sridutt Bhalachandra authored
Added damper and slowdown parameters to the manifest file that can be used to initialize power policy parameters. See Issue #10.
-
- 09 Aug, 2018 1 commit
-
-
Kamil Iskra authored
When invoking 'argo_nodeos_config run', we were passing the job environment implicitly. This wasn't very clean and was also causing problems with variables such as LD_PRELOAD, which were being filtered out because argo_nodeos_config is suid root.
-
- 25 Jul, 2018 2 commits
-
-
Sridutt Bhalachandra authored
The Power Policy Manager will allow invocation of all power policies by the NRM. See Issue #11.
-
Sridutt Bhalachandra authored
The DDCM-based power policy aims to mitigate workload imbalance in parallel applications that use barrier synchronizations (e.g., MPI). It reduces the duty cycle of CPUs not on the critical path of execution, thereby reducing energy with little or no adverse impact on performance. See Issue #11.
-
- 19 Jul, 2018 2 commits
-
-
Sridutt Bhalachandra authored
Added a Duty Cycle module to set, reset, and check the duty cycle of a CPU. This module makes use of the MSR module. See Issue #11.
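As a rough illustration of what setting, resetting, and checking a duty cycle involves, the sketch below encodes and decodes a value for the `IA32_CLOCK_MODULATION` register as described in the Intel Software Developer's Manual (enable in bit 4, duty-cycle level in bits 3:1, in steps of 12.5%). The function names are hypothetical, the extended 6.25% encoding is ignored, and nothing here is taken from the module added by this commit. Actually reading or writing the register would go through the MSR module, which is not shown.

```python
IA32_CLOCK_MODULATION = 0x19A  # MSR address, per the Intel SDM
ENABLE_BIT = 1 << 4            # on-demand clock modulation enable


def encode_duty_cycle(level):
    """Return the register value requesting a duty cycle of level * 12.5%.

    Only levels 1..7 are accepted in this sketch (level 0 is reserved).
    """
    if not 1 <= level <= 7:
        raise ValueError("duty cycle level must be between 1 and 7")
    return ENABLE_BIT | (level << 1)


def reset_value():
    """Register value that disables clock modulation (full duty cycle)."""
    return 0


def decode_duty_cycle(value):
    """Return the duty cycle in percent encoded by a register value."""
    if not value & ENABLE_BIT:
        return 100.0  # modulation disabled
    level = (value >> 1) & 0x7
    return level * 12.5
```

For example, `encode_duty_cycle(4)` yields `0x18`, which `decode_duty_cycle` reads back as 50%.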
-
Sridutt Bhalachandra authored
Added a Model Specific Register (MSR) module to allow access to MSRs. This module provides the interfaces to read and write MSRs through the msr_safe kernel module. See Issue #11.
-
- 17 Jul, 2018 2 commits
-
-
Sridutt Bhalachandra authored
When the powerpolicy manifest option is enabled and the policy parameter is set to any valid value (except "NONE"), the application library providing contextual information is loaded using LD_PRELOAD. See Issue #10.
-
Sridutt Bhalachandra authored
-
- 16 Jul, 2018 1 commit
-
-
Sridutt Bhalachandra authored
-
- 03 Jul, 2018 1 commit
-
-
Swann Perarnau authored
Trivial style corrections.
-
- 21 Dec, 2017 1 commit
-
-
Swann Perarnau authored
Small fixes to correct for wrong actions on power control over a real load.
-
- 20 Dec, 2017 2 commits
-
-
Swann Perarnau authored
Change the PowerActuator to be able to lower the power limit. Because RAPL doesn't provide an actual lower limit, we use 0 as the minimum power.
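The bound handling described here amounts to clamping a requested limit between 0 and the maximum. A minimal sketch, with a hypothetical function name and signature that are not taken from the PowerActuator code:

```python
def next_power_limit(current, delta, maximum):
    """Hypothetical helper: apply a change to the power limit, clamped to [0, maximum].

    0 is used as the floor because, per the commit message, RAPL does not
    provide an actual lower limit.
    """
    return max(0.0, min(maximum, current + delta))
```

A large negative `delta` therefore bottoms out at 0 instead of producing a negative limit.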
-
Swann Perarnau authored
This patch adds a PowerActuator based on the RAPL settings available through the sensor manager. Adding this actuator forces us to use a list of actuators in the controller, which slightly changes the structure of the code.
-