- 14 Dec, 2018 2 commits
-
-
Swann Perarnau authored
Resolve "Bad relative path handling between command line client and daemon" Closes #8 See merge request !38
-
Valentin Reis authored
This is not as good as passing part of the manifest options forward, but it still fixes some of the practical problems when using the components together. The code makes sure that the manifest exists, though.
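A minimal sketch of the kind of client-side fix described here, assuming the client resolves the manifest path against its own working directory before passing it to the daemon. The helper name `resolve_manifest` and its `cwd` parameter are hypothetical and not taken from the NRM code.

```python
import os


def resolve_manifest(path, cwd=None):
    """Return an absolute manifest path, raising if the file is missing.

    Hypothetical helper: the daemon may run in a different working
    directory than the client, so a relative path is resolved on the
    client side before it is sent.
    """
    base = cwd if cwd is not None else os.getcwd()
    # os.path.join keeps `path` unchanged when it is already absolute.
    absolute = os.path.abspath(os.path.join(base, os.path.expanduser(path)))
    if not os.path.isfile(absolute):
        raise FileNotFoundError("manifest not found: " + absolute)
    return absolute
```

Checking for the file on the client side gives the user an immediate error instead of a failure reported later by the daemon.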
-
- 12 Dec, 2018 6 commits
-
-
Valentin Reis authored
[feature] commandline arguments, config file management. See merge request !37
-
Valentin Reis authored
Added integration tests to the `.gitlab-ci.yaml` file. These tests run on any runner tagged "integration". They depend on the 'argotest' repository master branch being able to run the tests, which uses the 'argopkgs' repository. The runner must be Nix-enabled. This therefore currently uses a rolling release of the integration tests.
-
Valentin Reis authored
-
Valentin Reis authored
`cmd` now sends a container kill message to the upstream api and exits whenever it receives SIGINT, via C-c for instance.
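The behaviour described above can be sketched as follows. This is an illustration only: `send_kill` stands in for whatever call sends the kill message to the upstream API, and the exit status 130 (128 + SIGINT) is a shell convention assumed here, not something stated in the commit.

```python
import signal
import sys


def install_sigint_handler(send_kill, container_uuid):
    """Install a SIGINT handler that kills the container, then exits.

    `send_kill` is a hypothetical callable taking the container uuid.
    """
    def handler(signum, frame):
        send_kill(container_uuid)  # ask the daemon to kill the container
        sys.exit(130)              # assumed exit status for an interrupted command

    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGINT, handler)
    return handler
```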
-
Valentin Reis authored
Fixing a bug introduced by the 'progress-report' branch in a recent commit. The process object is the result of a tornado spawn, so the call has to be slightly different from what was there.
-
Valentin Reis authored
This commit adds a command-line interface to `daemon`:
```
usage: daemon [-h] [-c FILE] [-d] [--nrm_log NRM_LOG] [--hwloc HWLOC]
              [--argo_nodeos_config ARGO_NODEOS_CONFIG] [--perf PERF]
              [--argo_perf_wrapper ARGO_PERF_WRAPPER]

optional arguments:
  -h, --help            show this help message and exit
  -c FILE, --configuration FILE
                        Specify a config json-formatted config file to
                        override any of the available CLI options. If an
                        option is actually provided on the command-line, it
                        overrides its corresponding value from the
                        configuration file.
  -d, --print_defaults  Print the default configuration file.
  --nrm_log NRM_LOG     Main log file. Override default with the NRM_LOG.
                        environment variable
  --hwloc HWLOC         Path to the hwloc to use. This path can be relative
                        and makes uses of the $PATH if necessary. Override
                        default with the HWLOC environment variable.
  --argo_nodeos_config ARGO_NODEOS_CONFIG
                        Path to the argo_nodeos_config to use. This path can
                        be relative and makes uses of the $PATH if necessary.
                        Override default with the ARGO_NODEOS_CONFIG
                        environment variable.
  --perf PERF           Path to the linux perf tool to use. This path can be
                        relative and makes uses of the $PATH if necessary.
                        Override default with the PERF environment variable.
  --argo_perf_wrapper ARGO_PERF_WRAPPER
                        Path to the linux perf tool to use. This path can be
                        relative and makes uses of the $PATH if necessary.
                        Override default with the PERFWRAPPER environment
                        variable.
```
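The override rules in the help text can be sketched with `argparse` as below. This is not the daemon's actual code: the `DEFAULTS` values are placeholders, only three of the options are modelled, and the relative order of the environment variables and the configuration file is an assumption (the help text only says that the command line beats the file and that the environment beats the built-in default).

```python
import argparse
import json
import os

# Placeholder defaults for illustration; the real values are not shown in the log.
DEFAULTS = {"nrm_log": "/tmp/nrm.log", "hwloc": "hwloc", "perf": "perf"}


def load_config(argv, environ=None):
    """Merge defaults, environment, config file and CLI, in that order."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="daemon")
    parser.add_argument("-c", "--configuration", metavar="FILE")
    for name in DEFAULTS:
        # default=None lets us tell "not given" apart from an explicit value.
        parser.add_argument("--" + name, default=None)
    args = parser.parse_args(argv)

    config = dict(DEFAULTS)
    for name in DEFAULTS:
        if name.upper() in environ:        # e.g. HWLOC overrides the default
            config[name] = environ[name.upper()]
    if args.configuration:
        with open(args.configuration) as f:
            config.update(json.load(f))    # file overrides env (assumed order)
    for name in DEFAULTS:
        value = getattr(args, name)
        if value is not None:              # explicit CLI option wins
            config[name] = value
    return config
```

For example, `load_config(["-c", "nrm.json", "--perf", "/usr/bin/perf"])` would take `perf` from the command line and every other option from `nrm.json`, the environment, or the defaults.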
-
- 10 Dec, 2018 3 commits
-
-
Swann Perarnau authored
Improve process management See merge request !30
-
Valentin Reis authored
Added correct SIGINT/process-ending handling to cmd, fixed kill/list containers, and added ZMQ_LINGER 0 to the socket options.
-
Valentin Reis authored
Related to #22
-
- 28 Nov, 2018 7 commits
-
-
Swann Perarnau authored
Add a listen command to get access to the event stream of the upstream pub/sub API. This patch gives back access from the command line to the power information of a container, including filtering the event stream to only have events relevant to this container. This changes the workflow a little bit for users, but should result in cleaner access to profiling data in the future. Related to #18.
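The per-container filtering can be pictured as a simple generator over decoded events. The event layout, and in particular the `container_uuid` key, is assumed for illustration; the actual message format is not shown in this log.

```python
def filter_events(events, container_uuid):
    """Yield only the events that belong to the given container.

    `events` is any iterable of dict-like messages; the
    "container_uuid" key is a hypothetical field name.
    """
    for event in events:
        if event.get("container_uuid") == container_uuid:
            yield event
```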
-
Swann Perarnau authored
Make it so that the daemon will delete containers when all commands it is aware of are finished, instead of relying on a single owner that needs to be tracked. This simplifies the handling of multiple commands in the same container, and should not impact the rest.
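One way to picture this policy is a per-container set of live commands, with the container dropped once the set is empty. The class and method names below are hypothetical and only illustrate the bookkeeping; they are not the daemon's API.

```python
class ContainerTracker:
    """Track live commands per container (illustrative sketch)."""

    def __init__(self):
        self._commands = {}  # container uuid -> set of live command ids

    def command_started(self, container_uuid, command_id):
        self._commands.setdefault(container_uuid, set()).add(command_id)

    def command_finished(self, container_uuid, command_id):
        """Return True if this was the last command and the container was dropped."""
        live = self._commands.get(container_uuid)
        if live is None:
            return False
        live.discard(command_id)
        if live:
            return False  # other commands are still running in the container
        del self._commands[container_uuid]
        return True
```

No command is special in this scheme, so there is no owner to track: whichever command finishes last triggers the deletion.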
-
Swann Perarnau authored
Move the container start/exit events to the upstream pub/sub event stream. As these are more of a global event now that we support multiple commands in the same container, it makes sense to move them to the more general event stream. This patch also removes the code in cmd waiting for container start or exit, temporarily making the cmd unable to report power metrics. We will fix that in a later commit. This patch fixes complicated issues we had with how a job running multiple commands in the container might not all wait for the end of the container: now none of them do.
-
Swann Perarnau authored
Add an upstream pub client, to be able to listen to messages coming from the daemon on the upstream pub/sub channel. Doesn't support any fancy filter, as that's not used by the daemon so far.
-
Swann Perarnau authored
Ensure that the client that created the container is considered as the one owning it, with the consequence that if its command exits, the container is destroyed. Also deals with the race issue we had on the cmd side.
-
Swann Perarnau authored
Current code sends start/exit events when a container is created and process_start/process_exit when it's already there. Instead, have the container start/exit only care about container stuff, and always send the process start/exit events around. That makes the cmd run fsm easier to work out. Changes the message format a tiny bit. Fixes some missing stdout/stderr issues we had before.
-
Swann Perarnau authored
Previous merges let the cmd send an empty container uuid, resulting in some issues when the user doesn't provide one. Restore the previous behavior.
-
- 26 Nov, 2018 1 commit
-
-
Swann Perarnau authored
[Removing old code] removing old downstream API c code. See merge request !31
-
- 01 Nov, 2018 1 commit
-
-
Valentin Reis authored
-
- 23 Oct, 2018 2 commits
-
-
Swann Perarnau authored
[Fix] Handle container with no power profiling See merge request !29
-
Sridutt Bhalachandra authored
Handles containers with no power profiling enabled in the manifest file. In such cases the 'exit' response on process termination would raise a TypeError.
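A sketch of the usual guard for this kind of failure, assuming the profile data is `None` when profiling is disabled. The function name and message fields are hypothetical, not the actual NRM message format.

```python
def build_exit_message(status, profile_data=None):
    """Build an 'exit' response, adding profile data only when present.

    `profile_data` is None when the manifest does not enable power
    profiling (assumed representation).
    """
    message = {"event": "exit", "status": status}
    if profile_data is not None:
        # Only touch the profile when profiling was actually enabled.
        message["profile_data"] = dict(profile_data)
    return message
```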
-
- 21 Oct, 2018 3 commits
-
-
Swann Perarnau authored
Improve Messaging layer See merge request !28
-
Swann Perarnau authored
Replace the fragile upstream communications with the new messaging layer, improving the stability and performance of this API. NOTE: this breaks previous clients. NOTE: this patch is missing client tracking, to handle children signals.
-
Swann Perarnau authored
Abstracts away the exact wire format and client/server details, while changing the RPC side to work over ROUTER/DEALER sockets, so as to avoid the lost-message issues we've been having with PUB/SUB for RPC.
-
- 17 Oct, 2018 2 commits
-
-
Swann Perarnau authored
[Feature] Multi-node and process support Closes #17 See merge request !27
-
Sridutt Bhalachandra authored
Added multi-node and process support that allows launching multiple processes within a container. This is important for enabling use of NRM with MPI applications with multiple processes in a container, and thus for enabling multi-node executions. See Issue #17
-
- 06 Sep, 2018 2 commits
-
-
Swann Perarnau authored
[fix] Lock with Pipenv all packages See merge request !25
-
Swann Perarnau authored
Some packages were being pulled into the virtual environment through setup.py. It just happens that the new versions of pyzmq and tornado don't really play nice with the current daemon code, mostly because of a "better" ioloop hijacking mechanism that doesn't work for us. This patch moves the install requirements of the setup.py to Pipfile, and locks them to a working version. Note: the code still currently triggers #14.
-
- 15 Aug, 2018 3 commits
-
-
Swann Perarnau authored
Add application level power profiling support Closes #12 See merge request !24
-
Sridutt Bhalachandra authored
Added power profile data to the exit event response at the end of the application run. The profile data is generated using functions from `SensorManager` and is obtained using the sensor update in `Daemon`. See Issue #12
-
Sridutt Bhalachandra authored
Made changes in `SensorManager` to allow calculation of the difference in measured (stored) values using `rapl_reader` functions. See Issue #12
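The difference calculation can be illustrated as below. This sketch does not use the real `rapl_reader` API: the sample layout (one energy counter per domain) and the wraparound correction via `max_energy` are assumptions made for the example.

```python
def energy_diff(start, end, max_energy=None):
    """Return the per-domain energy consumed between two samples.

    `start` and `end` map a domain name to an energy counter value.
    If `max_energy` is given, a counter that wrapped around between
    the two samples is corrected by adding it (assumed behaviour).
    """
    diff = {}
    for domain, e0 in start.items():
        delta = end[domain] - e0
        if delta < 0 and max_energy is not None:
            delta += max_energy  # the counter wrapped between the samples
        diff[domain] = delta
    return diff
```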
-
- 14 Aug, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Added a profiling parameter to the manifest file that can be used to specify whether a power profile is required for a container/application. Also changed the namedtuple "Container" key from "powerpolicy" to "power" and made changes to reflect this. See Issue #12
-
- 13 Aug, 2018 2 commits
-
-
Swann Perarnau authored
[CI] ensure py.test is run on runner with rapl See merge request !22
-
Swann Perarnau authored
Since our tests include checking that coolr can access the rapl interface, we should ensure that only runners having rapl will test the code.
-
- 10 Aug, 2018 5 commits
-
-
Swann Perarnau authored
Resolve "Integration of power policy in to NRM" Closes #10 See merge request !20
-
Sridutt Bhalachandra authored
Made changes in NRM to respond to the phase_context event [previously called the power_policy event (874a6a4d)] from the application. The NRM can now store the information received in the event and call the DDCM power policy through the interfaces developed (Issue #11) in the control loop. See Issue #10
-
Sridutt Bhalachandra authored
Fixed the C downstream_api messages sent by the application to NRM to be more generic. These messages can be used to invoke power or any other policies relying on contextual information from an application phase. See Issue #10
-
Sridutt Bhalachandra authored
Added initialization of the power policy using the manifest file and fixed the appropriate TODOs in the `PowerPolicyManager` (PPM) class using container information (except adding new power policies). Also fixed PPM to use None instead of NONE. See Issue #10
-
Sridutt Bhalachandra authored
The container creation information (resources and powerpolicy) added to the `ContainerManager` class can be used to initialize power policy parameters now, and more in the future. See Issue #10
-