- 17 Oct, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Added multi-node and multi-process support, which allows multiple processes to be launched within a container. This is needed to use NRM with MPI applications that run multiple processes in a container, and it thereby enables multi-node executions. See Issue #17
-
- 15 Aug, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Added power profile data to the exit event response at the end of an application run. The profile data is generated using functions from `SensorManager` and is obtained using the sensor update in `Daemon`. See Issue #12
-
- 14 Aug, 2018 1 commit
-
-
Sridutt Bhalachandra authored
Added a profiling parameter to the manifest file that can be used to specify whether a power profile is required for a container/application. Also changed the namedtuple "Container" key from "powerpolicy" to "power" and updated the code to reflect this. See Issue #12
-
- 10 Aug, 2018 3 commits
-
-
Sridutt Bhalachandra authored
Added initialization of the power policy using the manifest file and resolved the relevant TODOs in the `PowerPolicyManager` (PPM) class using container information (except for adding new power policies). Also fixed PPM to use None instead of NONE. See Issue #10
-
Sridutt Bhalachandra authored
The container creation information (resources and powerpolicy) added to the `ContainerManager` class can be used to initialize power policy parameters now, and more parameters in the future. See Issue #10
-
Sridutt Bhalachandra authored
Added damper and slowdown parameters to the manifest file that can be used to initialize power policy parameters. See Issue #10
-
- 17 Jul, 2018 1 commit
-
-
Sridutt Bhalachandra authored
On enabling the powerpolicy manifest option and setting the policy parameter to any valid value (except "NONE"), the application library providing contextual information is loaded using LD_PRELOAD. See Issue #10
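A minimal sketch of the behaviour described in this commit, not the NRM code itself. The function name `build_command_env`, the policy name, and the library path are hypothetical, and validation of the policy value is left out:

```python
import os


def build_command_env(policy, lib_path, base_env=None):
    """Return a copy of the environment, adding LD_PRELOAD when a policy is set.

    `policy` and `lib_path` are illustrative parameters, not NRM names.
    """
    env = dict(os.environ if base_env is None else base_env)
    if policy is not None and policy != "NONE":
        existing = env.get("LD_PRELOAD")
        # The dynamic linker accepts a colon-separated list of libraries.
        env["LD_PRELOAD"] = lib_path if not existing else lib_path + ":" + existing
    return env


# Hypothetical policy name and library path, for illustration only.
env = build_command_env("example-policy", "/opt/lib/libcontext.so", {"PATH": "/bin"})
print(env["LD_PRELOAD"])
```

The resulting dictionary would be passed as the environment of the launched command, so only that command and its children load the library.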
-
- 21 Dec, 2017 1 commit
-
-
Swann Perarnau authored
Small fixes to correct for wrong actions on power control over a real load.
-
- 19 Dec, 2017 3 commits
-
-
Kamil Iskra authored
-
Kamil Iskra authored
-
Swann Perarnau authored
This patch fixes the daemon code to include the container uuid in the environment of the command, and renames that environment variable to something better suited.
-
- 15 Dec, 2017 1 commit
-
-
Swann Perarnau authored
The daemon code was maintaining its own container tracker using pids, instead of using the one in the container manager. This patch removes this additional tracking, and lets the daemon side deal with an actual namedtuple.
-
- 14 Dec, 2017 5 commits
-
-
Swann Perarnau authored
This patch propagates the process object into the container namedtuple, fixes a couple of bad function calls, and adapts the run command handler to use that process object instead of just its pid.
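As an illustration of carrying a process object rather than a bare pid, here is a hedged sketch. The field names `uuid`, `resources`, and `process` are assumptions, and `subprocess.Popen` stands in for the `tornado.process.Subprocess` object the daemon actually uses:

```python
import subprocess
import sys
from collections import namedtuple

# Hypothetical field layout; the real namedtuple may differ.
Container = namedtuple("Container", ["uuid", "resources", "process"])

# subprocess.Popen is a stand-in for the tornado process wrapper.
proc = subprocess.Popen([sys.executable, "-c", "pass"])
container = Container(uuid="example-uuid", resources=None, process=proc)

# A handler can still reach the pid, but also wait on or signal the process.
pid = container.process.pid
exit_code = container.process.wait()
print(pid, exit_code)
```

Keeping the object means handlers do not have to look the process up again from its pid.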
-
Swann Perarnau authored
Use the new argo_nodeos_config --exec feature in development. This allows us to delegate fork+attach+exec to argo_nodeos_config, simplifying the create command as a result. We use tornado.process to wrap this command, as we want to be able to stream stdout/stderr in the future. This patch also misuses the 'pid' field of the container namedtuple to save the tornado.process.Subprocess object itself, so some functions need to be adapted.
-
Swann Perarnau authored
The logging improvement patch missed a few calls.
-
Swann Perarnau authored
The logging module allows us to configure logging facilities once per process using basicConfig, and then to use globally defined, named logger objects. This simplifies access to logger objects and their configuration, and removes logger pointers from all objects. This patch refactors all the logging calls to use a single 'nrm' logger object, using those facilities.
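The pattern this commit describes can be sketched as follows. Only the 'nrm' logger name comes from the commit message; the log level, the format string, and the `ContainerTracker` class are assumptions made for illustration:

```python
import logging

# Configure logging once per process, typically in the entry point.
logging.basicConfig(level=logging.INFO,
                    format="%(name)s:%(levelname)s:%(message)s")

# Any module can look up the same named logger instead of receiving
# a logger object through its constructor.
logger = logging.getLogger('nrm')


class ContainerTracker(object):
    """Hypothetical class showing that no logger attribute is needed."""

    def create(self, uuid):
        logger.info("creating container %s", uuid)
        return uuid


ContainerTracker().create("example-uuid")
```

Because `logging.getLogger` returns the same object for the same name, every module that asks for 'nrm' shares one configuration.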
-
Swann Perarnau authored
Implement an update allocation function to be able to update resource tracking when containers are created and deleted. The commit should make it easier to improve the resource manager later on.
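One possible shape for such an update function is sketched below. `ResourceTracker`, `Allocation`, and their fields are hypothetical names, not the NRM resource manager's API:

```python
from collections import namedtuple

# Hypothetical allocation record: lists of CPU and memory node ids.
Allocation = namedtuple("Allocation", ["cpus", "mems"])


class ResourceTracker(object):
    """Tracks which CPUs and memory nodes are free (illustrative only)."""

    def __init__(self, cpus, mems):
        self.free_cpus = set(cpus)
        self.free_mems = set(mems)
        self.allocations = {}

    def update(self, uuid, allocation=None):
        """Set the allocation of `uuid`, or release it when None."""
        old = self.allocations.get(uuid)
        # Start from the free pool plus whatever this container held.
        free_cpus = self.free_cpus | (set(old.cpus) if old else set())
        free_mems = self.free_mems | (set(old.mems) if old else set())
        if allocation is None:
            self.allocations.pop(uuid, None)
        else:
            if not (set(allocation.cpus) <= free_cpus
                    and set(allocation.mems) <= free_mems):
                raise ValueError("requested resources are not available")
            free_cpus -= set(allocation.cpus)
            free_mems -= set(allocation.mems)
            self.allocations[uuid] = allocation
        self.free_cpus, self.free_mems = free_cpus, free_mems


tracker = ResourceTracker(cpus=[0, 1, 2, 3], mems=[0])
tracker.update("c1", Allocation(cpus=[0, 1], mems=[0]))  # container created
tracker.update("c1")                                     # container deleted
```

Routing both creation and deletion through one function keeps the free pool and the per-container records consistent, and a failed request leaves the tracker unchanged.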
-
- 13 Dec, 2017 3 commits
-
-
Swann Perarnau authored
This patch adds a command to kill the parent process of a container based on the container uuid, triggering the death of the container. The os.kill command interacts pretty badly with the custom-built children handling, causing us to catch unwanted exceptions in an effort to keep the code running. The waitpid code was also missing the handling of children exiting because of signals, so we fixed that.

At this point, two things should be paid attention to:
- we don't distinguish properly between a container and a command. This will probably cause issues later, as it should be possible to launch multiple programs in the same container, and for partitions to survive the death of the parent process.
- the message format is growing more complex, but without any component having strong ownership over it. This will probably cause stability issues in the long term, as the format grows more complex and we lose track of the fields expected from everyone.
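The waitpid fix mentioned in this commit comes down to checking for signal termination as well as normal exit. A minimal sketch for POSIX systems, using hypothetical helper names rather than the daemon's own handlers:

```python
import os
import signal


def decode_wait_status(status):
    """Classify a status word returned by os.waitpid."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        # Without this branch, children killed by a signal go unnoticed.
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)


def kill_and_reap(pid, sig=signal.SIGTERM):
    """Send `sig` to a child process, then block until it is reaped."""
    os.kill(pid, sig)
    _, status = os.waitpid(pid, 0)
    return decode_wait_status(status)
```

A daemon built on an I/O loop would more likely reap children from a SIGCHLD callback with `os.WNOHANG` than block as `kill_and_reap` does; the blocking call only keeps the sketch short.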
-
Swann Perarnau authored
This patch adds a very simple command to list the containers currently known by the NRM. There's no history or state tracking on the NRM, so the code is pretty simple. We expect that some of the container tracking information doesn't need to be sent for such a command, so the listing also filters out some of the fields. This patch also adds an 'event' field to container messages, as it will probably be needed later for other kinds of operations.
-
Swann Perarnau authored
This patch refactors the resource management and hwloc code into a working, albeit very simple, scheduling policy. Indeed, the previous code contained strong assumptions about the output of hwloc matching an Argo NodeOS configuration used during the previous phase of the project, one that always contained enough CPUs and Mems to perform exclusive scheduling. The current version is simpler, but should work on more regular systems. The patch also improves code organization so that introducing more complex scheduling algorithms will be simpler. Testing this code uncovered simple bugs in the daemon's children handling code, which should work now.
-
- 11 Dec, 2017 1 commit
-
-
Swann Perarnau authored
The Argus (globalos) launcher had prototype code to read a container manifest, create a container using Judi's code, and map resources using hwloc. This patch brings that code, almost intact, into the NRM repo. This code is quite ugly, and the resource mapping crashes if the kernel configuration isn't right. But it's still a good starting point, and we should be able to improve things little by little. One part in particular needs attention: SIGCHLD handling. We should think of using ioloop-provided facilities to avoid this mess. The patch also contains the associated CLI changes. Note: the messaging format is starting to be difficult to keep in check, as there are conversions and field checks all over the code. See #3 for a possible solution.
-