1. 23 Oct, 2018 1 commit
  2. 21 Oct, 2018 2 commits
    • [refactor] replace upstream comms with msg layer · 0b0ab966
      Swann Perarnau authored
      Replace the fragile upstream communications with the new messaging
      layer, improving the stability and performance of this API.
      
      NOTE: this breaks previous clients
      NOTE: this patch is missing client tracking, which is needed to handle
      child signals.
    • [feature] add messaging layer for upstream API · c29ed7ea
      Swann Perarnau authored
      Abstracts away the exact wire format and client/server details, while
      changing the RPC side to work over ROUTER/DEALER sockets, so as to avoid
      the lost-message issues we've been having with PUB/SUB for RPC.
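
      To illustrate what "abstracting away the wire format" can look like,
      here is a minimal sketch, not the code from this commit. It assumes JSON
      as the wire format; the message kinds (`run`, `kill`) and their fields
      are hypothetical, and the socket side is left out entirely.

```python
import json
from collections import namedtuple

# Hypothetical message kinds and fields; the real schema is not shown
# in the commit log.
MSGTYPES = {
    'run': namedtuple('Run', ['container_uuid', 'path', 'args']),
    'kill': namedtuple('Kill', ['container_uuid']),
}


def encode(kind, **fields):
    """Build a typed message and serialize it to bytes for the wire."""
    msg = MSGTYPES[kind](**fields)  # raises TypeError on wrong fields
    payload = {'type': kind, 'data': dict(msg._asdict())}
    return json.dumps(payload).encode('utf-8')


def decode(wire):
    """Parse bytes from the wire back into (kind, typed message)."""
    raw = json.loads(wire.decode('utf-8'))
    kind = raw['type']
    return kind, MSGTYPES[kind](**raw['data'])
```

      With a layer like this, callers only deal with typed messages, so the
      transport underneath can change without touching them.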
  3. 17 Oct, 2018 1 commit
    • [Feature] Multi-node and process support · 5a41baba
      Sridutt Bhalachandra authored
      Added multi-node and multi-process support, which allows launching
      multiple processes within a container. This is important for using NRM
      with MPI applications that run multiple processes in a container, and
      thus for enabling multi-node executions.
      
      See Issue #17
  4. 15 Aug, 2018 2 commits
  5. 14 Aug, 2018 1 commit
  6. 10 Aug, 2018 5 commits
  7. 09 Aug, 2018 1 commit
    • Pass environment explicitly · 3fcf2f50
      Kamil Iskra authored
      When invoking 'argo_nodeos_config run', we were passing the job
      environment implicitly.  This wasn't very clean and was also causing
      problems with variables such as LD_PRELOAD, which were being filtered
      out because argo_nodeos_config is suid root.
  8. 25 Jul, 2018 2 commits
  9. 19 Jul, 2018 2 commits
  10. 17 Jul, 2018 2 commits
  11. 16 Jul, 2018 1 commit
  12. 03 Jul, 2018 1 commit
  13. 21 Dec, 2017 1 commit
  14. 20 Dec, 2017 3 commits
    • [feature] Add actuator logic for decreasing power · 36206879
      Swann Perarnau authored
      Change the PowerActuator to be able to lower the power limit. Because
      RAPL doesn't provide an actual lower limit, we use 0 as the minimal
      power.
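
      The clamping described above can be sketched as follows. This is an
      illustration under assumptions, not the actual PowerActuator: the
      `Action` tuple, the `step` parameter, and the method names are
      hypothetical, and the limits are plain numbers rather than RAPL reads.

```python
from collections import namedtuple

# Hypothetical action record: which domain to change, and by how much.
Action = namedtuple('Action', ['target', 'delta'])


class PowerActuator(object):
    """Toy actuator proposing power-limit changes per power domain."""

    def __init__(self, limits, step=10.0):
        self.limits = dict(limits)  # domain name -> current limit (watts)
        self.step = step

    def available_actions(self, want_less_power):
        actions = []
        for domain, limit in sorted(self.limits.items()):
            if want_less_power:
                # No real lower bound is known, so clamp at 0 W.
                new = max(0.0, limit - self.step)
            else:
                new = limit + self.step
            if new != limit:
                actions.append(Action(domain, new - limit))
        return actions

    def execute(self, action):
        # A real actuator would write the new limit to the hardware here.
        self.limits[action.target] += action.delta
```

      Once a domain reaches 0, the actuator simply stops offering a
      "decrease" action for it.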
    • [feature] Add PowerActuator and update control · 26e9c239
      Swann Perarnau authored
      This patch adds a PowerActuator based on the RAPL settings available
      through the sensor manager. Adding this actuator forces us to use a list
      of actuators in the controller, which slightly changes the structure of
      the code.
    • [feature] Add actuator to the controller logic · cbbf2354
      Swann Perarnau authored
      This patch introduces one more level of abstraction to the controller:
      an actuator. Actuators will act as the middleman between specific
      managers and the controller, while providing enough info to implement
      actual models on top.
      
      For now, we only have the application threads actuator.
  15. 19 Dec, 2017 7 commits
    • Improve formatting and commentary · 41a91901
      Kamil Iskra authored
    • [refactor] Move control scheme to its own module · 246edb75
      Swann Perarnau authored
      The "control" part of the NRM is bound to change and become more complex
      in the near future, so move it into its own module.
      
      This refactor also introduces some controller logic. Control is split
      into 3 steps: planning, execution and updates. The goal is to use this
      new code organization as a way to abstract different control policies
      that could be implemented later.
      
      Note that we might at some point move into a "control manager" and a
      bunch of "policies" and "actuators", as a way of matching typical
      control theory vocabulary.
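
      The three-step split (planning, execution, updates) can be sketched as
      below. This is a minimal illustration, not the module from this commit:
      the class and method names, the `target` strings, and the toy
      thread-count actuator are all assumptions.

```python
class ThreadActuator(object):
    """Toy actuator: asks an application to add or drop one thread."""

    def __init__(self, threads, max_threads):
        self.threads = threads
        self.max_threads = max_threads
        self.pending = None

    def available_actions(self, target):
        if target == 'faster' and self.threads < self.max_threads:
            return [+1]
        if target == 'slower' and self.threads > 1:
            return [-1]
        return []

    def execute(self, action):
        # A real actuator would send a message to the application here.
        self.pending = action

    def update(self, action):
        # Record the new state once the action has been issued.
        self.threads += action
        self.pending = None


class Controller(object):
    """Runs one control iteration as plan -> execute -> update."""

    def __init__(self, actuators):
        self.actuators = actuators

    def plan(self, target):
        # Planning: take the first action offered by any actuator.
        for actuator in self.actuators:
            actions = actuator.available_actions(target)
            if actions:
                return actions[0], actuator
        return None, None

    def run_once(self, target):
        action, actuator = self.plan(target)
        if action is not None:
            actuator.execute(action)  # execution step
            actuator.update(action)   # update step
        return action
```

      Keeping the policy inside `plan` means a different control policy only
      needs to replace that one method.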
    • Configure perf-wrapper using the manifest · b666f1c2
      Kamil Iskra authored
    • [fix] Wrong streaming_callback on stderr · dec31967
      Swann Perarnau authored
      Fixes a copy/paste mistake on the name of the callback to trigger on
      stderr events.
    • [fix] Use proper env variable for container uuid · 90157c2a
      Swann Perarnau authored
      This patch fixes the daemon code to include the container uuid in the
      environment of the command, while changing that environment variable to
      use a better-suited name.
    • [feature] Replace client with dummy application · 66e4c85d
      Swann Perarnau authored
      This patch replaces the client code (bin/client and nrm/client) by a new
      application code that integrates progress reports and uses the new
      downstream API.
      
      While git is reporting that both codes are different, the app code is
      basically a refactoring and adaptation of the client code.
      
      This is directly related to issue #2.
    • [feature] Implement Application Manager · f43a38d3
      Swann Perarnau authored
      This patch moves the tracking of application clients of the downstream
      API into an ApplicationManager, which is able to track progress and
      manage threads.
      
      This change is necessary in the long term to build a comprehensive
      downstream API and centralize the management of application tracking.
      
      Note that this tracking is currently independent of the container and
      pid tracking, and that might be a problem in the long term.
  16. 18 Dec, 2017 3 commits
    • [feature] Implement skeleton downstream API · 19c9eb54
      Swann Perarnau authored
      This patch refactors the downstream API to use a pub/sub socket pair, like
      the upstream API. This is part of the effort to improve the downstream
      API. See #2.
      
      This patch doesn't touch the client module, which will be adapted in
      future commits.
    • [refactor] daemon should always bind on sockets · 1391a197
      Swann Perarnau authored
      Because of the way 0MQ PUB/SUB sockets work, publishers might drop
      messages if subscribers are not detected fast enough. One way to fix
      this is to have the "server" always bind sockets, and the "client" use
      connect. This way, the handshake is initiated properly, and the client
      can publish as soon as the connection is done.
      
      This patch makes the daemon bind on the upstream API and the CLI connect,
      fixing in the process the message dropping we were experiencing before.
      
      Long term, we might think about using two types of sockets for the
      upstream API: pub/sub for actual events published from the daemon, and
      a REQ/REP or ROUTER/DEALER pair for "commands".
    • Partial Revert of powercap API update · ec563afb
      Swann Perarnau authored
      Previous commit 0c93ce6a broke the
      sample code used by the daemon, by reverting the sample function to a
      json message generator. This is due to inconsistencies between the coolr
      code and the NRM import: we removed json generation from coolr, to push
      it on the messaging side, while upstream still does it on sensor
      reading.
      
      This commit fixes that, but doesn't touch the new test code embedded in
      clr_rapl.py. We will move that to the test infrastructure later.
  17. 17 Dec, 2017 1 commit
  18. 15 Dec, 2017 2 commits
    • [feature] Properly handle run events in order · 957deb8d
      Swann Perarnau authored
      This patch implements a small finite state machine on the cmd side to be
      able to run a command, wait for all of its output, and then exit.
      
      As the daemon can send those messages in any order, we need to wait for
      them properly, in particular for the closing of stdout/stderr, before
      exiting.
      
      This patch also fixes the read_until_close callback creation to ensure
      that the stream EOF is handled as a distinct message.
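
      A small state machine of this kind can be sketched as follows. It is an
      illustration only: the event names (`stdout_closed`, `stderr_closed`,
      `exit`) and the class itself are hypothetical and may not match the
      daemon's actual messages.

```python
class RunStateMachine(object):
    """Tracks which closing events of a run have been seen so far."""

    # Hypothetical event names; the real message fields may differ.
    REQUIRED = frozenset(['stdout_closed', 'stderr_closed', 'exit'])

    def __init__(self):
        self.seen = set()
        self.status = None

    def handle(self, event, status=None):
        """Record an event; return True once it is safe to exit."""
        if event not in self.REQUIRED:
            raise ValueError('unexpected event: %r' % (event,))
        self.seen.add(event)
        if event == 'exit':
            self.status = status
        return self.done()

    def done(self):
        # Safe to exit only when all three events have arrived,
        # whatever order they came in.
        return self.seen == self.REQUIRED
```

      The CLI would feed each incoming message to `handle` and only leave its
      event loop when that call returns True, so trailing output is not lost
      when the exit message arrives first.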
    • [refactor] Only track container inside the CM · f2bc8b80
      Swann Perarnau authored
      The daemon code was maintaining its own container tracker using pids,
      instead of using the one in the container manager. This patch removes
      this additional tracking, and lets the daemon side deal with an actual
      namedtuple.
  19. 14 Dec, 2017 2 commits
    • [feature] Add stdout/stderr streaming · 78f63cd4
      Swann Perarnau authored
      This patch adds stdout/stderr streaming capabilities, based on partial
      evaluation of a tornado.iostream callback. The bin/cmd CLI is updated to
      wait until an exit message, although that doesn't guarantee anything on
      message ordering...
      
      The next step is obviously to figure out a message flow that allows the
      CLI to send and receive the command IO properly, in order...
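
      "Partial evaluation of a callback" can be illustrated with
      `functools.partial`. This is a sketch under assumptions, not the
      daemon's code: `publish_output`, the `published` list, and the dict
      fields are hypothetical, and the stream is simulated rather than driven
      by a tornado IOStream.

```python
from functools import partial

published = []  # stand-in for the socket the daemon would publish on


def publish_output(container_uuid, stream_name, data):
    """Tag a chunk of output with where it came from, then publish it."""
    published.append({'container': container_uuid,
                      'stream': stream_name,
                      'payload': data})


# A streaming callback is only handed one argument (the data chunk), so
# the identifying arguments are bound ahead of time.
stdout_cb = partial(publish_output, 'c1', 'stdout')
stderr_cb = partial(publish_output, 'c1', 'stderr')

# Simulate the IO loop invoking each callback with a chunk of bytes.
stdout_cb(b'hello\n')
stderr_cb(b'oops\n')
```

      Binding the stream name per callback is also where a copy/paste slip
      would show up: registering `stdout_cb` on the stderr stream would
      mislabel every chunk.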
    • [refactor] Fix container namedtuple · 9afe59c7
      Swann Perarnau authored
      This patch propagates the process object into the container namedtuple,
      fixes a couple of bad function calls, and adapts the run command handler
      to use that process object instead of just its pid.
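
      As a rough illustration of why carrying the process object is more
      useful than carrying only its pid, consider the sketch below. The field
      names of `Container` are assumptions, and the standard library's
      `subprocess.Popen` stands in for whatever process class the daemon
      actually uses.

```python
import subprocess
import sys
from collections import namedtuple

# Hypothetical field names for the container record.
Container = namedtuple('Container', ['uuid', 'manifest', 'process'])

# Launch a trivial child process (the Python interpreter doing nothing).
proc = subprocess.Popen([sys.executable, '-c', 'pass'])
container = Container(uuid='c1', manifest=None, process=proc)

# With the process object at hand, the handler can wait on the child and
# read its exit status; the pid is still available as an attribute.
container.process.wait()
exit_status = container.process.returncode
pid = container.process.pid
```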