1. 27 Nov, 2018 2 commits
    • [fix] ensure container has single owner · 31fcc32f
      Swann Perarnau authored
      Ensure that the client that created the container is considered as the
      one owning it, with the consequence that if its command exits, the
      container is destroyed. Also deals with the race issue we had on the cmd
      side.
      31fcc32f
    • [refactor/fix] always send process events for run · bb7af5d7
      Swann Perarnau authored
      Current code sends start/exit events when a container is created and
      process_start/process_exit when it's already there. Instead, have the
      container start/exit only care about container stuff, and always send
      the process start/exit events around. That makes the cmd run fsm easier
      to work out.
      
      Changes the message format a tiny bit.
      Fixes some missing stdout/stderr issues we had before.
      bb7af5d7
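
The event ordering and ownership rule described in the two commits above can be sketched as follows. This is only an illustration under stated assumptions: the `Daemon` and `Container` classes, the method names and the tuple layout of the events are all hypothetical, and only the event names (start/exit, process_start/process_exit) come from the commit messages.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical container record; `owner` is the client that created it."""
    uuid: str
    owner: str
    pids: set = field(default_factory=set)


class Daemon:
    """Toy daemon that only records the events it would send to clients."""

    def __init__(self):
        self.containers = {}
        self.events = []  # (event_name, container_uuid, client) tuples

    def run(self, client, uuid, pid):
        # Container events fire only when the container is actually created.
        if uuid not in self.containers:
            self.containers[uuid] = Container(uuid, owner=client)
            self.events.append(("start", uuid, client))
        # Process events are always sent, new container or not.
        self.containers[uuid].pids.add(pid)
        self.events.append(("process_start", uuid, client))

    def process_exit(self, client, uuid, pid):
        container = self.containers[uuid]
        container.pids.discard(pid)
        self.events.append(("process_exit", uuid, client))
        # The container lives only as long as its owner's command.
        if client == container.owner:
            del self.containers[uuid]
            self.events.append(("exit", uuid, client))


daemon = Daemon()
daemon.run("client-a", "c1", pid=100)        # creates the container, client-a owns it
daemon.run("client-b", "c1", pid=101)        # joins the existing container
daemon.process_exit("client-b", "c1", 101)   # non-owner exit: container stays
daemon.process_exit("client-a", "c1", 100)   # owner exit: container is destroyed
print([name for name, _, _ in daemon.events])
```

Because every run produces the same process_start/process_exit pair, a client state machine only needs to treat the container-level events as optional bookends.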
  2. 24 Oct, 2018 2 commits
    • [fix] ensure container has single owner · 926c8302
      Swann Perarnau authored
      Ensure that the client that created the container is considered as the
      one owning it, with the consequence that if its command exits, the
      container is destroyed. Also deals with the race issue we had on the cmd
      side.
      926c8302
    • [refactor/fix] always send process events for run · 11d667db
      Swann Perarnau authored
      Current code sends start/exit events when a container is created and
      process_start/process_exit when it's already there. Instead, have the
      container start/exit only care about container stuff, and always send
      the process start/exit events around. That makes the cmd run fsm easier
      to work out.
      
      Changes the message format a tiny bit.
      Fixes some missing stdout/stderr issues we had before.
      11d667db
  3. 23 Oct, 2018 1 commit
  4. 21 Oct, 2018 2 commits
    • [refactor] replace upstream comms with msg layer · 0b0ab966
      Swann Perarnau authored
      Replace the fragile upstream communications with the new messaging
      layer, improving the stability and performance of this API.
      
      NOTE: this breaks previous clients
      NOTE: this patch is missing client tracking, to handle children signals.
      0b0ab966
    • [feature] add messaging layer for upstream API · c29ed7ea
      Swann Perarnau authored
      Abstracts away the exact wire format and client/server details, while
      changing the RPC side to work over ROUTER/DEALER sockets, so as to avoid
      the lost-message issues we've been having with PUB/SUB for RPC.
      c29ed7ea
  5. 17 Oct, 2018 1 commit
    • [Feature] Multi-node and process support · 5a41baba
      Sridutt Bhalachandra authored
      Added multi-node and multi-process support, which allows launching
      multiple processes within a container. This is important for enabling
      use of NRM with MPI applications with multiple processes in a container,
      and thus enabling multi-node executions.
      
      See Issue #17
      5a41baba
  6. 15 Aug, 2018 2 commits
  7. 14 Aug, 2018 1 commit
  8. 10 Aug, 2018 5 commits
  9. 09 Aug, 2018 1 commit
    • Pass environment explicitly · 3fcf2f50
      Kamil Iskra authored
      When invoking 'argo_nodeos_config run', we were passing the job
      environment implicitly.  This wasn't very clean and was also causing
      problems with variables such as LD_PRELOAD, which were being filtered
      out because argo_nodeos_config is suid root.
      3fcf2f50
  10. 25 Jul, 2018 2 commits
  11. 19 Jul, 2018 2 commits
  12. 17 Jul, 2018 2 commits
  13. 16 Jul, 2018 1 commit
  14. 03 Jul, 2018 1 commit
  15. 21 Dec, 2017 1 commit
  16. 20 Dec, 2017 3 commits
    • [feature] Add actuator logic for decreasing power · 36206879
      Swann Perarnau authored
      Change the PowerActuator to be able to lower the power limit. Because
      RAPL doesn't provide an actual lower limit, we use 0 as the minimal
      power.
      36206879
    • [feature] Add PowerActuator and update control · 26e9c239
      Swann Perarnau authored
      This patch adds a PowerActuator based on the RAPL settings available through
      the sensor manager. Adding this actuator forces us to use a list of
      actuators in the controller, changing the structure of the code a bit.
      26e9c239
    • [feature] Add actuator to the controller logic · cbbf2354
      Swann Perarnau authored
      This patch introduces one more level of abstraction to the controller:
      an actuator. Actuators will act as the middleman between specific
      managers and the controller, while providing enough info to implement
      actual models on top.
      
      For now, we only have the application threads actuator.
      cbbf2354
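
The three actuator commits above (an actuator abstraction, a list of actuators in the controller, and a power limit that can be lowered down to 0) can be pictured with a small sketch. Everything here is an assumption made for illustration: the class layout, the `available_actions`/`execute` method names, the step size and the trivial policy are hypothetical and are not taken from the NRM code. Only the idea that 0 serves as the minimal power, because RAPL provides no actual lower limit, comes from the commit message.

```python
class ThreadsActuator:
    """Hypothetical actuator adjusting an application's thread count."""

    def __init__(self, threads, max_threads):
        self.threads = threads
        self.max_threads = max_threads

    def available_actions(self):
        actions = []
        if self.threads > 1:
            actions.append(-1)
        if self.threads < self.max_threads:
            actions.append(+1)
        return actions

    def execute(self, delta):
        self.threads += delta


class PowerActuator:
    """Hypothetical actuator adjusting a power limit, in watts."""

    MIN_POWER = 0  # no real lower limit is available, so 0 is the floor

    def __init__(self, limit, max_limit, step=10):
        self.limit = limit
        self.max_limit = max_limit
        self.step = step

    def available_actions(self):
        actions = []
        if self.limit > self.MIN_POWER:
            actions.append(-self.step)
        if self.limit < self.max_limit:
            actions.append(+self.step)
        return actions

    def execute(self, delta):
        # Clamp so the limit never leaves [MIN_POWER, max_limit].
        self.limit = min(self.max_limit, max(self.MIN_POWER, self.limit + delta))


class Controller:
    """Holds a list of actuators and applies one action per actuator."""

    def __init__(self, actuators):
        self.actuators = list(actuators)

    def step(self, over_budget):
        # Placeholder policy: decrease everything when over budget,
        # increase everything otherwise.
        for actuator in self.actuators:
            for action in actuator.available_actions():
                if (action < 0) == over_budget:
                    actuator.execute(action)
                    break


threads = ThreadsActuator(threads=4, max_threads=8)
power = PowerActuator(limit=15, max_limit=100, step=10)
controller = Controller([threads, power])
for _ in range(3):
    controller.step(over_budget=True)
print(threads.threads, power.limit)
```

Keeping the controller ignorant of what each actuator touches is what lets a new actuator be added by appending to the list, at the cost of a policy that can only reason in terms of generic actions.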
  17. 19 Dec, 2017 7 commits
    • Improve formatting and commentary · 41a91901
      Kamil Iskra authored
      41a91901
    • [refactor] Move control scheme to its own module · 246edb75
      Swann Perarnau authored
      The "control" part of the NRM is bound to change and become more complex
      in the near future, so move it into its own module.
      
      This refactor also introduces some controller logic. Control is split
      into 3 steps: planning, execution and updates. The goal is to use this
      new code organization as a way to abstract different control policies
      that could be implemented later.
      
      Note that we might at some point move into a "control manager" and a
      bunch of "policies" and "actuators", as a way of matching typical
      control theory vocabulary.
      246edb75
    • Configure perf-wrapper using the manifest · b666f1c2
      Kamil Iskra authored
      b666f1c2
    • [fix] Wrong streaming_callback on stderr · dec31967
      Swann Perarnau authored
      Fixes a copy/paste mistake on the name of the callback to trigger on
      stderr events.
      dec31967
    • [fix] Use proper env variable for container uuid · 90157c2a
      Swann Perarnau authored
      This patch fixes the daemon code to include the container uuid in the
      environment of the command, while changing that environment variable to
      use a better suited name.
      90157c2a
    • [feature] Replace client with dummy application · 66e4c85d
      Swann Perarnau authored
      This patch replaces the client code (bin/client and nrm/client) with a new
      application code that integrates progress reports and uses the new
      downstream API.
      
      While git is reporting that both codes are different, the app code is
      basically a refactoring and adaptation of the client code.
      
      This is directly related to issue #2.
      66e4c85d
    • [feature] Implement Application Manager · f43a38d3
      Swann Perarnau authored
      This patch moves the tracking of application clients of the downstream
      API into an ApplicationManager, which is able to track progress and thread
      management.
      
      This change is necessary in the long term to build a comprehensive
      downstream API and centralize the management of application tracking.
      
      Note that this tracking is currently independent of the container and
      pid tracking, and that might be a problem in the long term.
      f43a38d3
  18. 18 Dec, 2017 3 commits
    • [feature] Implement skeleton downstream API · 19c9eb54
      Swann Perarnau authored
      This patch refactors the downstream API to use a pub/sub socket pair, like
      the upstream API. This is part of the effort to improve the downstream
      API. See #2.
      
      This patch doesn't touch the client module, which will be adapted in
      future commits.
      19c9eb54
    • [refactor] daemon should always bind on sockets · 1391a197
      Swann Perarnau authored
      The way 0MQ works on PUB/SUB sockets, publishers might drop
      messages if subscribers are not detected fast enough. One way to fix
      it is to have the "server" always bind sockets, and the "client" use
      connect. This way, the handshake is initiated properly, and the client
      can publish as soon as the connection is done.
      
      This patch makes the daemon bind on the upstream API and the CLI connect,
      fixing in the process the message dropping we were experiencing before.
      
      Long term, we might think about using two types of sockets for the
      upstream API: pub/sub for actual events published from the daemon, and
      a REQ/REP or ROUTER/DEALER pair for "commands".
      1391a197
    • Partial Revert of powercap API update · ec563afb
      Swann Perarnau authored
      Previous commit 0c93ce6a broke the
      sample code used by the daemon, by reverting the sample function to a
      json message generator. This is due to inconsistencies between the coolr
      code and the NRM import: we removed json generation from coolr, to push
      it on the messaging side, while upstream still does it on sensor
      reading.
      
      This commit fixes that, but doesn't touch the new test code embedded in
      clr_rapl.py.
      We will move that to the test infrastructure later.
      ec563afb
  19. 17 Dec, 2017 1 commit