  1. 09 Sep, 2016 1 commit
    • Merge branch '31-fix-startup-race' into 'master' · 37de4f0e
      Paul Rich authored
      This should effectively end the startup race condition
      
      This should get rid of the bulk of the 1234567 exit statuses. Forces a
      timeout, which goes away once the job has started. This should
      fix the gap between process group initialization and start.
      Closes #31
      
      See merge request !16
  2. 08 Sep, 2016 1 commit
  3. 01 Sep, 2016 4 commits
  4. 29 Aug, 2016 1 commit
  5. 24 Aug, 2016 10 commits
  6. 23 Aug, 2016 1 commit
  7. 22 Aug, 2016 1 commit
  8. 15 Aug, 2016 2 commits
  9. 11 Aug, 2016 2 commits
    • Fix for the aggressive cleanup · 06c5d122
      Paul Rich authored
      The apid fetch wasn't restricting itself to the actual ALPS reservation.
      This was causing everything to get killed.
    • Fixed error in recovering pgroups. · d9595cc8
      Paul Rich authored
      System component restart on the fly should be safe again.  We recover
      the process groups properly now.  Found this while testing other changes
      in the fix for aggressive cleanup.
  10. 08 Aug, 2016 1 commit
  11. 06 Aug, 2016 1 commit
  12. 03 Aug, 2016 2 commits
  13. 01 Aug, 2016 4 commits
  14. 31 Jul, 2016 1 commit
    • Admin down was not getting properly detected. · 93155c72
      Paul Rich authored
      Update node state was resetting an admin down.  Added an additional flag
      so we can differentiate between admin down and hardware down.
      
      If a node is marked down with an admin command, then no matter what, it
      will remain marked down.
  15. 29 Jul, 2016 1 commit
  16. 27 Jul, 2016 1 commit
    • apkill support added · 0fcbb56e
      Paul Rich authored
      Added apkill support to kill the user's ALPS instance in interactive jobs.
      Kachina testing pending.
  17. 18 Jul, 2016 1 commit
    • Interactive cleanup now working. · 6af7cebf
      Paul Rich authored
      Resources for interactive jobs are now appropriately released.  There is
      still a known issue with currently running aprun instances.  That will
      be addressed in a further patch.
  18. 06 Jul, 2016 1 commit
  19. 24 Jun, 2016 2 commits
  20. 23 Jun, 2016 1 commit
  21. 13 Jun, 2016 1 commit