1. 11 Aug, 2017 3 commits
  2. 31 Jul, 2017 1 commit
  3. 28 Jul, 2017 1 commit
  4. 27 Jul, 2017 1 commit
  5. 10 Jul, 2017 1 commit
  6. 03 Jul, 2017 2 commits
    • Adding in test cases for _ALPS_reserve_resources · bedcb951
      Paul Rich authored
      In light of this bug, adding checks to make sure that we don't end up
      accidentally adding in bad values to reservations again.
    • Fix for double-reservation entry · d72c6774
      Paul Rich authored
      This was traced to a call that could cause a non-string key to be added
      to the alps_reservation dictionary, resulting in a version of the
      reservation with an integer jobid key and a second with a string jobid
      key.  These should be keyed with strings.
      
      As further mitigation, added a check during cleanup for an integer
      version of the key.  If there is one, then notify that it happened
      and clean that one, too.
      
      Triggering condition is an interactive job where the initial ALPS
      reservation times out.
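The key normalization and the extra cleanup check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual Cobalt code: the function names `add_reservation` and `clean_reservation` and the dictionary name `alps_reservations` are hypothetical, and the sketch assumes the reservations live in a plain dict keyed by jobid.

```python
import logging

logger = logging.getLogger(__name__)


def add_reservation(alps_reservations, jobid, reservation):
    """Store a reservation, always keyed by the string form of jobid.

    Hypothetical helper: coercing here means 42 and "42" cannot end up
    as two separate dictionary entries.
    """
    alps_reservations[str(jobid)] = reservation


def clean_reservation(alps_reservations, jobid):
    """Remove the reservation for jobid and return the list of keys removed.

    As further mitigation, also look for a stray integer-keyed copy,
    report it, and remove it as well.
    """
    removed = []
    str_key = str(jobid)
    if str_key in alps_reservations:
        del alps_reservations[str_key]
        removed.append(str_key)
    try:
        int_key = int(jobid)
    except (TypeError, ValueError):
        int_key = None  # jobid is not numeric, so no integer twin can exist
    if int_key is not None and int_key in alps_reservations:
        logger.warning("Integer jobid key %r found in reservations; cleaning it",
                       int_key)
        del alps_reservations[int_key]
        removed.append(int_key)
    return removed
```

In Python a dict treats `42` and `"42"` as distinct keys, which is how two versions of the same reservation could coexist; coercing at insertion time and sweeping for the integer twin at cleanup time covers both directions.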
  7. 30 Jun, 2017 1 commit
  8. 27 Jun, 2017 2 commits
  9. 23 Jun, 2017 1 commit
  10. 19 Jun, 2017 3 commits
  11. 08 Jun, 2017 1 commit
  12. 18 May, 2017 1 commit
  13. 01 May, 2017 1 commit
  14. 14 Apr, 2017 1 commit
  15. 13 Apr, 2017 2 commits
  16. 12 Apr, 2017 1 commit
  17. 11 Apr, 2017 2 commits
  18. 10 Apr, 2017 1 commit
  19. 07 Apr, 2017 1 commit
  20. 24 Jan, 2017 2 commits
  21. 11 Jan, 2017 1 commit
  22. 05 Jan, 2017 1 commit
  23. 04 Jan, 2017 1 commit
    • Fix for nodes getting hung up in cleanup-pending state · 5f751a1a
      Paul Rich authored
      A well-timed (or poorly timed, depending on how you look at it) qdel could
      cause Cobalt to put a node into cleanup but never complete the cleanup
      because there was no ALPS backend reservation to clean up.  This would
      clear if no jobs were currently running; otherwise, the nodes would hang.
  24. 08 Dec, 2016 2 commits
  25. 06 Dec, 2016 1 commit
  26. 05 Dec, 2016 1 commit
    • Fixing a potential startup error in the base forker · 3fe028b9
      Paul Rich authored
      Old versions of the forker do not have use_stdout_string, which can cause
      _wait() to fail.  Getting out of this would require deleting the
      statefile and restarting clean.
      
      To prevent that, startup is being modified to add those key
      variables and initialize them to "unused".
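One way to picture the startup fix is a small upgrade step applied to state restored from an older statefile. The sketch below is only an assumption about the general approach: `upgrade_child_state` and `UNUSED_DEFAULTS` are hypothetical names, the state is modeled as a plain dict, and `stdout_string` is an invented companion field. Only `use_stdout_string` is named in the commit message.

```python
# Hypothetical defaults for fields that newer forker code expects to exist.
# Only use_stdout_string comes from the commit message; stdout_string is an
# illustrative placeholder.
UNUSED_DEFAULTS = {
    "use_stdout_string": False,
    "stdout_string": None,
}


def upgrade_child_state(state):
    """Return a copy of a restored child-state dict with missing fields
    filled in as "unused", leaving any values already present untouched."""
    upgraded = dict(state)
    for name, default in UNUSED_DEFAULTS.items():
        upgraded.setdefault(name, default)
    return upgraded


# Usage: state written by an older forker lacks the newer field.
old_state = {"pid": 1234}
new_state = upgrade_child_state(old_state)
```

Filling in defaults at load time means code such as `_wait()` can read the field unconditionally, and the statefile does not have to be deleted to recover.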
  27. 02 Dec, 2016 1 commit
  28. 28 Nov, 2016 3 commits