- 11 Apr, 2017 1 commit
-
-
Paul Rich authored
-
- 17 Feb, 2017 4 commits
- 16 Feb, 2017 2 commits
- 14 Feb, 2017 3 commits
- 27 Jan, 2017 1 commit
-
-
Paul Rich authored
-
- 25 Jan, 2017 2 commits
- 24 Jan, 2017 4 commits
-
-
Paul Rich authored
This changes the call and the behavior so that we no longer try to redrain blocks, which was the originally intended behavior.
-
Paul Rich authored
-
Paul Rich authored
-
Paul Rich authored
Uncovered another bug while working on this that the old behavior had masked: there was a way to get a block to ignore the scheduled flag as well.
-
- 11 Jan, 2017 1 commit
-
-
Paul Rich authored
-
- 06 Jan, 2017 2 commits
- 05 Jan, 2017 2 commits
- 04 Jan, 2017 2 commits
-
-
Paul Rich authored
Fix for nodes getting hung up in cleanup-pending state. A well-timed (or poorly timed, depending on how you look at it) qdel could cause Cobalt to put a node into cleanup but never complete the cleanup, because there was no ALPS backend reservation to clean up. This would clear if no jobs were currently running; otherwise, it would hang nodes. Closes #56. See merge request !26.
-
Paul Rich authored
A well-timed (or poorly timed, depending on how you look at it) qdel could cause Cobalt to put a node into cleanup but never complete the cleanup, because there was no ALPS backend reservation to clean up. This would clear if no jobs were currently running; otherwise, it would hang nodes.
-
- 09 Dec, 2016 7 commits
-
-
Paul Rich authored
-
Paul Rich authored
-
Paul Rich authored
-
Paul Rich authored
Adding in build files for the RESERVATION_SUMMARY view for CDB. Adding in templates and build script for the reservation summary view from Gabe West. Closes #52. See merge request !24.
-
Paul Rich authored
Qsub path can now be specified for eLogin qsubs. This prevents us from getting an unwrapped qsub.py, or a qsub in a different location on the mom than on the eLogin host, when using qsub -I from an eLogin node on Cray systems. Closes #54. See merge request !25.
-
Paul Rich authored
This prevents us from getting an unwrapped qsub.py, or a qsub in a different location on the mom than on the eLogin host, when using qsub -I from an eLogin node on Cray systems.
-
Paul Rich authored
-
- 08 Dec, 2016 6 commits
-
-
Paul Rich authored
-
Paul Rich authored
Fixing a possible pid leak if 'apbasil' task cleanup is interrupted. There was the possibility of losing a PID if child cleanup was interrupted. This ensures retries until the child process is actually dead. Closes #46. See merge request !21.
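The retry-until-dead idea in this message can be illustrated with a minimal sketch. This is not Cobalt's forker code: `reap_child`, `demo`, and the retry and delay values are hypothetical names and numbers chosen for illustration, and the sketch assumes a POSIX system.

```python
import os
import subprocess
import sys
import time


def reap_child(pid, retries=50, delay=0.1):
    """Poll a child until it is confirmed dead and reaped.

    Returns the raw wait status, or None if the child was already reaped.
    Hypothetical helper; not taken from Cobalt.
    """
    for _ in range(retries):
        try:
            waited_pid, status = os.waitpid(pid, os.WNOHANG)
        except InterruptedError:
            # Interrupted mid-cleanup: keep the PID and try again.
            continue
        except ChildProcessError:
            # No such child: it has already been reaped elsewhere.
            return None
        if waited_pid == pid:
            return status
        # Child is still running; wait a little before the next attempt.
        time.sleep(delay)
    raise RuntimeError("child %d still alive after %d attempts" % (pid, retries))


def demo():
    """Spawn a short-lived child and reap it with the helper above."""
    proc = subprocess.Popen([sys.executable, "-c", "pass"])
    status = reap_child(proc.pid)
    proc.returncode = status  # tell Popen the child has already been reaped
    return status
```

The point of the loop is that the PID is only dropped once `os.waitpid` confirms the child is gone, so an interrupted attempt cannot lose track of it.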
-
Paul Rich authored
If the child fetch succeeds but cleanup fails, make sure we use the initially fetched data, rather than replacing it with the now potentially lost child data.
-
Paul Rich authored
-
Paul Rich authored
Resolve "Restart failure with system_script_forker using old statefile". Closes #53. See merge request !22.
-
Paul Rich authored
After discussion, the current algorithm for determining backfill time needs to be replaced, and the replacement needs to depend on which blocks are selected for draining. This commit is for the current algorithm's optimistic and pessimistic backfill modes.
-
- 06 Dec, 2016 1 commit
-
-
Paul Rich authored
New tests pending, but the optimistic mode backfiller does appear to be working properly. Old behavior is preserved and may be enabled by setting the mode to pessimistic.
-
- 05 Dec, 2016 1 commit
-
-
Paul Rich authored
Old versions of the forker do not have use_stdout_string, which can cause _wait() to fail. Getting out of this would require deleting the statefile and restarting clean. To prevent that, startup is being modified to add the missing key variables and initialize them to "unused".
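A minimal sketch of this kind of startup upgrade, assuming the restored state is a plain dictionary. Only `use_stdout_string` comes from the message above; `STATE_DEFAULTS`, `upgrade_state`, the `id` key, and the choice of `False` to mean "unused" are assumptions made for illustration, not Cobalt's actual statefile layout.

```python
# Keys that newer code expects, with assumed "unused" defaults.
# Only use_stdout_string is named in the commit message; False is a guess
# at what "unused" means for it.
STATE_DEFAULTS = {"use_stdout_string": False}


def upgrade_state(state):
    """Return a copy of a restored state dict with missing keys filled in.

    Existing values are never overwritten, so a current statefile passes
    through unchanged.
    """
    upgraded = dict(state)
    for key, default in STATE_DEFAULTS.items():
        upgraded.setdefault(key, default)
    return upgraded


old_child = {"id": 7}  # shape of a record written by an older forker (assumed)
new_child = upgrade_state(old_child)
print(new_child["use_stdout_string"])
```

Filling in defaults at load time means code such as `_wait()` can read the key unconditionally, without the statefile having to be deleted first.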
-
- 02 Dec, 2016 1 commit
-
-
Paul Rich authored
-