15 Commits

Author SHA1 Message Date
Ingo Oppermann
aa3a5b4978
Prevent panic if index is out of bounds 2024-10-31 12:17:53 +01:00
Ingo Oppermann
55015bcf6f
Read out GPU specs at util start 2024-10-30 17:12:29 +01:00
Ingo Oppermann
de9a30a108
Add internal mock modules 2024-10-29 14:55:55 +01:00
Ingo Oppermann
2ee7fa7e41
Make resources the only direct user of psutil 2024-10-29 12:25:39 +01:00
Ingo Oppermann
fbf62bf7e5
Remove Start() function, rename Stop() to Cancel() 2024-10-28 17:12:31 +01:00
Ingo Oppermann
412fbedea3
Make psutil a submodule of resources, remove default psutil 2024-10-28 16:13:13 +01:00
Ingo Oppermann
2dbe5b5685
Add GPU support 2024-10-24 15:08:26 +02:00
Ingo Oppermann
644185dd50
Merge branch 'vod' into psutil_gpu 2024-08-19 12:43:47 +02:00
Ingo Oppermann
d391e274d7
Fix wrong memory limit, add total memory, add cpu and memory consumed by core itself to node resources 2024-07-25 21:13:49 +02:00
Ingo Oppermann
7fa47a962a
Add basic nvidia-smi parser 2024-07-16 08:14:19 +02:00
Ingo Oppermann
480dbb7f53
Refactor cluster node code 2024-07-09 12:26:02 +02:00
Ingo Oppermann
022c5c1a6d
Emit warnings 2023-09-11 14:42:46 +02:00
Ingo Oppermann
51d8b30e8f
Fix MaxCPU and MaxMemory semantics
If a limit of 0 (or negative) is given for both cpu and memory, then
no limiting will be triggered. If any value between 1 and 100 (inclusive)
is given, then limiting will be triggered when that limit is reached.

I.e. giving a limit of 100 does not mean unlimited.
2023-07-12 11:53:39 +02:00
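
The limit semantics described in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the names `limits`, `hasLimits`, and `exceeded` are hypothetical, and "reached" is interpreted here as usage greater than or equal to the limit. The commit message does not say how values above 100 are handled, so the sketch leaves them untreated.

```go
package main

import "fmt"

// limits is a hypothetical container for the two settings named in the
// commit message. Both values are percentages.
type limits struct {
	maxCPU    float64 // <= 0 disables CPU limiting; 1..100 is an active limit
	maxMemory float64 // <= 0 disables memory limiting; 1..100 is an active limit
}

// hasLimits reports whether at least one limit is active.
// (Assumption: a single positive value is enough to enable limiting.)
func (l limits) hasLimits() bool {
	return l.maxCPU > 0 || l.maxMemory > 0
}

// exceeded reports whether usage has reached the given limit.
// A limit of 0 or less never triggers. A limit of 100 is a real limit
// that triggers at 100% usage; it does not mean "unlimited".
func exceeded(limit, usage float64) bool {
	if limit <= 0 {
		return false
	}
	return usage >= limit
}

func main() {
	l := limits{maxCPU: 100, maxMemory: 0}

	fmt.Println("has limits:", l.hasLimits())
	fmt.Println("cpu limited at 100% usage:", exceeded(l.maxCPU, 100))
	fmt.Println("memory limited at 100% usage:", exceeded(l.maxMemory, 100))
}
```

In this sketch, disabling limiting altogether means setting both values to 0 (or a negative number); setting them to 100 would still trigger once usage hits 100%.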
Ingo Oppermann
519f39b217
Fix returning wrong value for HasLimits 2023-07-12 11:38:40 +02:00
Ingo Oppermann
fc03bf73a2
Make resource manager a main module and expose more details 2023-06-06 21:28:08 +02:00