765 Commits

Author SHA1 Message Date
Ingo Oppermann
644185dd50
Merge branch 'vod' into psutil_gpu 2024-08-19 12:43:47 +02:00
Ingo Oppermann
72a3b8c17d
Update dependencies 2024-08-19 12:03:34 +02:00
Ingo Oppermann
d6d39f162a
Adding a TODO 2024-07-26 12:38:21 +02:00
Ingo Oppermann
b9baa17b0c
Fix writing string to prelude tail 2024-07-26 11:47:40 +02:00
Ingo Oppermann
70a49f8bdb
Process []byte instead of string in parser 2024-07-26 11:31:47 +02:00
Ingo Oppermann
d391e274d7
Fix wrong memory limit, add total memory, add cpu and memory consumed by core itself to node resources 2024-07-25 21:13:49 +02:00
Ingo Oppermann
e54bb4ee7c
Remove debug dockerfile 2024-07-25 21:12:06 +02:00
Ingo Oppermann
6101585fd2
Extract Linux-specific code from psutils for reading CPU times 2024-07-25 09:30:11 +02:00
Ingo Oppermann
46950372be
WIP: Optimize copy from io.Reader, allow to suggest file size 2024-07-24 15:40:28 +02:00
Ingo Oppermann
79791d190b
Optimize isDir function on memfs 2024-07-24 12:55:28 +02:00
Ingo Oppermann
0a74470d38
Don't mark processes as errNotEnoughResourcesForDeployment when budget has been used up 2024-07-24 12:54:45 +02:00
Ingo Oppermann
28e1325eb2
Prevent sending RESUME if process is already resumed 2024-07-23 16:07:48 +02:00
Ingo Oppermann
54b1fe8e86
Dump casbin, replace with own policy enforcer 2024-07-23 15:54:09 +02:00
Ingo Oppermann
879819f10f
Retrieve current process from leader, clone metadata, introduce new state 'deploying' 2024-07-22 16:58:57 +02:00
Ingo Oppermann
9e52f19a66
Introduce synchronize budget, experimental 2024-07-22 09:25:23 +02:00
Ingo Oppermann
85011cb947
Don't upload existing binaries 2024-07-19 16:03:37 +02:00
Ingo Oppermann
61f9de0dd2
Don't trim paths in normal build 2024-07-19 16:03:09 +02:00
Ingo Oppermann
64ab09f4fc
Fix warning 2024-07-19 16:02:42 +02:00
Ingo Oppermann
a3948b597d
Return uploaded process config 2024-07-19 16:02:17 +02:00
Ingo Oppermann
b160e604d2
Don't import metadata, leads to race condition 2024-07-19 16:01:50 +02:00
Ingo Oppermann
308f008969
Only compare configs if the process will get replaced 2024-07-19 16:00:45 +02:00
Ingo Oppermann
688450f341
Add nil checks, add NewTask function 2024-07-19 12:26:47 +02:00
Ingo Oppermann
72883d18d4
Remove bottlenecks in process handling, still some rough edges 2024-07-18 17:16:49 +02:00
Ingo Oppermann
8a28e2cf96
Update dependencies 2024-07-17 16:58:45 +02:00
Ingo Oppermann
15e1cd7b6f
Use puzpuzpuz/xsync.MapOf for tasks, abstract tasks 2024-07-17 16:54:26 +02:00
Ingo Oppermann
4d0eed092e
Return error from ClusterProcessList, remove ProcessFindNodeID 2024-07-17 16:50:39 +02:00
Ingo Oppermann
6f524f5991
Use store.ProcessGetNode function 2024-07-17 16:49:31 +02:00
Ingo Oppermann
db564de1f1
Use store.ProcessGetNode function 2024-07-17 16:49:09 +02:00
Ingo Oppermann
e12fb0be52
Fix cluster shutdown, limit parallel opstack worker 2024-07-17 16:48:33 +02:00
Ingo Oppermann
3df1075548
Add ProcessGetNode function 2024-07-17 16:47:00 +02:00
Ingo Oppermann
88739e3f7f
Cosmetics 2024-07-17 16:45:33 +02:00
Ingo Oppermann
f56c0dde14
Use alpine3.20 as base image 2024-07-17 16:44:29 +02:00
Ingo Oppermann
3becd86f60
Return errors 2024-07-17 16:43:41 +02:00
Ingo Oppermann
de1c42e969
Don't do a file listing if no patterns are defined 2024-07-16 14:40:45 +02:00
Ingo Oppermann
96f7d8030c
Disable locally persisting DB in cluster mode 2024-07-16 14:01:31 +02:00
Ingo Oppermann
7fa47a962a
Add basic nvidia-smi parser 2024-07-16 08:14:19 +02:00
Ingo Oppermann
3d78122053
Fix crash when updating unavailable node 2024-07-16 08:13:15 +02:00
Ingo Oppermann
b9796a46f2
Add test for type conversion 2024-07-12 09:00:47 +02:00
Ingo Oppermann
cb9ce6f1dc
Fix nil pointer dereference 2024-07-11 12:33:51 +02:00
Ingo Oppermann
7e90bb87ce
Allow to import report history for a process 2024-07-10 16:46:49 +02:00
Ingo Oppermann
480dbb7f53
Refactor cluster node code 2024-07-09 12:26:02 +02:00
Ingo Oppermann
28603aab98
Incorporate process throttling into deploy decision, fix bug in rebalance, parallelize opstack 2024-06-26 17:03:42 +02:00
Ingo Oppermann
ca177becfa
Fix tests 2024-06-24 17:37:04 +02:00
Ingo Oppermann
c032cdf5c7
Add API for setting node status, respect it in leader tasks 2024-06-24 16:50:15 +02:00
Ingo Oppermann
166e313642
Fix tests 2024-06-19 15:38:42 +02:00
Ingo Oppermann
a9d6b1ec49
Add API endpoints for relocating processes 2024-06-19 15:28:30 +02:00
Ingo Oppermann
de6a267fd4
Add operations to relocate processes 2024-06-18 16:50:59 +02:00
Ingo Oppermann
f5d9725a48
Return proper HTTP status on leave 2024-06-12 15:08:07 +02:00
Ingo Oppermann
1a64fddbb1
Allow cluster leave endpoint to remove any node in the cluster 2024-06-07 11:28:54 +02:00
Ingo Oppermann
0f344f1998
Allow to send leave request to any node for any node 2024-06-06 13:20:49 +02:00