servst

Differences

This shows you the differences between two versions of the page.

servst [2025/07/09 13:41] zhangykservst [2025/07/17 04:34] (current) zhangyk
Line 1: Line 1:
 <markdown> <markdown>
-gpust+servst
  
-[Gitlab](https://www.frcbs.tsinghua.edu.cn/gitlab/zhangyuankuan/servst#)+[Gitlab](https://www.frcbs.tsinghua.edu.cn/gitlab/zhangyuankuan/servst)
  
-## Description +## 1. Description 
-This is a set of tools to inspect the utilization of each server.+`servst` is primarily a collection of tools for inspecting the utilization  
 +status of each server. Support for MD tasks was added later, so that they can  
 +be listed or killed conveniently and thoroughly.
  
-- `gpust` and `cpust` will help you get the status of server. +- `gpust` and `cpust` provide status for servers 
-- `gpurf` and `gpurfwill refresh the whole or just part of the servers.+- `gpurf` and `cpurf` refresh status information for all or specific servers 
 +- `lsmd`  and `lsmds` fetch the latest progress of MD tasks for servers 
 +- `killmd` kills the targeted MD tasks
  
-## Installation+## 2. Installation
  
-You don't have to install anything actuallyI have add the commands as global  +There is no need for installation. The commands are available as global aliases.
-aliases.+
  
-BUT, if you want to use the refreshing commands, password-free configuration  +However, password-free configuration is recommended for `gpurf`, `cpurf`,  
-would be a must.+and `lsmds`.
  
-## Usage+## 3. Usage
  
-For `gpustand `cpust`:+### 3.1 For gpust and cpust:
  
-Everyone can type the command on any server to get the  +These commands can be run on any server to get the GPU or CPU status of any server. 
-status of any server's GPU cards. If askedtype the password of `rainbow`.+If prompted, enter the password for `rainbow`.
  
-For `gpurfand `cpurf`:+``` 
 ++---------------------------- GPU STATUS ----------------------------+ 
 +| Server: name of the server                                         | 
 +| Occupation: * for occupied GPU, 4 for free 4080 card, and so on.   | 
 +| processing servers:                                                | 
 +| yellow purple orange indigo gold green white red blue              | 
 +| 2024-07-10 10:00:01                                                | 
 ++--------------------------------------------------------------------+
  
-1. `gpurf` will update the status of all servers. If not having set up  +Server  Occupation      Last Updated 
-password-free configuration, one would be asked for password for 11 times...+blue    33              2024-07-10 10:00:01 
 +gold    3               2024-07-10 10:00:01 
 +green                 2024-07-10 10:00:01 
 +indigo  33              2024-07-10 10:00:01 
 +orange  33              2024-07-10 10:00:01 
 +purple  4*******        2024-07-10 10:00:01 
 +red     33              2024-07-10 10:00:01 
 +yellow  444*****        2024-07-10 10:00:01 
 +```
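
The `Occupation` legend above can also be read programmatically. The following is a minimal Python sketch, assuming that every character other than `*` is the model code of one free card (as in "4 for free 4080 card"). `summarize_occupation` is a hypothetical helper for illustration and is not part of `servst`.

```python
from collections import Counter


def summarize_occupation(occupation: str) -> dict:
    """Count occupied and free GPUs in one Occupation string.

    Assumption: '*' marks an occupied GPU, and any other non-space
    character is the model code of a free card (e.g. '4' for a 4080).
    """
    occupied = occupation.count("*")
    free_by_code = Counter(
        ch for ch in occupation if ch != "*" and not ch.isspace()
    )
    return {"occupied": occupied, "free": dict(free_by_code)}


# The row for `yellow` in the sample output above.
print(summarize_occupation("444*****"))
# {'occupied': 5, 'free': {'4': 3}}
```

An empty `Occupation` field simply yields zero occupied and zero free cards.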
  
-2. `gpurf yellowwill update the status of just the wanted server `yellow`. You +```
-shall be able to log in it without password or just type the corresponding  ++---------------------------- CPU STATUS ----------------------------+ 
-password when asked. What about the other servers then? Don't worry. Their old  +| Server: name of the server                                         | 
-status will be shown.+| Total: total number of cores in one server                         | 
 +| Idle: average number of idle (not used) cores in the last 5 seconds| 
 +| processing servers:                                                | 
 +| yellow purple orange indigo gold green white violet black          | 
 +| 2024-07-10 10:00:01                                                | 
 ++--------------------------------------------------------------------+
  
 +Server  Total   Idle    Last Updated
 +black   56      28      2024-07-10 10:00:01
 +gold    56      32      2024-07-10 10:00:01
 +green   24      22      2024-07-10 10:00:01
 +indigo  56      15      2024-07-10 10:00:01
 +orange  56      31      2024-07-10 10:00:01
 +purple  96      10      2024-07-10 10:00:01
 +violet  56      32      2024-07-10 10:00:01
 +yellow  96      06      2024-07-10 10:00:01
 +```
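
To pick the server with the most idle cores, the table can be parsed as whitespace-separated columns. This is a rough Python sketch, assuming the layout shown above (one header row, then `Server Total Idle Last Updated`). `parse_cpu_rows` is a hypothetical helper and not one of the `servst` commands.

```python
def parse_cpu_rows(table: str) -> list[dict]:
    """Parse the CPU status table (without the boxed legend) into dicts."""
    rows = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        # Split into at most 4 fields so the timestamp keeps its inner space.
        server, total, idle, updated = line.split(None, 3)
        rows.append(
            {
                "server": server,
                "total": int(total),
                "idle": int(idle),  # int("06") == 6
                "updated": updated.strip(),
            }
        )
    return rows


# A few rows copied from the sample output above.
SAMPLE = """\
Server  Total   Idle    Last Updated
gold    56      32      2024-07-10 10:00:01
purple  96      10      2024-07-10 10:00:01
yellow  96      06      2024-07-10 10:00:01
"""

rows = parse_cpu_rows(SAMPLE)
most_idle = max(rows, key=lambda r: r["idle"])
print(most_idle["server"])  # gold
```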
 +
 +### 3.2 For gpurf and cpurf:
 +
 +1. `gpurf` updates the status of all servers. Without password-free configuration,
 +you will need to enter a password 11 times, so you are unlikely to use this
 +form.
 +
 +2. `gpurf yellow` updates the status of only the specified server, `yellow`. You
 +must either have password-free access or enter the password for `yellow` and then 
 +for `rainbow` when prompted. The statuses of the other servers remain as 
 +previously recorded. The `Last Updated` column shows when each server's status 
 +was last refreshed.
 +
 +```
 ++---------------------------- GPU STATUS ----------------------------+
 +| Server: name of the server                                         |
 +| Occupation: * for occupied GPU, 4 for free 4080 card, and so on.   |
 +| processing servers:                                                |
 +| yellow purple orange indigo gold green white red blue              |
 +| 2024-07-10 10:00:01                                                |
 ++--------------------------------------------------------------------+
 +
 +Server  Occupation      Last Updated
 +blue    33              2024-07-10 10:00:01
 +gold    3               2024-07-10 10:00:01
 +green                 2024-07-10 10:00:01
 +indigo  33              2024-07-10 10:00:01
 +orange  33              2024-07-10 10:00:01
 +purple  4*******        2024-07-10 10:00:01
 +red     33              2024-07-10 10:00:01
 +white                 2024-07-10 10:00:01
 +yellow  444*****        2024-07-10 10:26:54
 +```
 +
 +### 3.3 For lsmd and lsmds
 +
 +1. `lsmd` gets the status of MD tasks on the current server. `lsmds` gets the
 +statuses for several servers.
 +
 +2. `lsmds purple` fetches the information for the specified server, `purple`.
 +
 +```
 +purple
 +/home/zhangyk/tmp9/5_ff14SB_1/8ubuild/5_run/run00094.nc
 +/home/zhangyk/tmp9/4_ff99SBildn_1/8ubuild/5_run/run00094.nc
 +/home/zhangyk/tmp9/6_ff19SB_1/8ubuild/5_run/run00180.nc
 +yellow
 +
 +red
 +/mnt/d4/zhangyk/tmp2/6_ff19SB_1/8ubuild/5_run/run00930.nc
 +blue
 +/home/zhangyk/tmp7/3_ff99SB_1/8ubuild/5_run/run00301.nc
 +/home/zhangyk/tmp6/4_ff99SBildn_1/8ubuild/5_run/run00516.nc
 +orange
 +/home/zhangyk/tmp6/3_ff99SB_1/8ubuild/5_run/run00374.nc
 +/home/zhangyk/tmp6/1_ff94_1/8ubuild/5_run/run00403.nc
 +indigo
 +/home/zhangyk/tmp7/2_ff99_1/8ubuild/5_run/run00378.nc
 +/home/zhangyk/tmp7/1_ff94_1/8ubuild/5_run/run00385.nc
 +gold
 +/home/zhangyk/tmp7/4_ff99SBildn_1/8ubuild/5_run/run00081.nc
 +green
 +/mnt/d8/zhangyk/tmp2/7_charmm22_1/8ubuild/5_run/run00964.nc
 +```
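
The listing above can be grouped per server for further processing. This is a small Python sketch, assuming that lines starting with `/` are trajectory paths and that any other non-empty line is a server name, as in the sample. `group_by_server` and `RUN_RE` are hypothetical names, not part of `lsmd.py`.

```python
import re

# Assumption: the trajectory file name ends in run<digits>.nc, as in the sample.
RUN_RE = re.compile(r"run(\d+)\.nc$")


def group_by_server(output: str) -> dict:
    """Map each server name to a list of (path, run_index) tuples."""
    tasks = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            if current is None:
                continue  # a path before any server name: ignore it
            match = RUN_RE.search(line)
            run = int(match.group(1)) if match else None
            tasks[current].append((line, run))
        else:
            current = line
            tasks[current] = []
    return tasks


# A shortened copy of the sample output above.
SAMPLE = """\
purple
/home/zhangyk/tmp9/5_ff14SB_1/8ubuild/5_run/run00094.nc
/home/zhangyk/tmp9/6_ff19SB_1/8ubuild/5_run/run00180.nc
yellow

red
/mnt/d4/zhangyk/tmp2/6_ff19SB_1/8ubuild/5_run/run00930.nc
"""

grouped = group_by_server(SAMPLE)
for server, entries in grouped.items():
    print(server, [run for _, run in entries])
```

A server with no running MD task, such as `yellow` here, maps to an empty list.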
 +
 +### 3.4 For killmd
 +
 +Run `killmd -h` to show the usage guide.
 +```
 +usage: killmd.py [-h] [-a] [-p PID] [-g GPU]
 +
 +Kill series of md tasks instantly.
 +
 +options:
 +  -h, --help         show this help message and exit
 +  -a, --all          Kill all md tasks
 +  -p PID, --pid PID  Process ID to kill
 +  -g GPU, --gpu GPU  GPU id
 +```
 +
 +- `killmd -a` kills all MD tasks on the current server.
 +- `killmd -p 12345` kills the task with PID `12345`, as well as its related tasks.
 +- `killmd -g 0` kills the tasks running on `GPU 0`.
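
For readers who want to see how such an interface could be declared, here is an `argparse` sketch that mirrors the options in the help text above. It is not the actual `killmd.py` source. The integer types for `PID` and `GPU` are assumptions, and `describe` is a hypothetical helper that only reports the selected action without killing anything.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same options as the help text above."""
    parser = argparse.ArgumentParser(
        prog="killmd.py",
        description="Kill series of md tasks instantly.",
    )
    parser.add_argument("-a", "--all", action="store_true",
                        help="Kill all md tasks")
    # Assumption: PID and GPU id are integers.
    parser.add_argument("-p", "--pid", type=int, help="Process ID to kill")
    parser.add_argument("-g", "--gpu", type=int, help="GPU id")
    return parser


def describe(argv: list) -> str:
    """Return a description of the selected action (no process is touched)."""
    args = build_parser().parse_args(argv)
    if args.all:
        return "kill all MD tasks on this server"
    if args.pid is not None:
        return f"kill the task with PID {args.pid} and its related tasks"
    if args.gpu is not None:
        return f"kill the tasks running on GPU {args.gpu}"
    return "nothing selected"


print(describe(["-p", "12345"]))
print(describe(["-g", "0"]))
```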
 +
 +## 4. Acknowledgement
 +
 +Thanks to Prof. Xue and Zhewei Qiu. `lsmd.py` is an optimized version of 
 +their script `checkamber.py`.
 </markdown> </markdown>
servst.1752068460.txt.gz · Last modified: 2025/07/09 13:41 by zhangyk