servst

Differences

This shows you the differences between two versions of the page.

servst [2024/06/27 11:29] – created - external edit 127.0.0.1
servst [2025/07/17 04:34] (current) zhangyk
Line 1: Line 1:
 <markdown> <markdown>
-gpust
 +servst
  
-[Gitlab](http://101.6.120.23:8081/zhangyuankuan/servst)
 +[Gitlab](https://www.frcbs.tsinghua.edu.cn/gitlab/zhangyuankuan/servst)
  
-## Description
 +## 1. Description
-This is a set of tools to inspect the utilization of each server.
 +`servst` is primarily a collection of tools for inspecting the utilization
 +status of each server. I later added support for MD tasks, so that they can be
 +listed or killed conveniently and thoroughly.
  
-- `gpust` and `cpust` will help you get the status of server.
 +- `gpust` and `cpust` show the GPU and CPU status of the servers
-- `gpurf` and `gpurf` will refresh the whole or just part of the servers.
 +- `gpurf` and `cpurf` refresh the status information for all or specific servers
 +- `lsmd` and `lsmds` fetch the latest progress of MD tasks on the servers
 +- `killmd` kills the targeted MD tasks
  
-## Installation
 +## 2. Installation
  
-You don't have to install anything actually. I have add the commands as global
-aliases.
 +There is no need for installation. The commands are available as global aliases.
  
-BUT, if you want to use the refreshing commands, password-free configuration
-would be a must.
 +However, password-free configuration is recommended for `gpurf`, `cpurf`
 +and `lsmds`.
  
-## Usage
 +## 3. Usage
  
-For `gpust` and `cpust`:
 +### 3.1 For gpust and cpust:
  
-Everyone can type the command on any server to get the
-status of any server's GPU cards. If asked, type the password of `rainbow`.
 +These commands can be run on any server to get the GPU or CPU status of any server.
 +If prompted, enter the password of `rainbow`.
  
-For `gpurf` and `cpurf`:
 +```
 ++---------------------------- GPU STATUS ----------------------------+ 
 +| Server: name of the server                                         | 
 +| Occupation: * for occupied GPU, 4 for free 4080 card, and so on.   | 
 +| processing servers:                                                | 
 +| yellow purple orange indigo gold green white red blue              | 
 +| 2024-07-10 10:00:01                                                | 
 ++--------------------------------------------------------------------+
  
-1. `gpurf` will update the status of all servers. If not having set up
-password-free configuration, one would be asked for password for 11 times...
 +Server  Occupation      Last Updated
 +blue    33              2024-07-10 10:00:01
 +gold    3               2024-07-10 10:00:01 
 +green                 2024-07-10 10:00:01 
 +indigo  33              2024-07-10 10:00:01 
 +orange  33              2024-07-10 10:00:01 
 +purple  4*******        2024-07-10 10:00:01 
 +red     33              2024-07-10 10:00:01 
 +yellow  444*****        2024-07-10 10:00:01 
 +```
  
-2. `gpurf yellow` will update the status of just the wanted server `yellow`. You
-shall be able to log in it without password or just type the corresponding
-password when asked. What about the other servers then? Don't worry. Their old
-status will be shown.
 +```
 ++---------------------------- CPU STATUS ----------------------------+
 +| Server: name of the server                                         |
 +| Total: total number of cores in one server                         |
 +| Idle: average number of idle (not used) cores in the last 5 seconds| 
 +| processing servers:                                                | 
 +| yellow purple orange indigo gold green white violet black          | 
 +| 2024-07-10 10:00:01                                                | 
 ++--------------------------------------------------------------------+
  
 +Server  Total   Idle    Last Updated
 +black   56      28      2024-07-10 10:00:01
 +gold    56      32      2024-07-10 10:00:01
 +green   24      22      2024-07-10 10:00:01
 +indigo  56      15      2024-07-10 10:00:01
 +orange  56      31      2024-07-10 10:00:01
 +purple  96      10      2024-07-10 10:00:01
 +violet  56      32      2024-07-10 10:00:01
 +yellow  96      06      2024-07-10 10:00:01
 +```
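The `cpust` table is plain whitespace-separated text, so it is easy to post-process. The following is a minimal sketch, not part of `servst`, that parses rows in the format shown above and ranks servers by idle cores. The names `CpuRow` and `parse_cpust`, and the idea of feeding captured `cpust` output to a script, are assumptions made for illustration.

```python
from typing import NamedTuple


class CpuRow(NamedTuple):
    server: str
    total: int
    idle: int
    last_updated: str


def parse_cpust(text: str) -> list[CpuRow]:
    """Parse data rows shaped like 'name total idle YYYY-MM-DD HH:MM:SS'."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly five fields with numeric total/idle columns;
        # the banner and header lines do not, so they are skipped.
        if len(parts) == 5 and parts[1].isdigit() and parts[2].isdigit():
            rows.append(CpuRow(parts[0], int(parts[1]), int(parts[2]),
                               f"{parts[3]} {parts[4]}"))
    return rows


# A few rows copied from the example output above.
SAMPLE = """\
Server  Total   Idle    Last Updated
black   56      28      2024-07-10 10:00:01
green   24      22      2024-07-10 10:00:01
purple  96      10      2024-07-10 10:00:01
yellow  96      06      2024-07-10 10:00:01
"""

rows = parse_cpust(SAMPLE)
# Rank by idle cores, most idle first.
ranked = sorted(rows, key=lambda r: r.idle, reverse=True)
for r in ranked:
    print(f"{r.server:<8}{r.idle:>3}/{r.total} idle")
```

Ranking by `r.idle / r.total` instead would favour small, mostly empty servers such as `green`; which ordering is more useful depends on the job size.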
 +
 +### 3.2 For gpurf and cpurf:
 +
 +1. `gpurf` updates the status of all servers. Without password-free configuration,
 +you will need to enter a password 11 times, so you are unlikely to use this
 +form.
 +
 +2. `gpurf yellow` updates the status of only the requested server, `yellow`. You
 +must either have password-free access or provide the password of `yellow` and then
 +that of `rainbow` when prompted. The statuses of the other servers remain as
 +previously recorded. The `Last Updated` column shows when each server's status
 +was last refreshed.
 +
 +```
 ++---------------------------- GPU STATUS ----------------------------+
 +| Server: name of the server                                         |
 +| Occupation: * for occupied GPU, 4 for free 4080 card, and so on.   |
 +| processing servers:                                                |
 +| yellow purple orange indigo gold green white red blue              |
 +| 2024-07-10 10:00:01                                                |
 ++--------------------------------------------------------------------+
 +
 +Server  Occupation      Last Updated
 +blue    33              2024-07-10 10:00:01
 +gold    3               2024-07-10 10:00:01
 +green                 2024-07-10 10:00:01
 +indigo  33              2024-07-10 10:00:01
 +orange  33              2024-07-10 10:00:01
 +purple  4*******        2024-07-10 10:00:01
 +red     33              2024-07-10 10:00:01
 +white                 2024-07-10 10:00:01
 +yellow  444*****        2024-07-10 10:26:54
 +```
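According to the banner, `*` marks an occupied GPU and a digit marks a free card (`4` for a free 4080). As a hedged illustration of reading the `Occupation` column, the sketch below counts free and occupied cards per row. The function names are hypothetical, and treating a row with a blank occupation field as an empty string is an assumption about how such rows should be handled.

```python
def summarize_occupation(occupation: str) -> dict:
    """Count occupied ('*') and free (digit) GPUs in an Occupation string."""
    occupied = occupation.count("*")
    free = sum(1 for ch in occupation if ch.isdigit())
    return {"free": free, "occupied": occupied}


def parse_gpust_row(line: str):
    """Split one data row into (server, occupation, last_updated)."""
    parts = line.split()
    if len(parts) == 4:          # name, occupation, date, time
        server, occupation = parts[0], parts[1]
    elif len(parts) == 3:        # occupation column is blank (assumed empty)
        server, occupation = parts[0], ""
    else:
        raise ValueError(f"unexpected row: {line!r}")
    return server, occupation, " ".join(parts[-2:])


# Rows copied from the example output above.
for row in ("purple  4*******        2024-07-10 10:00:01",
            "yellow  444*****        2024-07-10 10:26:54",
            "white                 2024-07-10 10:00:01"):
    server, occupation, updated = parse_gpust_row(row)
    print(server, summarize_occupation(occupation), updated)
```

Comparing the `updated` value across rows is also a quick way to spot which servers were refreshed by the last `gpurf` call, as with `yellow` in the example.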
 +
 +### 3.3 For lsmd and lsmds
 +
 +1. `lsmd` reports the status of MD tasks on the current server. `lsmds` reports the
 +statuses for several servers.
 +
 +2. `lsmds purple` fetches the information for the requested server, `purple`.
 +
 +```
 +purple
 +/home/zhangyk/tmp9/5_ff14SB_1/8ubuild/5_run/run00094.nc
 +/home/zhangyk/tmp9/4_ff99SBildn_1/8ubuild/5_run/run00094.nc
 +/home/zhangyk/tmp9/6_ff19SB_1/8ubuild/5_run/run00180.nc
 +yellow
 +
 +red
 +/mnt/d4/zhangyk/tmp2/6_ff19SB_1/8ubuild/5_run/run00930.nc
 +blue
 +/home/zhangyk/tmp7/3_ff99SB_1/8ubuild/5_run/run00301.nc
 +/home/zhangyk/tmp6/4_ff99SBildn_1/8ubuild/5_run/run00516.nc
 +orange
 +/home/zhangyk/tmp6/3_ff99SB_1/8ubuild/5_run/run00374.nc
 +/home/zhangyk/tmp6/1_ff94_1/8ubuild/5_run/run00403.nc
 +indigo
 +/home/zhangyk/tmp7/2_ff99_1/8ubuild/5_run/run00378.nc
 +/home/zhangyk/tmp7/1_ff94_1/8ubuild/5_run/run00385.nc
 +gold
 +/home/zhangyk/tmp7/4_ff99SBildn_1/8ubuild/5_run/run00081.nc
 +green
 +/mnt/d8/zhangyk/tmp2/7_charmm22_1/8ubuild/5_run/run00964.nc
 +```
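The `lsmds` output alternates server names with the paths of trajectory files. A minimal sketch of grouping that output by server is given below. It is not taken from `lsmd.py`; the function name `parse_lsmds` is hypothetical, and reading the number in `runNNNNN.nc` as a progress index is an assumption based on the file names shown.

```python
import re
from collections import defaultdict

# Assumption: the digits in 'runNNNNN.nc' indicate how far a task has progressed.
RUN_RE = re.compile(r"run(\d+)\.nc$")


def parse_lsmds(text: str) -> dict:
    """Group trajectory paths under the server name that precedes them."""
    tasks = defaultdict(list)
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("/"):
            if current is None:
                raise ValueError("path appeared before any server name")
            match = RUN_RE.search(line)
            run = int(match.group(1)) if match else None
            tasks[current].append((line, run))
        else:
            current = line
            tasks[current]  # register the server even if it has no tasks
    return dict(tasks)


# Excerpt of the example output above.
SAMPLE = """\
purple
/home/zhangyk/tmp9/5_ff14SB_1/8ubuild/5_run/run00094.nc
/home/zhangyk/tmp9/6_ff19SB_1/8ubuild/5_run/run00180.nc
yellow

red
/mnt/d4/zhangyk/tmp2/6_ff19SB_1/8ubuild/5_run/run00930.nc
"""

result = parse_lsmds(SAMPLE)
for server, entries in result.items():
    print(server, [run for _, run in entries])
```

Servers with no running tasks, such as `yellow` in the excerpt, map to an empty list rather than being dropped.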
 +
 +### 3.4 For killmd
 +
 +The help message can be displayed by typing `killmd -h`.
 +```
 +usage: killmd.py [-h] [-a] [-p PID] [-g GPU]
 +
 +Kill series of md tasks instantly.
 +
 +options:
 +  -h, --help         show this help message and exit
 +  -a, --all          Kill all md tasks
 +  -p PID, --pid PID  Process ID to kill
 +  -g GPU, --gpu GPU  GPU id
 +```
 +
 +- `killmd -a` kills all MD tasks on the current server.
 +- `killmd -p 12345` kills the task with PID `12345`, as well as related tasks.
 +- `killmd -g 0` kills the tasks running on `GPU 0`.
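For readers who want to script around `killmd`, the option set in the help text can be mirrored with `argparse`. The sketch below is not the source of `killmd.py`: it only reproduces the three documented options and reports what would be selected, without killing anything. The integer types for `PID` and `GPU`, the precedence among the options, and the helper `describe` are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed in the `killmd -h` output."""
    parser = argparse.ArgumentParser(
        prog="killmd.py",
        description="Kill series of md tasks instantly.",
    )
    parser.add_argument("-a", "--all", action="store_true",
                        help="Kill all md tasks")
    # Assumption: PID and GPU id are integers; the help text does not say.
    parser.add_argument("-p", "--pid", type=int, help="Process ID to kill")
    parser.add_argument("-g", "--gpu", type=int, help="GPU id")
    return parser


def describe(args: argparse.Namespace) -> str:
    """Describe the selection; this sketch never kills a process."""
    # Assumption: -a takes precedence over -p, which takes precedence over -g.
    if args.all:
        return "all MD tasks on this server"
    if args.pid is not None:
        return f"task {args.pid} and related tasks"
    if args.gpu is not None:   # 'is not None' so that GPU 0 is accepted
        return f"tasks on GPU {args.gpu}"
    return "nothing selected"


parser = build_parser()
for argv in (["-a"], ["-p", "12345"], ["-g", "0"]):
    print(argv, "->", describe(parser.parse_args(argv)))
```

Checking `args.gpu is not None` rather than the truthiness of `args.gpu` matters here, because `0` is a valid GPU id.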
 +
 +## 4. Acknowledgement
 +
 +Thanks to Prof. Xue and Zhewei Qiu. I optimized their script 
 +`checkamber.py` to get `lsmd.py`.
 </markdown> </markdown>
servst.1719487743.txt.gz · Last modified: 2024/06/27 11:29 by 127.0.0.1