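The disks on our school's compute cluster frequently hang and throw errors.
For example, when I run df, it hangs when it reaches the directory where the compute storage is mounted.
dmesg also shows a lot of related errors: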
LustreError: Skipped 16 previous similar messages
Lustre: 4440:0:(import.c:517:import_select_connection()) wk2-OST0000-osc-ffff81042ee37000: tried all connections, increasing latency to 25s
Lustre: 4440:0:(import.c:517:import_select_connection()) Skipped 9 previous similar messages
LustreError: 11-0: an error occurred while communicating with 192.168.170.233@o2ib. The ost_connect operation failed with -30
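
As a first step toward automated detection, here is a rough sketch of how the "operation failed with -N" lines could be pulled out of dmesg text and decoded. It assumes the number is a negated Linux errno value (on Linux, 30 is EROFS, "Read-only file system"); the function name decode_lustre_failures and the regular expression are my own and not part of any Lustre tool.

```python
import errno
import os
import re

# Matches the tail of a LustreError line such as
# "The ost_connect operation failed with -30".
# Assumption: the number is a negated Linux errno value.
FAILED_OP = re.compile(r"The (\w+) operation failed with -(\d+)")


def decode_lustre_failures(log_text):
    """Return a list of (operation, errno_name, description) tuples."""
    # Pasted dmesg output is often hard-wrapped, so collapse all
    # whitespace runs into single spaces before matching.
    flat = " ".join(log_text.split())
    results = []
    for operation, code in FAILED_OP.findall(flat):
        number = int(code)
        # errno tables are platform-specific; run this on the Linux client.
        name = errno.errorcode.get(number, "UNKNOWN")
        results.append((operation, name, os.strerror(number)))
    return results
```

Feeding it the output of dmesg (for example from a cron job) and alerting whenever the list is non-empty would at least flag the problem early. It does not fix anything by itself.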
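How can this kind of disk read problem be detected and resolved automatically?
Is there a standard operating procedure (SOP) for it?
Thanks!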
--