Linux server restart script: a hard drive monitoring script for headless Linux servers - 白红宇

Published: 2019-05-11

Modern hard drives have a built-in mechanism called S.M.A.R.T. that makes it possible to know when a disk is about to fail. Wouldn’t it be nice if the server emailed you before such a failure?

Overview

Programs like “mdadm” (for software RAID management) and the “Palimpsest Disk Utility” (used on the Ubuntu LiveCD) use the S.M.A.R.T. information to tell you when a disk is about to fail or has already failed. However, on a headless server (no GUI), there is no service that will inform you of the pending doom before it is too late. Moreover, how would you know about it without manually logging into the server?

This script, when run once a day with cron, checks whether the bad sector count of any of the system’s hard drives has reached a limit that is deliberately lower than the “the disk is bad” threshold, and if so emails a warning to the machine’s administrator.

Prerequisites and assumptions

You have already set up email support for the server using the “” guide.

You’re using a Debian-based system.

You’re not using a *hardware RAID controller.

You will see me use VIM as the editor. This is just because I’m used to it; you may use any other editor you like.

*A hardware RAID controller may well block the system’s access to this information.

Setup

Install the “smartmontools” package, which reads the S.M.A.R.T. information from the hard drive controller and presents it to us.

sudo aptitude install smartmontools

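Before relying on the monitor script, it may be worth confirming that the tool is installed. The following is a minimal sketch and not part of the original script; the helper name `check_smartctl` and the device `/dev/sda` in the hint are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical pre-flight helper (not part of the original script):
# reports whether the smartctl binary can be found on the PATH.
check_smartctl() {
    if command -v smartctl >/dev/null 2>&1; then
        echo "smartctl found"
    else
        echo "smartctl missing"
        return 1
    fi
}

if check_smartctl; then
    # /dev/sda is an assumed device name; adjust it for your system.
    # -i prints identity information, -A prints the attribute table
    # that the monitor script parses.
    echo "Next, try: sudo smartctl -i /dev/sda && sudo smartctl -A /dev/sda"
fi
```

If `smartctl -A` prints no attribute table for a device, the monitor script has nothing to read for that disk.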

Create the monitor script:

sudo vim /root/smart-monitor.sh

Give it the following content:

#!/bin/bash
########Email function########
# Builds a small mail file and hands it to ssmtp. $1 is the message body.
email_admin_func()
{
    echo "To: machine-admin@some-domain.com" > "$temp_email_file"
    echo "From: machine-name@some-domain.com" >> "$temp_email_file"
    echo "Subject: S.M.A.R.T monitor Threshold breached" >> "$temp_email_file"
    echo "" >> "$temp_email_file"
    echo -e "$1" >> "$temp_email_file"
    /usr/sbin/ssmtp -t < "$temp_email_file"
    echo "Sent an Email to the Admin"
}

smartc_func()
{
    # Print the raw Reallocated_Sector_Ct value for /dev/$1.
    /usr/sbin/smartctl -A /dev/$1 | grep Reallocated_Sector_Ct | tr -s ' ' | cut -d' ' -f11
}

########End of Functions########

########Set working parameter########
temp_email_file=/tmp/smart_monitor.txt
allowed_threshold=5 # Set the number of bad sectors you're willing to live with; 5 is recommended.

########Engine########
for i in sda sdb ; do # Add or remove disk names from this list as appropriate for your setup.
    if [[ "$(smartc_func $i)" -ge $allowed_threshold ]] ; then
        echo Emailing the Administrator
        email_admin_func "One of the HDs on $(hostname) has reached the upper threshold limit!!! \nThe threshold was set to: $allowed_threshold and the $i disk status was: $(smartc_func $i)"
    fi
done

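The extraction done by `smartc_func` can be tried without touching a disk by feeding it one attribute row. This is a sketch only: the sample row is made up in the column layout that `smartctl -A` prints, and the names `parse_realloc` and `parse_realloc_awk` are hypothetical. The awk variant is an alternative, not the author's method; it splits on runs of whitespace, so it does not depend on the leading spaces that the `cut -f11` approach relies on.

```shell
#!/bin/bash
# Illustrative only: a made-up row in the layout of "smartctl -A" output
# (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw).
sample_line='  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       3'

# Same pipeline as smartc_func, but reading from stdin instead of a disk.
parse_realloc() {
    grep Reallocated_Sector_Ct | tr -s ' ' | cut -d' ' -f11
}

# Hypothetical awk variant: the raw value is the 10th whitespace-separated
# column, regardless of how many leading spaces the row has.
parse_realloc_awk() {
    awk '/Reallocated_Sector_Ct/ { print $10 }'
}

echo "$sample_line" | parse_realloc       # prints: 3
echo "$sample_line" | parse_realloc_awk   # prints: 3
```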

The key points to note are:

Email function – Set the appropriate information like the machine name and administrator email.

Allowed threshold – Set this parameter to what you feel is appropriate. I have used 5 because the limit set for the “server grade” hard drives I’ve used was 10. (I’ve found the threshold for “consumer grade” drives to be as high as 140.)

Set the devices that you want to monitor by adjusting the enumeration of disk names in the “for” loop. Currently two disks (sda & sdb) are included, so adjust for your setup. You may include all of your disks or just some, if you need to *exclude a disk for some reason.

*In my original setup the first disk was a flash drive, so reading its information, if that is possible at all, isn’t of much use.

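The threshold comparison can also be sketched in isolation. One caveat: inside `[[ ... -ge ... ]]` an empty string is evaluated as 0, so a disk that returns no value at all (for example, one excluded for the reason above) would pass silently. The sketch below, using the hypothetical helper name `classify_count`, treats an empty or non-numeric reading as its own case. It is an illustration under these assumptions, not a replacement for the script.

```shell
#!/bin/bash
allowed_threshold=5   # same default as in the script

# Hypothetical helper: classify a reallocated-sector reading.
# Prints "ok", "alert", or "unknown" (empty or non-numeric reading).
classify_count() {
    local count=$1
    if ! [[ $count =~ ^[0-9]+$ ]]; then
        echo "unknown"
    elif (( 10#$count >= allowed_threshold )); then
        # 10# forces base 10 so a reading with leading zeros is not read as octal.
        echo "alert"
    else
        echo "ok"
    fi
}

classify_count 0     # prints: ok
classify_count 5     # prints: alert
classify_count ""    # prints: unknown
```

Whether an "unknown" reading should trigger an email is a judgment call; for a monitored disk that suddenly stops reporting, it probably should.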

Make the script executable:

sudo chmod +x /root/smart-monitor.sh

The setup is done.

Schedule the script to be run automatically

We want the script to run automatically, so we will create a new cron job for it. As stated in the “” guide, the upshot of doing so is that if the script itself encounters an error, cron will automatically inform us by email as soon as it happens.

Open the cron job scheduler:
