Bash scripting: check whether a process is running and kill it after a certain time

I have been doing some bash scripting lately. I am a total noob when it comes to scripting, mostly because I kind of hate programming. Still, anyone who has done bash scripting on Linux knows how powerful it is, and even more so if you know awk. Anyway, I had a problem that I needed to solve recently. So here goes:

I have a script that runs lots of rsync commands. I need to be able to kill an rsync command if it takes too long to run. Most of the time this happens because the target server has a slow connection.

To kill the rsync command, I need to know which process ID (PID) it is running as. In bash, you can find the PID of the currently running script by reading $$ inside the script:

test1.sh
#!/bin/bash
PID=$$
echo $PID

root:# ./test1.sh
21334

There is one problem with the script above, at least for my use case: it is NOT what I need. As I said, the script runs an rsync command. $$ only gives the PID of the current script, but I need the PID of the rsync command. Look at the example below:

test2.sh
#!/bin/bash
rsync -a --dry-run /home /tmp
PID=$$
echo $PID

root:# ./test2.sh
21355

The output above actually shows the PID of the test2.sh script. However, I need the PID of the child of this script, which is the rsync command. I searched around, and so far the only way I have found to do it is $!. It holds the PID of the process that was most recently sent into the background. Therefore we need to send the rsync command into the background.

test3.sh
#!/bin/bash
PID=$$
rsync -a --dry-run /home /tmp &
PID2=$!
echo $PID
echo $PID2

root:# ./test3.sh
21366
21367

There! You can see that $PID2 now shows the PID of the rsync process. If your /home folder has lots of files, the rsync will take some time to run. Even though you get your prompt back, rsync is still running in the background. In that case, you can open another ssh session and use the command “ps aux | grep rsync” to see the actual PID, which should match the value of $PID2. rsync may start helper processes of its own, so you might see more than one rsync line; $PID2 is the one the script launched.
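If you want to check this without opening a second session, here is a minimal sketch of the same idea. It assumes sleep 5 as a harmless stand-in for rsync, and is_running is just a helper name I made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper (not part of the scripts above): succeeds if a
# process with the given PID exists and we are allowed to signal it.
# kill -0 does not send any signal, it only performs the check.
is_running() {
    kill -0 "$1" 2> /dev/null
}

sleep 5 &            # stand-in for the rsync command (assumption)
child_pid=$!         # PID of the job that was just put in the background

echo "script PID: $$"
echo "child PID:  $child_pid"

if is_running "$child_pid"; then
    echo "child is still running"
fi

# Clean up the stand-in and collect its exit status
kill -TERM "$child_pid"
status=0
wait "$child_pid" 2> /dev/null || status=$?
echo "child exit status: $status"
```

The two PIDs printed are different, which is the whole point: $$ belongs to the script and $! to the background job. A process that dies from SIGTERM normally reports 143 (128 + 15), so that is what the last line prints for sleep.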

Now let’s make things more complicated.

Why do I need this $PID2? So that I can kill the process if it takes too long to run. I do not want one process hogging the line while another rsync is waiting for its turn.

Here is the last part of the script with everything thrown in:

test4.sh
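#!/bin/bash
# Start rsync in the background so that $! holds its PID
rsync -a --dry-run /home /tmp &
PID2=$!
count=0
waittime=30   # seconds to wait before giving up
# kill -0 sends no signal; it only tests whether the process still exists
while kill -0 "$PID2" 2> /dev/null
do
    sleep 1
    count=$((count + 1))
    if [ "$count" -ge "$waittime" ] ; then
        # Took too long: ask rsync to terminate and stop polling
        kill -TERM "$PID2" 2> /dev/null
        break
    fi
done
# Collect the exit status of the background job
wait "$PID2"
rcode=$?
echo "Return code is $rcode"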

OK, this is the complete code. Run it, play with the settings, and see what happens. Again, rsync is sent to the background while a counter keeps track of how long it has been running. That is what the while loop is for. Each pass through the loop sleeps for 1 second. The loop exits when either the process completes or the wait time expires. In our example, the process is killed after about 30 seconds of waiting. Once the loop is done, notice the command “wait ${PID2}”. It is needed to get the return code of that process. In my script, this return code tells me whether rsync completed successfully.
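Since my real script runs many rsync commands one after another, the loop can be wrapped in a function. This is only a sketch under my own assumptions: run_with_timeout is a name I invented, and sleep stands in for rsync so it is safe to try anywhere:

```shell
#!/bin/bash
# Hypothetical wrapper around the polling loop: run a command in the
# background and send it SIGTERM if it is still alive after LIMIT seconds.
# Usage: run_with_timeout LIMIT COMMAND [ARGS...]
run_with_timeout() {
    local limit=$1
    shift
    local elapsed=0 pid status=0

    "$@" &
    pid=$!

    while kill -0 "$pid" 2> /dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            kill -TERM "$pid" 2> /dev/null
            break
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done

    # wait returns the job's exit status, even if it finished a while ago
    wait "$pid" 2> /dev/null || status=$?
    return "$status"
}

# Stand-in jobs (assumption): the first finishes in time, the second does not
rc_fast=0
run_with_timeout 3 sleep 1 || rc_fast=$?
rc_slow=0
run_with_timeout 3 sleep 10 || rc_slow=$?

echo "fast job returned $rc_fast"
echo "slow job returned $rc_slow"
```

With sleep, the job that gets cut off returns 143 (128 + 15 for SIGTERM). A program that handles SIGTERM itself, which I believe rsync does, can report its own exit code instead, so check the EXIT VALUES section of the rsync man page before relying on a specific number.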

I like this script because it is smart and works well within the limitations of bash scripting. It sends the process to the background purely to get its PID from $!. There are other ways to do this, but so far I like this one because it is simple to understand and needs fewer extra commands such as awk.
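For comparison, one of those other ways skips the PID tracking completely and hands the job to the timeout command from GNU coreutils. This is a different technique from the loop above, and the sketch assumes that timeout is installed on your system. Again, sleep stands in for rsync:

```shell
#!/bin/bash
# Alternative sketch: GNU coreutils timeout sends SIGTERM to the command
# when the limit is reached and then exits with status 124.
rc=0
timeout 2 sleep 10 || rc=$?   # sleep 10 is a stand-in for rsync (assumption)

if [ "$rc" -eq 124 ]; then
    echo "command timed out"
else
    echo "command finished with return code $rc"
fi
```

The trade-off is that you no longer have the PID in your own script, so you cannot do anything else with the process while it runs.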

Reference: http://www.unix.com/shell-programming-scripting/20412-check-if-job-still-alive-killing-after-certain-walltime.html

