Thanks, that clears things up considerably. I have to admit I don't use
screen and am just going by the man page, so the syntax may be off on my suggestions. But this should at least get you started.
You can write a script to kill the misbehaving daemon. Processes attached to a screen session are just normal processes, and can be interacted with using normal commands like ps and kill.
Code:
#!/bin/sh
# No spaces around "=" in a shell assignment
count=0
# Signal 0 just tests if a process exists
while killall -s 0 DreamDaemon >/dev/null 2>&1 ; do
    # Exit with error if we have tried 10 times unsuccessfully
    if [ "$count" -ge 10 ] ; then exit 1 ; fi
    killall DreamDaemon >/dev/null 2>&1
    count=$(( count + 1 ))
    sleep 1
done
This will try up to 10 times to kill any process named "DreamDaemon". If at any iteration of the loop there is no process named "DreamDaemon" running, it exits successfully. If the process is still there after the 10th attempt, it gives up and exits with an error. This is to alert whoever is running the script (monit) that there was a problem.
If you want to kill the screen session also, that requires something fancier. It might be better to leave it running, so you can attach later and see if there was any worthwhile output. To kill both, maybe something like this:
Code:
#!/bin/sh
# Get process ID of first program named "DreamDaemon".
# killall -v reports "Killed DreamDaemon(PID) with signal 0" on stderr;
# sed -n prints only lines where a PID in parentheses was found.
daemonpid=$(killall -v -s 0 DreamDaemon 2>&1 | sed -n 's/^.*(\([0-9]*\)).*$/\1/p' | head -n 1)
# Nothing to kill if no daemon is running
[ -n "$daemonpid" ] || exit 0
# Get process ID of parent process (screen)
parentpid=$(ps -p "$daemonpid" -o ppid --no-headers)
count=0
while kill -s 0 $daemonpid $parentpid >/dev/null 2>&1 ; do
    # Exit with error if we have tried 10 times unsuccessfully
    if [ "$count" -ge 10 ] ; then exit 1 ; fi
    kill $daemonpid $parentpid >/dev/null 2>&1
    count=$(( count + 1 ))
    sleep 1
done
A note of warning - this is vulnerable to a race condition. If the first kill attempt succeeds, but while the script is sleeping another process starts with either the daemon or parent PID, the script will kill that process (PIDs get reused). I wouldn't recommend putting this into production without some tweaking.
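One possible tweak for the race described above is to check that a PID still belongs to the expected command right before signalling it. This is only a sketch: kill_if_named is a hypothetical helper name, and it assumes a Linux procps-style ps where "-o comm=" prints the bare command name. It narrows the window for hitting a reused PID but does not close it completely.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM to a PID only if that PID is
# currently running the expected command name.
kill_if_named() {
    pid=$1
    name=$2
    # "comm=" asks ps for the command name with no header line
    actual=$(ps -p "$pid" -o comm= 2>/dev/null)
    # Refuse to signal if the PID is gone or now belongs to something else
    [ "$actual" = "$name" ] || return 1
    kill "$pid" 2>/dev/null
}
```

In the loop above you would then call something like kill_if_named "$daemonpid" DreamDaemon instead of a bare kill, and do the same for the parent with whatever name ps reports for the screen process on your system.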
You can apparently also send commands to a running screen session using screen -X command, but that's an exercise I'll leave to you.
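As a starting point for that exercise, here is a minimal sketch, again going by the man page rather than experience. It assumes the session was started with a name via -S; the name "dreamdaemon" and the helper stop_session are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: ask a named screen session to shut itself down.
stop_session() {
    # -S selects the session by name, -X sends it a screen command;
    # "quit" kills all windows and terminates that screen session.
    screen -S "$1" -X quit
}
```

You would call it as stop_session dreamdaemon, after starting the session with something like screen -d -m -S dreamdaemon followed by the daemon command line. I haven't checked how DreamDaemon reacts when its terminal goes away, so test that before relying on it.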
You would also need a way to start the daemon; it looks like something along the lines of screen -d -m /path/to/DreamDaemon codebase.dmb -trusted -logself -core -thread on 12345 would work.
So when configuring monit, you would set the stop command to be your kill script, and the start command to something like what's in the paragraph above. monit can be configured to alert you if either command exits with an error status.
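A rough sketch of what that monit entry might look like follows. The script paths and the service name "dreamdaemon" are placeholders I made up, and I'm assuming you let monit find the process by name pattern rather than by pidfile. Check it against the monit documentation for your version before using it.

```
# Hypothetical monit service entry; paths and names are placeholders
check process dreamdaemon matching "DreamDaemon"
    start program = "/usr/local/bin/dreamdaemon-start.sh"
    stop program  = "/usr/local/bin/dreamdaemon-kill.sh"
```

Here dreamdaemon-kill.sh would be one of the kill scripts above, and dreamdaemon-start.sh a small wrapper around the screen -d -m command. Use full paths everywhere, since monit does not run these through your login shell.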
Edits: modified scripts to only try 10 times (if it hasn't responded by then, it likely never will) and added warning about second script.