Advanced Job Scheduling?
Kagato asks: "I'm trying to make my company's Unix boxes more mission critical in the area of job scheduling. Scheduling jobs in Unix has been around since the dawn of time. On most systems you have 'cron' and 'at' to provide most of your scheduling needs. But outside the basic world of 'do this at such time' there are a slew of commercial products that handle dependencies, failure routes, monitoring, dependent notification, etc. Commercial products of this type have been around for years. Is there anything like this available in the GNU and Open Source worlds? I've been looking at Freshmeat, SourceForge and Google. I've found the pickings for advanced scheduling are pretty slim."
Freshmeat? SourceForge? Or Google? Oh, you have? Crap.
Why not fork?
There may be no open source products out there that match the functionality of the currently available commercial scheduling products.
So here's what you do. Get a dollar figure from management that represents just how "mission critical" job scheduling is at your work. That number becomes your scheduling budget.
If that number is too low to buy software, then I guess scheduling isn't all that critical at your business after all.
Unfortunately, I've found commercial job scheduling software to often be less reliable than cron/at jobs. It does have a few nice features that are not available in cron/at.
1. The ability to list ancestor/descendant jobs: the first job(s) must complete before the next job is kicked off. Of course, you must break up your job into smaller components.
2. Cross-platform scheduling: the ability to schedule jobs on more than one platform. I'm sure there are plenty of ways to schedule jobs to be kicked off on NT or whatnot, but what about the mainframe?
3. Central log maintenance, which, if done correctly, can keep the jobs in sync. That can be vitally important when you've got jobs that span your entire environment.
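For what it's worth, the first item (ancestor/descendant jobs) can be roughly approximated on a single box with plain shell. This is only a minimal sketch under stated assumptions: `run_chain` is a made-up name, each argument is one job's command line, and it does nothing about cross-platform scheduling or central logging.

```shell
#!/bin/sh
# Hypothetical sketch: run jobs in order and stop at the first failure,
# so descendant jobs never start unless every ancestor succeeded.
# Each argument is the command line (a single string) for one job.
run_chain() {
    for job in "$@"; do
        echo "starting: $job"
        if ! sh -c "$job"; then
            echo "failed: $job (descendant jobs skipped)" >&2
            return 1
        fi
    done
    echo "chain complete"
}
```

A call would look something like `run_chain "extract.sh" "transform.sh" "load.sh"`, where the three script names are placeholders for the smaller components the job was broken into.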
I really wish there was a Unix-based solution that encompassed all of these. There's probably a good reason why there isn't an open source/"free" alternative for this process: the people who need it are less likely to use a free product. You're dealing with people so entrenched in archaic business practices that it is sometimes difficult to authorize the use of Perl in your environment without going through weeks of business justification.
It's easy to set up an existing framework to work correctly on Unix. Computer Associates has one. At times it seems to be the most bass-ackwards implementation I have seen, but then I have to remember it was originally designed for the mainframe.
We are currently finalizing a scheduling tool purchase for the company I work for. We have taken a look at the commercial job schedulers available and we are down to the two that best fit our needs: #1 - Tidal Enterprise Scheduler and #2 - Job Scheduler. We are choosing them for a Windows 2000 platform, but they all have agents for Unix and other platforms as well. Here are the others we looked at:
ActiveBatch32
UC4
Unicenter Autosys Job Management
Control-M
I wish this was a post back in August.
Good Luck!
What I would really like to see is a HOWTO that gives a good overview of scheduling and clustering. Everything I have found so far is not so good.
Basically, what you need to do is wrap the commands you are scheduling in a shell script and call that script from cron instead. The shell script then takes responsibility for any error handling, email/SMS/pager notifications, failover, or whatever, based on return codes, error messages, etc. I've usually found that for most sites it's possible to write a generic template script, plus a small set of support scripts that do the notifications and whatnot, that covers more than 75% of cron jobs with no major customisation beyond the exit code "case" statement and the command to be executed.
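A generic template along those lines might look something like this. It is a rough sketch, not anyone's production script: `notify` and `run_job` are made-up names, the notifier only writes to stderr as a stand-in for real mail/SMS/pager delivery, and the meaning given to exit code 1 is an assumption each site would replace.

```shell
#!/bin/sh
# Hypothetical cron wrapper: run a command, then act on its exit code.

# Placeholder notifier. It only writes to stderr here; a real site would
# swap in mail, SMS, or pager delivery.
notify() {
    echo "NOTIFY: $*" >&2
}

# run_job NAME COMMAND [ARGS...]
# Runs COMMAND and dispatches on its exit code in the "case" statement,
# which is the part expected to change from job to job.
run_job() {
    name=$1
    shift
    rc=0
    "$@" || rc=$?
    case $rc in
        0) ;;                                   # success: stay quiet
        1) notify "$name: warning (exit 1)" ;;  # assumed site-specific meaning
        *) notify "$name: FAILED (exit $rc)" ;;
    esac
    return $rc
}

# When invoked with arguments (for example from a crontab entry),
# wrap the given command and pass its exit code back to cron.
if [ "$#" -gt 0 ]; then
    run_job "$@"
    exit $?
fi
```

The crontab entry then calls the wrapper rather than the job itself, e.g. `0 2 * * * /usr/local/bin/cronwrap.sh nightly-backup /usr/local/bin/backup.sh`, where both paths and the job name are placeholders.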
UNIX? They're not even circumcised! Savages!
GNU Queue offers batch scheduling for clusters of computers; however, a cluster only needs to contain a single computer.
One additional commercial tool we use where I work is Platform Computing's Load Sharing Facility. It works well, but it's expensive (read "over priced") and I suggest you try something else first.
I've found there is very little I can't do with cron and scripting - in fact I, like many others, have cron jobs that check up on other system processes.
This all worked well until I had cron die on a SCO box. I eventually figured out what job screwed it up but that screwed up everything else that cron managed and left me feeling rather uneasy about relying on cron (I mean, if it can be killed by an errant script...).
So...I've been considering launching cron from init with the respawn option to ensure that it stays running. Does anyone see a problem with this?
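For reference, the sort of check-up job mentioned above can be sketched as a small function that tests whether the process named in a PID file is still alive. This is only an illustration: `is_alive` is a made-up name and the PID file location is whatever your daemon happens to use. A check like this would also have to be started by something other than cron if the process it is watching is cron itself.

```shell
#!/bin/sh
# Hypothetical sketch: report whether the process recorded in a PID file
# is still running. "kill -0" sends no signal; it only tests whether the
# process exists and can be signalled, so run this as root or as the
# owner of the process being checked.
is_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1      # no readable PID file: treat as dead
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1          # empty PID file: treat as dead
    kill -0 "$pid" 2>/dev/null
}
```

A check-up script could then do something like `is_alive /path/to/daemon.pid || restart_it`, with both the path and `restart_it` standing in for site-specific pieces.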
~~~~~~~
"You are not remembered for doing what is expected of you." - Atul Chitnis
FWIW, I'm in an environment that uses Autosys for intelligent scheduling. It seems to work pretty well. I really like the dependencies and everything else you can set in it. Of course, the only similar thing I have used is cron, and this is light years better than cron when it comes to all the factors described in the article.