Slashdot Mirror


Resources On Practical Job Scheduling?

felciano asks: "We have a fairly involved content build process that involves about 20 different jobs that have dependencies between them and are of sometimes radically different running times. Some jobs are parallelizable. We have a suite of machines across which we distribute the load right now, but the actual job-to-machine allocation is done by hand. I'm trying to do capacity and hardware purchase planning, and am getting a headache trying to model this. I've tried doing it in both MS Excel and MS Project, but neither seems suited for this type of model, especially when it comes to asking "what if" questions (e.g. if we bought 3 more machines, how much would that buy us in overall build time?). I realize that scheduling, network flow, etc. problems can get into hairy (sometimes NP) problem spaces, but given the rise in popularity of Linux compute farms as ways to address scalability problems, I would assume that there are at least some basic tools to help model this. I've looked at distributed job-scheduling software and haven't found much on the automated side -- are there any practical toolkits, apps, libraries, etc. to help do this type of modeling and planning in other ways?"

1 of 32 comments

  1. Looks like my systems course really has a use. by Christopher Thomas · Score: 3

    This is the kind of problem that queueing theory was designed for. Pick up a textbook from your local university bookstore if you're interested in the topic. It will let you estimate fairly easily how varying the system parameters affects performance.

    I'm afraid I don't know what commercial packages handle this, though. We used a high-level system simulation tool called "MAP", but it wasn't very intuitive to use. Better solutions certainly exist.
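
    For a rough idea of what such a "what if" estimate could look like, here is a minimal sketch. It is not queueing theory proper: it swaps in a simple greedy list-scheduling simulation (longest ready job first) and assumes identical machines and fixed, known running times. The job names, durations, dependencies, and the makespan function are all hypothetical, made up for illustration.

```python
import heapq


def makespan(durations, deps, machines):
    """Simulate greedy list scheduling and return the total build time.

    durations: {job: running time}
    deps:      {job: set of jobs that must finish first}
    machines:  number of identical machines (an assumption; a real
               farm may be heterogeneous)
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    # Unfinished prerequisites per job, and the reverse mapping.
    remaining = {j: set(deps.get(j, ())) for j in durations}
    dependents = {j: [] for j in durations}
    for job, prereqs in remaining.items():
        for p in prereqs:
            dependents[p].append(job)
    # Ready jobs, longest first (heapq is a min-heap, so negate).
    ready = [(-durations[j], j) for j, pre in remaining.items() if not pre]
    heapq.heapify(ready)
    running = []  # min-heap of (finish_time, job)
    now, free, done = 0, machines, 0
    while ready or running:
        # Start as many ready jobs as there are free machines.
        while ready and free > 0:
            _, job = heapq.heappop(ready)
            heapq.heappush(running, (now + durations[job], job))
            free -= 1
        # Advance the clock to the next job completion.
        now, job = heapq.heappop(running)
        free += 1
        done += 1
        for d in dependents[job]:
            remaining[d].discard(job)
            if not remaining[d]:
                heapq.heappush(ready, (-durations[d], d))
    if done != len(durations):
        raise ValueError("dependency cycle detected")
    return now


# Hypothetical six-job build; times are in arbitrary units.
durations = {"fetch": 2, "parse": 4, "index": 6,
             "render": 5, "thumbs": 3, "package": 1}
deps = {"parse": {"fetch"}, "thumbs": {"fetch"},
        "index": {"parse"}, "render": {"parse"},
        "package": {"index", "render", "thumbs"}}

for m in (1, 2, 3):
    print(m, "machine(s):", makespan(durations, deps, m))
```

    With this made-up data the build takes 21 units on one machine and 13 on two, and a third machine buys nothing, because the chain fetch, parse, index, package already adds up to 13. A greedy schedule like this is not guaranteed to be optimal, so treat the numbers as estimates rather than bounds.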