- be able to handle dependencies
- work on a workstation/laptop and a cluster using some middleware (I guess Oracle Grid Engine)
- remember the tasks it has already computed
- allow workflows/scripts to be nested and multiple instances to be run
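To make the first and third requirements a bit more concrete, here is a rough sketch of what I mean, using only the Python standard library. It is not how any of the tools below work; the function name `run_workflow`, the JSON state file and the `tasks`/`deps` dictionaries are all made up for illustration, and `graphlib` needs Python 3.9 or later.

```python
import json
import os
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, deps, state_file):
    """Run tasks in dependency order, skipping those recorded as done.

    tasks: dict mapping task name -> zero-argument callable (hypothetical layout)
    deps:  dict mapping task name -> set of task names it depends on
    Returns the list of task names actually executed in this call.
    """
    # Load the names of tasks finished in earlier invocations, if any.
    done = set()
    if os.path.exists(state_file):
        with open(state_file) as fh:
            done = set(json.load(fh))

    executed = []
    graph = {name: deps.get(name, set()) for name in tasks}
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        if name in done:
            continue
        tasks[name]()
        done.add(name)
        executed.append(name)
        # Persist after every task so an interrupted run can resume.
        with open(state_file, "w") as fh:
            json.dump(sorted(done), fh)
    return executed
```

Calling `run_workflow` a second time with the same state file should execute nothing, which is the "remember what has already been computed" behaviour I am after. It says nothing about clusters or nesting, of course.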
Anyway, Mike suggested looking at ganga, which is a job creation framework written in Python. It supports different backends such as the local host and SGE, and it maintains state between invocations. Interestingly, it supports job trees, which would map nicely onto the problem at hand. So I think ganga needs to be seriously considered. It is licensed under the GPL, which might be problematic.
Another project that looks interesting is jug, another Python framework for tying together collections of tasks. Tasks are coordinated via files in a particular directory. This works over NFS and can therefore be used with SGE. Workers can be added dynamically; I wonder if they can also be removed.
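The idea of coordinating workers through files in a shared directory can be sketched in a few lines. This is my own illustration and not jug's actual implementation or API: the helper names `claim` and `work` and the `.lock` file naming are invented, and I am assuming that exclusive file creation is atomic on the shared filesystem, which is worth checking for a given NFS setup.

```python
import os


def claim(task_name, lock_dir):
    """Try to claim a task by atomically creating its lock file.

    Returns True if this worker got the task, False if another worker
    had already created the lock file.
    """
    path = os.path.join(lock_dir, task_name + ".lock")
    try:
        # O_EXCL makes the call fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as fh:
        fh.write(str(os.getpid()))  # record which process holds the task
    return True


def work(task_names, lock_dir, run):
    """Run every task this worker manages to claim; return their names."""
    mine = []
    for name in task_names:
        if claim(name, lock_dir):
            run(name)
            mine.append(name)
    return mine
```

Because each worker only looks at the directory, a new worker can join at any time by calling `work` on the same task list. What this sketch does not handle is a worker dying after it has claimed a task, which is exactly the removal question above.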
Finally, I also came across the wonderful GNU parallel program. It works similarly to xargs but will execute commands in parallel depending on the number of available cores (it also works with remote machines). This is brilliant for generating animations.
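For comparison, the core idea (run a list of independent commands, as many at a time as there are cores) looks roughly like this in Python. This is only a local approximation of what GNU parallel does, with no remote machines; the function name `run_parallel` is my own.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, jobs=None):
    """Run each command (an argument list) and return exit codes in input order.

    jobs defaults to the number of cores reported by the OS.
    """
    jobs = jobs or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Threads are sufficient here: each one just waits on a subprocess.
        results = pool.map(lambda cmd: subprocess.run(cmd).returncode, commands)
        return list(results)
```

For an animation, `commands` would be one rendering command per frame, assuming the frames can be rendered independently of each other.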