Parallel Processing Shell Script
PPSS is a Bash shell script that executes commands in parallel
Parallel Processing Shell Script (PPSS) is a Bash shell script that executes commands, scripts, or programs in parallel. PPSS, which is a Google Code project, is designed to make full use of multicore processors. PPSS detects the number of available CPUs and starts separate jobs for each core. You can browse the source code here.
Most modern computers have at least two processor cores. However, most programs and tasks do not benefit from the additional cores, because the software was written for sequential, single-core execution and is unaware of multiple cores.
According to the PPSS wiki, the idea behind PPSS is this: suppose you have a large number of files and want to perform some action on each of them. Instead of processing one file at a time, you want to process four files at a time, since you have a quad-core processor. However, you also need a system that keeps track of the separate jobs, starts new jobs when previous ones finish, and keeps track of which files have been processed. This is what PPSS does.
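To make the idea concrete, here is a minimal Bash sketch of a bounded job pool. It is not PPSS's own code, only an illustration of the technique described above; the function name run_parallel and its arguments are hypothetical, and it assumes max_jobs is at least 1.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of PPSS): read one item per line from stdin
# and run "<command> <item>" for each, with at most max_jobs running at once.
# Assumes max_jobs >= 1.
run_parallel() {
    local max_jobs=$1
    shift
    local item
    while IFS= read -r item; do
        # Pool is full: poll until one of the running jobs exits.
        while (( $(jobs -rp | wc -l) >= max_jobs )); do
            sleep 0.1
        done
        # Start the next job in the background. It gets /dev/null as stdin
        # so it cannot consume the item list this loop is reading.
        "$@" "$item" < /dev/null &
    done
    wait    # let the last jobs finish before returning
}
```

With this sketch, something like "run_parallel 4 gzip < filelist.txt" would compress the files named in filelist.txt four at a time. Unlike PPSS, it keeps no record of which items have been processed, so an interrupted run would have to start over.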
PPSS's specific features include:
- Runs on any system that supports Bash (although tested only on Linux and Mac OS X).
- Automatically detects the number of CPUs and CPU cores and starts a worker for each.
- Supports hyper-threading (if available).
- Logs the output of each individual process for inspection.
- Logs the actions performed by PPSS to a log file for inspection.
- Takes a text file with one item per line as input.
- Can execute any command.
- Can execute your own scripts in parallel.
- If interrupted, continues where it left off, skipping files that were already processed.
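Two of these features, CPU detection and resuming after an interruption, can be sketched in a few lines of Bash. This is an illustration under stated assumptions, not how PPSS itself implements them: the function names detect_cpus, mark_done, and pending_items are hypothetical, as is the idea of keeping a plain "done log" file with one finished item per line.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not taken from PPSS).

# Print the number of logical CPUs (this count includes hyper-threads).
# Tries getconf first, then nproc (Linux) and sysctl (Mac OS X); falls back to 1.
detect_cpus() {
    local n
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
        n=$(nproc 2>/dev/null) ||
        n=$(sysctl -n hw.ncpu 2>/dev/null) ||
        n=1
    # Guard against non-numeric output from any of the tools above.
    [[ $n =~ ^[0-9]+$ ]] || n=1
    echo "$n"
}

# Record a finished item in the done log (assumed format: one item per line).
mark_done() {
    local item=$1 done_log=$2
    printf '%s\n' "$item" >> "$done_log"
}

# Print the items from the list file that are not yet in the done log.
pending_items() {
    local list=$1 done_log=$2
    if [[ -s $done_log ]]; then
        # -F fixed strings, -x whole-line match, -v invert the match,
        # -f read the patterns (the finished items) from the done log.
        grep -Fxv -f "$done_log" "$list" || true
    else
        cat "$list"
    fi
}
```

A restart then only needs to feed the output of pending_items, rather than the full list, to whatever starts the workers, and call mark_done after each item completes.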