Schwartz Workers

TheSchwartz is an open source, reliable job queue framework that Movable Type uses to perform a number of tasks in the background. Dispatching work to the background is an excellent way of offloading large, time-consuming, resource-intensive tasks to a framework that can be more easily managed and metered. Movable Type uses TheSchwartz for background publishing, for example.

TheSchwartz, however, can be used to offload virtually any task to the background. The following is a trivial example of how to register a Schwartz worker with Movable Type via the registry. The sections after it describe how to implement a worker.

name: Example Plugin for Movable Type
id: Example
description: This plugin is an example plugin for Movable Type.
version: 1.2
task_workers:
    my_worker:
        label: "Publishes content."
        class: 'MT::Worker::Publish'

The Difference Between Scheduled Tasks and Schwartz Workers

Scheduled tasks are run only at specific intervals and are executed by Movable Type via the web interface and via run-periodic-tasks. They are executed regardless of whether there is any known work to do, e.g. “check every 5 minutes to see if there are any scheduled posts that need to be published.” In this way, scheduled tasks act like a mini-cron within Movable Type.

Schwartz workers are spawned exclusively by the run-periodic-tasks script. Furthermore, Schwartz workers are only invoked when there is a job for them to perform. In other words, Movable Type must explicitly place a “Schwartz Job” on the queue in order for a corresponding “Schwartz Worker” to wake up and perform the designated work.

When should I use a scheduled task vs. a Schwartz worker?

As a rule of thumb, use a Schwartz worker if the task at hand could ever take longer than 30 seconds, or long enough to give users the impression that the application is slow or sluggish. Scheduled tasks can be triggered by a user who is working in the application, so a long-running scheduled task causes a periodic slowdown for whoever happens to trigger it.

One trick some developers use is to have a scheduled task add a Schwartz job to the queue. That way you get the best of both worlds: a task that is assured to be processed at a designated interval, with very little risk of that task slowing down the application.
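As a rough sketch of that trick, the handler below is the kind of subroutine a scheduled task might call. It uses only the job accessors and the MT::TheSchwartz->insert call described later on this page. The package name Example::Tasks, the worker class Example::Worker::Cleanup, and the uniqkey value are hypothetical, and registering the handler as a scheduled task is assumed to happen elsewhere.

```perl
package Example::Tasks;    # hypothetical package name

use strict;
use warnings;

# Hypothetical scheduled-task handler: it does no real work itself,
# it only places a job on the queue and returns immediately.
sub enqueue_cleanup {
    # Loaded lazily so that this file compiles outside of Movable Type.
    require MT::TheSchwartz;
    require TheSchwartz::Job;

    my $job = TheSchwartz::Job->new();
    $job->funcname('Example::Worker::Cleanup');    # hypothetical worker class
    $job->uniqkey('example-cleanup');              # hypothetical identifier
    $job->priority(5);
    MT::TheSchwartz->insert($job);
    return 1;
}

1;
```

In upstream TheSchwartz, the uniqkey is intended to prevent a second job with the same funcname and uniqkey from being queued, so a fixed key like this should keep the task from piling up duplicates if the worker falls behind. Treat that as an assumption to verify against your installed version.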

Schwartz Worker Class

Below is a sample Schwartz worker class. A worker extends the base class TheSchwartz::Worker, which provides the basic interface for workers in general. A Schwartz worker is basically stateless: any state that needs to be maintained is kept by TheSchwartz client, or dispatcher. For the most part, a worker need only define the following subroutines:

  • keep_exit_status_for - the number of seconds TheSchwartz will keep a record of the exit status for this job. This lets you keep a record of past jobs for auditing and debugging purposes.
  • grab_for - the number of seconds a worker is allocated to perform the associated work. If this time is exceeded, the job times out and is marked as a failure, allowing other workers to pick up the job and attempt it again.
  • max_retries - the maximum number of times to retry this worker’s work. When this value is exceeded, the job is marked as completed (with errors) and is removed from the queue.
  • retry_delay - the number of seconds to wait before attempting to perform this worker’s work again.
  • work - performs the actual work associated with this worker.

The following is a shell of a Schwartz Worker class for Movable Type:

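package MT::Worker::MyWorker;

use strict;
use warnings;
use base qw( TheSchwartz::Worker );

use TheSchwartz::Job;
use MT::TheSchwartz;
use Time::HiRes qw(gettimeofday tv_interval);

sub keep_exit_status_for { 1 }     # seconds to keep the exit status
sub grab_for             { 60 }    # seconds a worker may hold a job
sub max_retries          { 10 }    # retries allowed after a failure
sub retry_delay          { 60 }    # seconds to wait between retries

sub work {
    my $class = shift;
    my TheSchwartz::Job $job = shift;
    my $s = MT::TheSchwartz->instance();

    # Start with the job we were handed, then collect any queued jobs
    # that share its coalesce key so they can be processed together.
    my @jobs = ($job);
    if ( my $key = $job->coalesce ) {
        while ( my $other = $s->find_job_with_coalescing_value( $class, $key ) ) {
            push @jobs, $other;
        }
    }

    foreach my $j (@jobs) {
        # do something
        $j->completed();    # mark every collected job as done, not just the first
    }
    return;
}
1;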
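
The retry settings above only come into play when a job reports a failure. The following is a minimal sketch of one way to report the outcome of a unit of work, assuming the completed and failed methods of upstream TheSchwartz::Job. The helper name run_and_report is hypothetical and is not part of Movable Type or TheSchwartz.

```perl
use strict;
use warnings;

# Hypothetical helper: run one unit of work for a job and report the result.
# $job  - a job object that provides completed() and failed()
# $unit - a code reference that performs the work and dies on error
sub run_and_report {
    my ( $job, $unit ) = @_;

    # eval traps any exception so that one bad job cannot kill the worker.
    my $ok = eval { $unit->($job); 1 };
    if ($ok) {
        $job->completed();
        return 1;
    }

    # Reporting a failure lets TheSchwartz decide whether to retry the job,
    # based on the worker's max_retries and retry_delay.
    my $err = $@ || 'unknown error';
    $job->failed($err);
    return 0;
}
```

Inside a work subroutine this could be called as run_and_report( $job, sub { ... } ), with the real work placed in the anonymous subroutine.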

Adding a Schwartz Job to the Queue

To add a job to the queue for processing at the next opportunity, follow these steps:

  1. Instantiate a new TheSchwartz::Job.
  2. Specify the class name of the worker for the job using $job->funcname.
  3. Provide an identifier of some kind for the worker to use in reference to the job at hand using $job->uniqkey. This might be for example the ID of an entry that needs to be published.
  4. Provide the priority of the task ranging from 1 (lowest) to 10 (highest) using $job->priority.
  5. Provide a key by which to group this job using $job->coalesce. Some workers will attempt to look for other similar or related jobs on the queue to process all at once in order to maximize efficiency.
  6. Finally, add the job to the queue using MT::TheSchwartz->insert($job).

For example:

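require MT::TheSchwartz;
require TheSchwartz::Job;

# $fi, $priority and $blog_id are assumed to be set by the surrounding code.
my $job = TheSchwartz::Job->new();
$job->funcname('MT::Worker::Publish');    # worker class that will handle the job
$job->uniqkey( $fi->id );                 # identifies what the job refers to
$job->priority( $priority );              # 1 (lowest) to 10 (highest)
$job->coalesce( $blog_id );               # groups related jobs together
MT::TheSchwartz->insert($job);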

Keep in mind that individual jobs should ideally perform small units of work. If you find that you need to process a large amount of work, consider breaking the task into smaller units. Also, there is no guarantee of the order in which jobs are processed, so make sure each job can be run independently of any other. If you require jobs to be processed in a specific order, consider having a job insert another job to process the next unit of work just prior to completing.
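
The chaining approach can be sketched as follows. This is only an illustration under stated assumptions: it assumes the job carries its position in the arg field of upstream TheSchwartz::Job as a hash reference with offset and limit keys, and that MT::TheSchwartz->insert stores such an argument as upstream TheSchwartz does. The helper names next_chunk_arg and enqueue_next_chunk, and the offset/limit layout, are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: given the current chunk and the total number of
# items, return the argument for the next chunk, or undef when finished.
sub next_chunk_arg {
    my ( $arg, $total ) = @_;
    my $next = $arg->{offset} + $arg->{limit};
    return undef if $next >= $total;
    return { offset => $next, limit => $arg->{limit} };
}

# Hypothetical helper: queue the follow-up job for the next chunk, if any.
# Intended to be called just before the current job is marked completed.
sub enqueue_next_chunk {
    my ( $job, $total ) = @_;
    my $next = next_chunk_arg( $job->arg, $total );
    return 0 unless $next;

    # Loaded lazily so that this file compiles outside of Movable Type.
    require MT::TheSchwartz;
    require TheSchwartz::Job;

    my $follow_up = TheSchwartz::Job->new();
    $follow_up->funcname( $job->funcname );    # same worker handles the next chunk
    $follow_up->arg($next);
    $follow_up->priority( $job->priority );
    MT::TheSchwartz->insert($follow_up);
    return 1;
}
```

Because each job queues its successor only after its own chunk has been handled, the chunks run one after another even though the queue itself makes no ordering promise.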

See Also

To learn more about TheSchwartz, please consult the official documentation:

http://search.cpan.org/~bradfitz/TheSchwartz-1.07/