Monday, February 1, 2010

Utilizing the Power of Batch Apex and Async Operations

If you have ever tried to remove custom object records in bulk from your org, you know the hassle: you have to export the records to an Excel file and then have the Data Loader delete them.

Not only that, what if you need to perform further data integrity tasks before removing the records?

Mass updates, inserts, and similar scenarios can be handled more conveniently with Force.com's Batch Apex. With Batch Apex you can build complex, long-running processes on the platform. This feature is very useful for periodic data cleansing, archiving, or data quality improvement operations.

One thing to keep in mind is that a batch job can only be started from Apex code, and Force.com does not schedule your batchable Apex classes for you by default. To run a batch job on a schedule, you need to write an Apex class that implements the "Schedulable" interface.
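As a minimal sketch of that scheduling approach, the class below starts the MassDeleteRecords batch class developed later in this post. The class name MassDeleteScheduler, the custom object Import_Log__c, the 30-day filter, and the job name are all assumptions made up for illustration; substitute your own.

```apex
// Hypothetical scheduler for the MassDeleteRecords batch class from this post.
global class MassDeleteScheduler implements Schedulable {

    global void execute(SchedulableContext sc) {
        // Import_Log__c is a made-up custom object used only for illustration.
        // LAST_N_DAYS:30 with "<" selects records created more than 30 days ago.
        String query = 'SELECT Id FROM Import_Log__c WHERE CreatedDate < LAST_N_DAYS:30';
        Database.executeBatch(new MassDeleteRecords(query));
    }
}

// To register the job, run this once (for example as anonymous Apex).
// The cron expression means "every day at 02:00":
//
//   System.schedule('Nightly log cleanup', '0 0 2 * * ?', new MassDeleteScheduler());
```

Because the query is a dynamic string, the class compiles even if the object does not exist, so a typo in the object or field name only shows up when the job runs.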

The following example shows how you can use this model to mass delete records of any object on the Force.com platform.

To develop your Batch Apex, create a new Apex class that implements the "Database.Batchable" interface.

This interface requires three methods to be implemented:
  • start
  • execute
  • finish
The "start" method is called at the beginning of a Batch Apex job. Use it to collect the records to be passed to the "execute" method for processing. The Apex engine automatically breaks the records you selected into smaller batches (200 records each by default) and repeatedly calls the "execute" method until all records are processed.

The "finish" method is called once all the batches are processed. You can use this method to carry out any post-processing operation such as sending out an email confirmation on the status of the batch operation.

Let's take a closer look at each of these methods:


global Database.QueryLocator start(Database.BatchableContext BC) {
    // Pass the query string to the Database class.
    return Database.getQueryLocator(query);
}



Use Database.getQueryLocator in the "start" method to dynamically load data into the Batch Apex class. When you use a QueryLocator object, the governor limit on the total number of records retrieved by SOQL queries is bypassed.

Alternatively, you can return an Iterable from "start" when you need to build a more complex scope for your batch job.
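To illustrate the Iterable option, here is a rough sketch in which the scope is built with Apex logic that a single SOQL query cannot express easily: it keeps the oldest account for each exact name and hands the rest to "execute". The class name DeleteDuplicateAccounts and the duplicate rule are assumptions for the sake of the example. Note that with an Iterable, the normal governor limit on records retrieved by SOQL still applies, hence the LIMIT clause.

```apex
// Hypothetical batch class whose scope is assembled in Apex rather than by one query.
global class DeleteDuplicateAccounts implements Database.Batchable<sObject> {

    // A List is an Iterable, so it can be returned directly from start.
    global Iterable<sObject> start(Database.BatchableContext BC) {
        Set<String> seenNames = new Set<String>();
        List<sObject> duplicates = new List<sObject>();

        // Oldest records come first, so the first account seen for a name is kept.
        for (Account acc : [SELECT Id, Name FROM Account
                            ORDER BY CreatedDate ASC LIMIT 10000]) {
            if (seenNames.contains(acc.Name)) {
                duplicates.add(acc);
            } else {
                seenNames.add(acc.Name);
            }
        }
        return duplicates;
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope) {
        delete scope;
    }

    global void finish(Database.BatchableContext BC) {
        // No post-processing in this sketch.
    }
}
```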

Execute method:

global void execute(Database.BatchableContext BC, List<sObject> scope) {
    // In this sample, we simply delete all the records in scope.
    delete scope;
}



This method takes two parameters: a Database.BatchableContext and a list of records (referred to as "scope"). The BatchableContext is passed to all three Batchable interface methods and is generally used to track the progress of the batch job. We will use it more in our "finish" method.
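A plain "delete scope;" fails the whole batch if any single record cannot be deleted. If that is too strict for your case, one possible variation (a sketch, not part of the original example) is to call Database.delete with allOrNone set to false and log the failures. The class name MassDeleteAllowPartial is hypothetical.

```apex
// Hypothetical variant of the batch class that tolerates per-record failures.
global class MassDeleteAllowPartial implements Database.Batchable<sObject> {

    global final String query;

    global MassDeleteAllowPartial(String q) {
        query = q;
    }

    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator(query);
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope) {
        // allOrNone = false: delete what can be deleted, report the rest.
        Database.DeleteResult[] results = Database.delete(scope, false);

        // Results come back in the same order as the records in scope.
        for (Integer i = 0; i < results.size(); i++) {
            if (!results[i].isSuccess()) {
                for (Database.Error err : results[i].getErrors()) {
                    System.debug(LoggingLevel.ERROR,
                        'Could not delete ' + scope[i].Id + ': ' + err.getMessage());
                }
            }
        }
    }

    global void finish(Database.BatchableContext BC) {
        // No post-processing in this sketch.
    }
}
```

With this approach a failed record does not raise an exception, so the batch is not counted in NumberOfErrors on the AsyncApexJob record; the debug log (or your own logging object) becomes the place to look for failures.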

Finish method:

global void finish(Database.BatchableContext BC) {
    // Get the ID of the AsyncApexJob representing this batch job
    // from Database.BatchableContext.
    // Query the AsyncApexJob object to retrieve the current job's information.
    AsyncApexJob a = [SELECT Id, Status, NumberOfErrors, JobItemsProcessed,
                             TotalJobItems, CreatedBy.Email
                      FROM AsyncApexJob
                      WHERE Id = :BC.getJobId()];

    // Send an email to the Apex job's submitter notifying of job completion.
    Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
    String[] toAddresses = new String[] {a.CreatedBy.Email};
    mail.setToAddresses(toAddresses);
    mail.setSubject('Mass Delete Records ' + a.Status);
    mail.setPlainTextBody('The batch Apex job processed ' + a.TotalJobItems +
                          ' batches with ' + a.NumberOfErrors + ' failures.');
    Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });
}



Using the BatchableContext, we can retrieve the job ID of our batch job. The AsyncApexJob object in Force.com gives you access to the status of asynchronous jobs, as shown in the example above.

In this method I have used the Apex email library to send a notification to the user who submitted the batch job.
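The same AsyncApexJob query also works outside the batch class. As a small sketch, the anonymous Apex below starts the MassDeleteRecords class from this post and reads the job's status using the ID returned by Database.executeBatch. The query string and its name filter are placeholders only.

```apex
// Placeholder query; replace the filter with your own criteria.
String query = 'SELECT Id FROM Account WHERE Name LIKE \'MassDeleteTest %\'';

// executeBatch returns the ID of the AsyncApexJob that tracks this run.
ID jobId = Database.executeBatch(new MassDeleteRecords(query));

AsyncApexJob job = [SELECT Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
                    FROM AsyncApexJob
                    WHERE Id = :jobId];

System.debug('Status: ' + job.Status +
             ', batches processed: ' + job.JobItemsProcessed +
             ' of ' + job.TotalJobItems +
             ', errors: ' + job.NumberOfErrors);
```

Since the job runs asynchronously, the status read immediately after submission will normally show that the job has not finished yet; rerun the query with the saved job ID later to see progress.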

Now, let's put it all together:


global class MassDeleteRecords implements Database.Batchable<sObject> {

    // SOQL query that selects the records to delete.
    global final String query;

    global MassDeleteRecords(String q) {
        query = q;
    }

    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator(query);
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope) {
        delete scope;
    }

    global void finish(Database.BatchableContext BC) {
        // Get the ID of the AsyncApexJob representing this batch job
        // from Database.BatchableContext.
        // Query the AsyncApexJob object to retrieve the current job's information.
        AsyncApexJob a = [SELECT Id, Status, NumberOfErrors, JobItemsProcessed,
                                 TotalJobItems, CreatedBy.Email
                          FROM AsyncApexJob
                          WHERE Id = :BC.getJobId()];

        // Send an email to the Apex job's submitter notifying of job completion.
        Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
        String[] toAddresses = new String[] {a.CreatedBy.Email};
        mail.setToAddresses(toAddresses);
        mail.setSubject('Mass Delete Records ' + a.Status);
        mail.setPlainTextBody('The batch Apex job processed ' + a.TotalJobItems +
                              ' batches with ' + a.NumberOfErrors + ' failures.');

        Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });
    }

}



Ok, that's as far as we go for the Batchable Apex class in this article.

Now let's write the code in Apex to run the class and test it:

String query = 'SELECT Id, Name FROM Account WHERE OwnerId = \'00520000000h57J\'';
MassDeleteRecords batchApex = new MassDeleteRecords(query);
ID batchprocessid = Database.executeBatch(batchApex);



The above code removes all accounts owned by the user with ID 00520000000h57J.
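For a repeatable check, a unit test along the following lines may help. It is only a sketch: the test class name and the "MassDeleteTest" naming convention are made up, and it assumes that Account has no extra required fields or validation rules in your org and that email deliverability is enabled (the "finish" method sends an email). It also passes the optional second argument of Database.executeBatch, which sets how many records each "execute" call receives.

```apex
@isTest
private class MassDeleteRecordsTest {

    @isTest
    static void deletesMatchingAccounts() {
        // Create a handful of throwaway accounts for the batch job to delete.
        List<Account> accounts = new List<Account>();
        for (Integer i = 0; i < 10; i++) {
            accounts.add(new Account(Name = 'MassDeleteTest ' + i));
        }
        insert accounts;

        String query = 'SELECT Id FROM Account WHERE Name LIKE \'MassDeleteTest %\'';

        // Asynchronous work queued after startTest runs when stopTest is called.
        Test.startTest();
        Database.executeBatch(new MassDeleteRecords(query), 200);
        Test.stopTest();

        Integer remaining = [SELECT COUNT() FROM Account
                             WHERE Name LIKE 'MassDeleteTest %'];
        System.assertEquals(0, remaining, 'All test accounts should be deleted');
    }
}
```

Keeping the number of test records at or below the scope size matters here, because a test method can run only one "execute" invocation of a batch job.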

As you can see, you can now run the batch job from your Visualforce pages, triggers (with caution), and so on.

For governor limits and best practices, refer to the Apex Developer Guide.