Posted by : Sushanth Thursday 24 December 2015


MapReduce jobs might take a long time to complete.That’s said, you might have to run your jobs in background, right ? You could have a look at Job tracker URL (for MR V1) or Yarn Resource manager (V2) in order to check job completion, but what if you could be notified once job is completed?.

A quick and dirty solution would be to poll JobTracker every X mn as follows

user@hadoop ~ $ hadoop job -status job_201211261732_3134
Job: job_201211261732_3134
file: hdfs://user/lihdop/.staging/job_201211261732_3134/job.xml
tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201211261732_3134
map() completion: 0.0
reduce() completion: 0.0

Getting a notification instead of polling ?

In your driver class, only 3 lines would enable the callback feature of Hadoop
public class CallBackMR extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        CallBackMR callback = new CallBackMR();
        int res = ToolRunner.run(conf, callback, args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {

        Configuration conf = this.getConf();

        // ==================================
        // Set the callback parameters
        conf.set("job.end.notification.url", "http://hadoopi.wordpress.com/api/hadoop/notification/$jobId?status=$jobStatus");
        conf.setInt("job.end.retry.attempts", 3);
        conf.setInt("job.end.retry.interval", 1000);
        // ==================================

        .../...

        // Submit your job in background
        job.submit();
    }

}
At job completion, an HTTP request will be sent to “job.end.notification.url” value. Can be retrieved from notification URL both the JOB_ID and JOB_STATUS.
Looking at Hadoop server side (see below logs from yarn), a notification SUCCEEDED has been sent every second, max 10 times before it officially failed (The URL I used here was obviously a fake one)

root@hadoopi:/usr/lib/hadoop/logs/userlogs/application_1379509275868_0002# find . -type f | xargs grep hadoopi
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:32,090 INFO [Thread-66] org.mortbay.log: Job end notification trying http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:32,864 WARN [Thread-66] org.mortbay.log: Job end notification to http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED failed with code: 404 and message "Not Found"
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:32,965 INFO [Thread-66] org.mortbay.log: Job end notification trying http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:33,871 WARN [Thread-66] org.mortbay.log: Job end notification to http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED failed with code: 404 and message "Not Found"
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:33,971 INFO [Thread-66] org.mortbay.log: Job end notification trying http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:34,804 WARN [Thread-66] org.mortbay.log: Job end notification to http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED failed with code: 404 and message "Not Found"
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:34,904 INFO [Thread-66] org.mortbay.log: Job end notification trying http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:35,584 WARN [Thread-66] org.mortbay.log: Job end notification to http://hadoopi.wordpress.com/api/hadoop/notification/job_1379509275868_0002?status=SUCCEEDED failed with code: 404 and message "Not Found"
./container_1379509275868_0002_01_000001/syslog:2013-09-18 15:14:35,684 WARN [Thread-66] o
Note that the notification will be triggered for SUCCESS status but also for KILLED or FAILED statuses – that might be quite useful too.

Leave a Reply

Subscribe to Posts | Subscribe to Comments

- Copyright © Technical Articles - Skyblue - Powered by Blogger - Designed by Johanes Djogan -