An Example of Hadoop MapReduce Counter

MapReduce Counter

Hadoop MapReduce Counters provide a way to measure the progress or the number of operations that occur within a MapReduce program. The MapReduce framework provides a number of built-in counters that measure basic I/O operations, such as FILE_BYTES_READ/WRITTEN and Map/Combine/Reduce input/output records. These counters are especially useful when you evaluate MapReduce programs. In addition, the MapReduce Counter API allows users to define their own counters. Since MapReduce Counters are automatically aggregated over the Map and Reduce phases, they are one of the easiest ways to investigate the internal behavior of MapReduce programs. In this post, I’m going to introduce how to use your own MapReduce counters. The example sources described in this post are based on the Hadoop 0.21 API.

Incrementing your counter

To define your own MapReduce counter, you first declare an enum type as follows:

public static enum MATCH_COUNTER {
  INCOMING_GRAPHS,
  PRUNING_BY_NCV,
  PRUNING_BY_COUNT,
  PRUNING_BY_ISO,
  ISOMORPHIC
};

Then, when you want to increment your counter, call the increment method as follows:

context.getCounter(MATCH_COUNTER.INCOMING_GRAPHS).increment(1);

You can access the context instance within the setup, cleanup, map, and reduce methods of the Mapper or Reducer class. You can get the desired counter by calling the context.getCounter method with an enum value.
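
For example, a minimal Mapper sketch might look like the following. The class name MatchMapper, the input/output types, and the isIsomorphic check are hypothetical placeholders; only the counter calls matter here. The enum from above is repeated inside the class so the sketch is self-contained.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MatchMapper extends Mapper<LongWritable, Text, Text, Text> {

  // The user-defined counter group shown above.
  public static enum MATCH_COUNTER {
    INCOMING_GRAPHS,
    PRUNING_BY_NCV,
    PRUNING_BY_COUNT,
    PRUNING_BY_ISO,
    ISOMORPHIC
  };

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Count every record this mapper receives.
    context.getCounter(MATCH_COUNTER.INCOMING_GRAPHS).increment(1);

    // Hypothetical matching step; count matches separately.
    if (isIsomorphic(value)) {
      context.getCounter(MATCH_COUNTER.ISOMORPHIC).increment(1);
      context.write(new Text("isomorphic"), value);
    }
  }

  // Placeholder for the actual graph-matching test.
  private boolean isIsomorphic(Text value) {
    return false;
  }
}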

Finding your counter

You can retrieve the Counters from a finished job as follows:

Configuration conf = new Configuration();
Cluster cluster = new Cluster(conf);
Job job = Job.getInstance(cluster, conf);
boolean result = job.waitForCompletion(true);
...
Counters counters = job.getCounters();

The Counters instance contains all of the counters obtained from a job. So, when you want to get your own counter, call the findCounter method with an enum value as follows:

Counter c1 = counters.findCounter(MATCH_COUNTER.INCOMING_GRAPHS);
System.out.println(c1.getDisplayName()+":"+c1.getValue());

The example below shows how to iterate over all counter groups, including the built-in counter groups that Hadoop provides.

for (CounterGroup group : counters) {
  System.out.println("* Counter Group: " + group.getDisplayName() + " (" + group.getName() + ")");
  System.out.println("  number of counters in this group: " + group.size());
  for (Counter counter : group) {
    System.out.println("  - " + counter.getDisplayName() + ": " + counter.getName() + ": "+counter.getValue());
  }
}
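
Since findCounter and getValue are available in the driver, you can also use counter values programmatically once the job has finished. The following is a small sketch built on the user-defined counters from above; the percentage summary is just an illustration:

long incoming = counters.findCounter(MATCH_COUNTER.INCOMING_GRAPHS).getValue();
long isomorphic = counters.findCounter(MATCH_COUNTER.ISOMORPHIC).getValue();

if (incoming > 0) {
  double ratio = 100.0 * isomorphic / incoming;
  System.out.println(String.format(
      "isomorphic graphs: %d of %d (%.2f%%)", isomorphic, incoming, ratio));
}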

Comments on “An Example of Hadoop MapReduce Counter”

  1. MapReduce says:

    MapReduce is a method for distributing a task across multiple nodes; each node processes the data stored on that node where possible. It consists of two phases: Map and Reduce.

    Features of MapReduce
    1. Automatic parallelization and distribution
    2. Fault-tolerance
    3. Status and monitoring tools
    4. A clean abstraction for programmers – MapReduce programs are usually written in Java
    5. MapReduce abstracts all the ‘housekeeping’ away from the developer – the developer can concentrate simply on writing the Map and Reduce functions

    Hyunsik Choi sir, my concept of MapReduce is clear, but I face problems in programming. How do I overcome this? Any book suggestions?

