Although most uses of MapReduce require just writing Map and Reduce functions, we have extended the basic model with a few features that we have found useful in practice.
Partitioning function
MapReduce users specify the number of reduce tasks/output files that they desire (R). Intermediate data gets partitioned across these tasks using a partitioning function on the intermediate key. A default partitioning function is provided that uses hashing (hash(key)% R) to evenly balance the data across the R partitions.
In some cases, however, it is useful to partition data by some other function of the key. For example, sometimes the output keys are URLs, and we want all entries for a single host to end up in the same output file. To support situations like this, the users of the MapReduce library can provide their own custom partitioning function. For example, using hash(Hostname(urlkey))% R as the partitioning function causes all URLs from the same host to end up in the same output file.
Ordering guarantees
Our MapReduce implementation sorts the intermediate data to group together all intermediate values that share the same intermediate key. Since many users find it convenient to have their Reduce function called on keys in sorted order, and we have already done all of the necessary work, we expose this to users by guaranteeing this ordering property in the interface to the MapReduce library.
Skipping bad records
Sometimes there are bugs in user code that cause the Map or Reduce functions to crash deterministically on certain records. Such bugs may cause a large MapReduce execution to fail after doing large amounts of computation. The preferred course of action is to fix the bug, but sometimes this is not feasible; for instance, the bug may be in a third-party library for which source code is not available. Also, it is sometimes acceptable to ignore a few records, such as when doing statistical analysis on a large data set. Thus, we provide an optional mode of execution where the MapReduce library detects records that cause deterministic crashes and skips these records in subsequent re-executions, in order to make forward progress.
Each worker process installs a signal handler that catches segmentation violations and bus errors. Before invoking a user Map or Reduce operation, the MapReduce library stores the sequence number of the record in a global variable. If the user code generates a signal, the signal handler sends a "last gasp" UDP packet that contains the sequence number to the MapReduce master. When the master has seen more than one failure on a particular record, it indicates that the record should be skipped when it issues the next re-execution of the corresponding Map or Reduce task.
A number of other extensions are discussed in a lengthier paper about MapReduce (see "Further Reading," below).