Big Data: Difference between Hadoop OLD API and NEW API

Monday, September 12, 2016

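The table below lists the differences between the old Hadoop MapReduce API (the only API before release 0.20) and the new API (introduced in 0.20 and available in 1.x and 2.x).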

Difference	New API	Old API
Mapper and Reducer	The new API defines Mapper and Reducer as classes rather than interfaces, so a method (with a default implementation) can be added to a class without breaking existing subclasses.	The old API defines Mapper and Reducer as interfaces (these interfaces still ship in later Hadoop releases under the old package).
Package	The new API is in the org.apache.hadoop.mapreduce package.	The old API can still be found in the org.apache.hadoop.mapred package.
Communication between user code and the MapReduce system	User code communicates with the MapReduce system through a Context object.	User code communicates with the MapReduce system through the JobConf, OutputCollector, and Reporter objects.
Controlling Mapper and Reducer execution	Both mappers and reducers can control the execution flow by overriding the run() method.	Mappers can be controlled by writing a MapRunnable, but no equivalent exists for reducers.
Job control	Job control is done through the Job class.	Job control is done through JobClient, which does not exist in the new API.
Job configuration	Job configuration is done through the Configuration class, via some of the helper methods on Job.	Job configuration is done through a JobConf object, which is an extension of the Configuration class: java.lang.Object, extended by org.apache.hadoop.conf.Configuration, extended by org.apache.hadoop.mapred.JobConf.
Output file names	Map outputs are named part-m-nnnnn and reduce outputs are named part-r-nnnnn (where nnnnn is an integer designating the part number, starting from zero).	Both map and reduce outputs are named part-nnnnn.
Values passed to reduce()	The reduce() method receives its values as a java.lang.Iterable, so they can be traversed with a for-each loop.	The reduce() method receives its values as a java.util.Iterator.
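
A few of the rows above are easier to see in code. The sketch below is plain Java and does not use Hadoop at all: OldStyleReducer, NewStyleReducer, OldSum, NewSum and PartNames are hypothetical stand-ins invented for illustration, not Hadoop's real classes or signatures. It only tries to show the shape of three differences: interface versus class with an overridable run(), Iterator versus Iterable in reduce(), and the output file naming patterns.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an old-API-style reducer: an interface whose
// reduce() receives a java.util.Iterator.
interface OldStyleReducer {
    int reduce(String key, Iterator<Integer> values);
}

// Hypothetical stand-in for a new-API-style reducer: a class whose reduce()
// receives a java.lang.Iterable. Because it is a class, run() can carry a
// default implementation that subclasses inherit or override.
abstract class NewStyleReducer {
    abstract int reduce(String key, Iterable<Integer> values);

    // Default driver; a subclass could override this to control execution flow.
    int run(String key, Iterable<Integer> values) {
        return reduce(key, values);
    }
}

class OldSum implements OldStyleReducer {
    @Override
    public int reduce(String key, Iterator<Integer> values) {
        int sum = 0;
        // An Iterator has to be walked by hand.
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }
}

class NewSum extends NewStyleReducer {
    @Override
    int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        // An Iterable works directly with the for-each loop.
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
}

// Formats output file names following the patterns listed in the table.
class PartNames {
    static String oldName(int part) {
        return String.format("part-%05d", part);
    }

    static String newMapName(int part) {
        return String.format("part-m-%05d", part);
    }

    static String newReduceName(int part) {
        return String.format("part-r-%05d", part);
    }
}

class ApiDifferenceSketch {
    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(1, 2, 3);
        System.out.println(new OldSum().reduce("word", counts.iterator())); // prints 6
        System.out.println(new NewSum().run("word", counts));               // prints 6
        System.out.println(PartNames.oldName(0));       // prints part-00000
        System.out.println(PartNames.newReduceName(0)); // prints part-r-00000
    }
}
```

In a real job the reducer would extend org.apache.hadoop.mapreduce.Reducer (new API) or implement org.apache.hadoop.mapred.Reducer (old API), and would emit results through a Context or an OutputCollector rather than returning a value.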
