Below are the differences between the Hadoop old API (org.apache.hadoop.mapred) and the new API (introduced in release 0.20 and carried forward into 1.x and 2.x). Short code sketches illustrating the main differences follow the table.
| Difference | New API | Old API |
| --- | --- | --- |
| Mapper & Reducer | The new API defines Mapper and Reducer as abstract classes, so a method (with a default implementation) can be added without breaking existing implementations. | The old API defined Mapper and Reducer as interfaces (these interfaces still exist alongside the new API). |
| Package | The new API lives in the org.apache.hadoop.mapreduce package. | The old API can still be found in org.apache.hadoop.mapred. |
| Communication with the MapReduce system | User code uses a Context object to communicate with the MapReduce system. | User code uses JobConf, OutputCollector, and Reporter objects to communicate with the MapReduce system. |
| Controlling Mapper and Reducer execution | The new API allows both mappers and reducers to control the execution flow by overriding the run() method. | Mappers could be controlled by writing a MapRunnable, but no equivalent existed for reducers. |
| Job control | Job control is done through the Job class in the new API. | Job control was done through JobClient (which does not exist in the new API). |
| Job configuration | Job configuration is done through the Configuration class, via helper methods on Job. | A JobConf object was used for job configuration; it is an extension of Configuration (java.lang.Object → org.apache.hadoop.conf.Configuration → org.apache.hadoop.mapred.JobConf). |
| Output file names | In the new API, map outputs are named part-m-nnnnn and reduce outputs are named part-r-nnnnn (where nnnnn is an integer designating the part number, starting from zero). | In the old API, both map and reduce outputs are named part-nnnnn. |
| Values passed to reduce() | In the new API, the reduce() method receives values as a java.lang.Iterable. | In the old API, the reduce() method received values as a java.util.Iterator. |
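
To make the first few rows concrete, here is a minimal word-count sketch against the new API (the class names NewApiWordCount, TokenMapper, and SumReducer are illustrative, not from any particular distribution). Note the org.apache.hadoop.mapreduce package, the abstract base classes, the Context object used for all output, and the java.lang.Iterable passed to reduce():

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;   // abstract class in the new API
import org.apache.hadoop.mapreduce.Reducer;  // abstract class in the new API

public class NewApiWordCount {

    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);  // all output goes through Context
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {  // values arrive as java.lang.Iterable
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
```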
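
The same logic written against the old API (again with illustrative class names) shows the contrast: Mapper and Reducer are interfaces in org.apache.hadoop.mapred (here implemented via the MapReduceBase convenience class), output goes through OutputCollector, progress reporting through Reporter, and reduce() receives a java.util.Iterator:

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;          // interface in the old API
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;         // interface in the old API
import org.apache.hadoop.mapred.Reporter;

public class OldApiWordCount {

    public static class TokenMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                // output goes through OutputCollector, not Context
                output.collect(new Text(itr.nextToken()), new IntWritable(1));
            }
        }
    }

    public static class SumReducer extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {  // values arrive as java.util.Iterator
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }
}
```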
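
Finally, a driver sketch for the job control and configuration rows: in the new API the job is configured through a Configuration object plus helper methods on Job, and submitted through Job itself rather than JobClient. Job.getInstance is the 2.x factory method; earlier releases used the (now deprecated) new Job(conf) constructor instead. The reducer output of this job appears as part-r-00000, part-r-00001, and so on:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NewApiDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Configuration happens via helper methods on Job, not a JobConf
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(NewApiDriver.class);
        job.setMapperClass(NewApiWordCount.TokenMapper.class);
        job.setReducerClass(NewApiWordCount.SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Submission and monitoring through Job itself; no JobClient involved
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```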