📜  MapReduce API

📅  Last modified: 2020-12-03 01:44:10             🧑  Author: Mango


This section covers the MapReduce API: the classes and methods used in MapReduce programming.

MapReduce Mapper Class

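In MapReduce, the Mapper class maps input key-value pairs to a set of intermediate key-value pairs. In other words, it transforms input records into intermediate records.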

The intermediate records associated with a given output key are then passed to the Reducer, which produces the final output.

Methods of the Mapper Class

void cleanup(Context context): Called once at the end of the task.
void map(KEYIN key, VALUEIN value, Context context): Called once for each key-value pair in the input split.
void run(Context context): Can be overridden to control the execution of the Mapper.
void setup(Context context): Called once at the beginning of the task.
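
The call order of these four methods can be modeled without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: `SketchMapper`, `SketchContext`, and the sample input lines are hypothetical stand-ins for Hadoop's `Mapper` and `Context`. It only illustrates that `run` calls `setup` once, `map` once per input record, and `cleanup` once at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the Mapper lifecycle. Every class name here is a
// hypothetical stand-in, not a real org.apache.hadoop.mapreduce type.
class MapperLifecycleSketch {

    // Stand-in for Context: records emitted intermediate pairs and a call log.
    static class SketchContext {
        final List<String> output = new ArrayList<>();
        final List<String> calls = new ArrayList<>();

        void write(String key, int value) {
            output.add(key + "=" + value);
        }
    }

    // Stand-in for Mapper, exposing the same four methods.
    static class SketchMapper {
        void setup(SketchContext context) {
            context.calls.add("setup");
        }

        // Emits (word, 1) for every whitespace-separated word in the line.
        void map(Long offset, String line, SketchContext context) {
            context.calls.add("map");
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    context.write(word, 1);
                }
            }
        }

        void cleanup(SketchContext context) {
            context.calls.add("cleanup");
        }

        // Drives the task: setup once, map once per record, cleanup once.
        void run(List<String> split, SketchContext context) {
            setup(context);
            try {
                long offset = 0;
                for (String line : split) {
                    map(offset, line, context);
                    offset += line.length() + 1;
                }
            } finally {
                cleanup(context);
            }
        }
    }

    // Runs the sketch over two sample lines and returns the populated context.
    static SketchContext demo() {
        SketchContext context = new SketchContext();
        new SketchMapper().run(Arrays.asList("hello world", "hello hadoop"), context);
        return context;
    }

    public static void main(String[] args) {
        SketchContext context = demo();
        System.out.println(context.calls);  // [setup, map, map, cleanup]
        System.out.println(context.output); // [hello=1, world=1, hello=1, hadoop=1]
    }
}
```

In a real job the framework supplies the records and the context; overriding `run` is only needed when this default loop is not what the task requires.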

MapReduce Reducer Class

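In MapReduce, the Reducer class reduces the set of intermediate values that share a key to a smaller set of values. A Reducer implementation can access the job's Configuration through the JobContext.getConfiguration() method.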

Methods of the Reducer Class

void cleanup(Context context): Called once at the end of the task.
void reduce(KEYIN key, Iterable<VALUEIN> values, Context context): Called once for each key.
void run(Context context): Can be overridden to control how the reduce task works.
void setup(Context context): Called once at the beginning of the task.
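
To make "called once for each key" concrete, here is a small plain-Java sketch rather than real Hadoop code. `SketchReducer`, `group`, and the sample keys are hypothetical; `group` stands in for the framework step that collects all intermediate values for a key before the reduce method sees them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the Reducer lifecycle. Every class name here is a
// hypothetical stand-in, not a real org.apache.hadoop.mapreduce type.
class ReducerLifecycleSketch {

    static class SketchReducer {
        final List<String> calls = new ArrayList<>();
        final Map<String, Integer> output = new TreeMap<>();

        void setup() {
            calls.add("setup");
        }

        // Receives one key together with all of its values and sums them.
        void reduce(String key, Iterable<Integer> values) {
            calls.add("reduce(" + key + ")");
            int sum = 0;
            for (int value : values) {
                sum += value;
            }
            output.put(key, sum);
        }

        void cleanup() {
            calls.add("cleanup");
        }

        // Drives the task: setup once, reduce once per key, cleanup once.
        void run(Map<String, List<Integer>> grouped) {
            setup();
            try {
                for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
                    reduce(entry.getKey(), entry.getValue());
                }
            } finally {
                cleanup();
            }
        }
    }

    // Groups (key, 1) pairs by key in sorted key order; a stand-in for the
    // grouping the framework performs between the map and reduce phases.
    static Map<String, List<Integer>> group(List<String> keys) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : keys) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Reduces four sample intermediate pairs and returns the reducer.
    static SketchReducer demo() {
        SketchReducer reducer = new SketchReducer();
        reducer.run(group(Arrays.asList("hello", "world", "hello", "hadoop")));
        return reducer;
    }

    public static void main(String[] args) {
        SketchReducer reducer = demo();
        System.out.println(reducer.calls);  // [setup, reduce(hadoop), reduce(hello), reduce(world), cleanup]
        System.out.println(reducer.output); // {hadoop=1, hello=2, world=1}
    }
}
```

Note that `hello` appears twice in the input but triggers a single `reduce` call, which receives both values.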

MapReduce Job Class

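The Job class is used to configure a job and submit it. It also lets you control the job's execution and query its state. Once the job has been submitted, the set methods throw an IllegalStateException.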
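
The rule that setters fail after submission can be illustrated with a small state check. This is a plain-Java sketch under stated assumptions, not the Hadoop implementation: `SketchJob`, its `State` enum, and the helper `ensureDefine` are hypothetical names chosen for the example.

```java
// Plain-Java model of "configure first, then submit". SketchJob is a
// hypothetical stand-in, not org.apache.hadoop.mapreduce.Job.
class JobStateSketch {

    enum State { DEFINE, RUNNING }

    static class SketchJob {
        private State state = State.DEFINE;
        private String jobName = "";
        private int numReduceTasks = 1;

        // Rejects any configuration change once the job has left DEFINE.
        private void ensureDefine() {
            if (state != State.DEFINE) {
                throw new IllegalStateException(
                        "Job in state " + state + " instead of " + State.DEFINE);
            }
        }

        void setJobName(String name) {
            ensureDefine();
            jobName = name;
        }

        void setNumReduceTasks(int tasks) {
            ensureDefine();
            numReduceTasks = tasks;
        }

        // Getters stay usable in every state.
        String getJobName() {
            return jobName;
        }

        int getNumReduceTasks() {
            return numReduceTasks;
        }

        void submit() {
            ensureDefine();
            state = State.RUNNING;
        }
    }

    // Returns true if a setter called after submit() throws and the
    // previously configured name is left unchanged.
    static boolean setterFailsAfterSubmit() {
        SketchJob job = new SketchJob();
        job.setJobName("word count");
        job.setNumReduceTasks(2);
        job.submit();
        try {
            job.setJobName("renamed");
            return false;
        } catch (IllegalStateException expected) {
            return "word count".equals(job.getJobName());
        }
    }

    public static void main(String[] args) {
        System.out.println(setterFailsAfterSubmit()); // true
    }
}
```

The practical consequence is that all configuration calls belong before the job is submitted.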

Methods of the Job Class

Method: Description
Counters getCounters(): Gets the counters for the job.
long getFinishTime(): Gets the finish time of the job.
static Job getInstance(): Creates a new Job that is not tied to a particular cluster.
static Job getInstance(Configuration conf): Creates a new Job that is not tied to a particular cluster, using the given configuration.
static Job getInstance(Configuration conf, String jobName): Creates a new Job that is not tied to a particular cluster, using the given configuration and job name.
String getJobFile(): Gets the path of the submitted job configuration.
String getJobName(): Gets the user-specified job name.
JobPriority getPriority(): Gets the scheduling priority of the job.
void setJarByClass(Class<?> cls): Sets the job's JAR by locating the JAR that contains the given class, which is passed as a class literal (the class name followed by .class).
void setJobName(String name): Sets the user-specified job name.
void setMapOutputKeyClass(Class<?> theClass): Sets the key class for the map output data.
void setMapOutputValueClass(Class<?> theClass): Sets the value class for the map output data.
void setMapperClass(Class<? extends Mapper> cls): Sets the Mapper for the job.
void setNumReduceTasks(int tasks): Sets the number of reduce tasks for the job.
void setReducerClass(Class<? extends Reducer> cls): Sets the Reducer for the job.