Package org.apache.avro.mapred
Class AvroMultipleInputs
java.lang.Object
org.apache.avro.mapred.AvroMultipleInputs
This class supports Avro-MapReduce jobs that have multiple input paths with a different Schema and AvroMapper for each path.
Usage:
Case 1: (ReflectData based inputs)
// Enable ReflectData usage across the job.
AvroJob.setReflect(job);

Schema type1Schema = ReflectData.get().getSchema(Type1Record.class);
AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, type1Schema);
Where Type1AvroMapper would be implemented as
class Type1AvroMapper extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>>

Schema type2Schema = ReflectData.get().getSchema(Type2Record.class);
AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, type2Schema);
Where Type2AvroMapper would be implemented as
class Type2AvroMapper extends AvroMapper<Type2Record, Pair<ComparingKeyRecord, CommonValueRecord>>
Case 2: (SpecificData based inputs)
Schema type1Schema = Type1Record.SCHEMA$;
AvroMultipleInputs.addInputPath(job, inputPath1, Type1AvroMapper.class, type1Schema);
Where Type1AvroMapper would be implemented as
class Type1AvroMapper extends AvroMapper<Type1Record, Pair<ComparingKeyRecord, CommonValueRecord>>

Schema type2Schema = Type2Record.SCHEMA$;
AvroMultipleInputs.addInputPath(job, inputPath2, Type2AvroMapper.class, type2Schema);
Where Type2AvroMapper would be implemented as
class Type2AvroMapper extends AvroMapper<Type2Record, Pair<ComparingKeyRecord, CommonValueRecord>>
Note on InputFormat: The InputFormat used will always be
AvroInputFormat when using this class.
Note on collector outputs: When using this class, ensure that all mapper
implementations involved emit the same key and value record types, as set by
AvroJob.setOutputSchema(JobConf, Schema) or
AvroJob.setMapOutputSchema(JobConf, Schema).
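The two ideas above (one Schema and mapper per input path, and a single output type shared by every mapper) can be illustrated with a small model that uses only the Java standard library. This is a conceptual sketch, not how AvroMultipleInputs is implemented: MultipleInputsSketch, Binding, KeyValue, register, the record fields, and the paths are all hypothetical names chosen for illustration, and KeyValue merely stands in for the Pair emitted by the real mappers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Conceptual model only: none of these types belong to Avro or Hadoop.
final class MultipleInputsSketch {

    // Two unrelated input record types, one per input path (hypothetical fields).
    static final class Type1Record {
        final String id; final int clicks;
        Type1Record(String id, int clicks) { this.id = id; this.clicks = clicks; }
    }

    static final class Type2Record {
        final String userId; final String country;
        Type2Record(String userId, String country) { this.userId = userId; this.country = country; }
    }

    // The single output shape that every mapper emits (stands in for a Pair).
    static final class KeyValue {
        final String key; final String value;
        KeyValue(String key, String value) { this.key = key; this.value = value; }
    }

    // Ties the record type read from a path to the mapper applied to it.
    private static final class Binding<I> {
        final Class<I> inputType; final Function<I, KeyValue> mapper;
        Binding(Class<I> inputType, Function<I, KeyValue> mapper) {
            this.inputType = inputType; this.mapper = mapper;
        }
        KeyValue apply(Object datum) {
            // Class.cast throws ClassCastException when the datum has the wrong type.
            return mapper.apply(inputType.cast(datum));
        }
    }

    private final Map<String, Binding<?>> bindings = new HashMap<>();

    // Registers one path with its own input type and mapper.
    <I> void register(String path, Class<I> inputType, Function<I, KeyValue> mapper) {
        bindings.put(path, new Binding<>(inputType, mapper));
    }

    // Routes a datum to the mapper registered for the path it was read from.
    KeyValue map(String path, Object datum) {
        Binding<?> binding = bindings.get(path);
        if (binding == null) {
            throw new IllegalArgumentException("No mapper registered for " + path);
        }
        return binding.apply(datum);
    }

    // Builds a registry with two paths whose mappers share the KeyValue output.
    static MultipleInputsSketch example() {
        MultipleInputsSketch inputs = new MultipleInputsSketch();
        inputs.register("/data/type1", Type1Record.class,
                r -> new KeyValue(r.id, "clicks=" + r.clicks));
        inputs.register("/data/type2", Type2Record.class,
                r -> new KeyValue(r.userId, "country=" + r.country));
        return inputs;
    }
}
```

Because register fixes the output type to KeyValue, a mapper that emitted anything else would not compile. In a real job this constraint is not checked by the compiler in the same way, which is presumably why the note above asks you to keep the mapper outputs aligned with the configured output schema.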
Constructor Summary
Constructors
AvroMultipleInputs()
Method Summary
Modifier and Type: static void
Method: addInputPath(JobConf conf, Path path, Class<? extends AvroMapper> mapperClass, Schema inputSchema)
Constructor Details
AvroMultipleInputs
public AvroMultipleInputs()
Method Details
addInputPath
public static void addInputPath(JobConf conf, Path path, Class<? extends AvroMapper> mapperClass, Schema inputSchema)
Parameters:
conf - The configuration of the job
path - Path to be added to the list of inputs for the job
mapperClass - AvroMapper class to use for this path
inputSchema - Schema to use for this path