Interface MapReduce

All Known Implementing Classes:
AddElementsFromHdfs, SampleDataForSplitPoints

public interface MapReduce

This MapReduce interface should be implemented by any Operation that runs a MapReduce job. Additional job initialisation is carried out via a JobInitialiser (see getJobInitialiser()). NOTE: currently the job has to be run as a Hadoop job.

If you want to control the number of mappers and/or reducers, either set the exact number or set a min and/or max value.

See Also:
MapReduce.Builder
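The min/max versus exact task counts described above can be illustrated with a small sketch. This is not Gaffer code; it is a hypothetical helper showing how a store might reconcile an exact numMapTasks with minMapTasks/maxMapTasks bounds (nulls meaning "unset") when configuring the Hadoop job:

```java
// Illustrative sketch, not part of the Gaffer API: reconciling an exact
// mapper count with optional min/max bounds before submitting a job.
public class MapTaskConfig {
    // An exact count wins; otherwise a suggested count (e.g. derived from
    // input size) is clamped into the [min, max] range. Null means "no bound".
    public static int effectiveMapTasks(Integer numMapTasks,
                                        Integer minMapTasks,
                                        Integer maxMapTasks,
                                        int suggested) {
        if (numMapTasks != null) {
            return numMapTasks;  // exact number takes precedence
        }
        int tasks = suggested;
        if (minMapTasks != null && tasks < minMapTasks) {
            tasks = minMapTasks;
        }
        if (maxMapTasks != null && tasks > maxMapTasks) {
            tasks = maxMapTasks;
        }
        return tasks;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMapTasks(null, 4, 16, 2)); // clamped up to the min
        System.out.println(effectiveMapTasks(8, 4, 16, 2));    // exact count wins
    }
}
```

The same logic would apply symmetrically to the reduce-task getters and setters.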
-
-
Nested Class Summary

Nested Classes:
static interface  MapReduce.Builder<OP extends MapReduce,B extends MapReduce.Builder<OP,?>>
-
Method Summary

All Methods  Instance Methods  Abstract Methods  Default Methods

Modifier and Type                                           Method and Description
default void                                                addInputMapperPair(String inputPath, String mapperGeneratorClassName)
default void                                                addInputMapperPairs(Map<String,String> inputMapperPairs)
String[]                                                    getCommandLineArgs()
Map<String,String>                                          getInputMapperPairs()
JobInitialiser                                              getJobInitialiser()
                                                            A job initialiser allows additional job initialisation to be carried out in addition to that done by the store.
Integer                                                     getMaxMapTasks()
Integer                                                     getMaxReduceTasks()
Integer                                                     getMinMapTasks()
Integer                                                     getMinReduceTasks()
Integer                                                     getNumMapTasks()
String                                                      getOutputPath()
Class<? extends org.apache.hadoop.mapreduce.Partitioner>    getPartitioner()
String                                                      getSplitsFilePath()
boolean                                                     isUseProvidedSplits()
void                                                        setCommandLineArgs(String... commandLineArgs)
void                                                        setInputMapperPairs(Map<String,String> inputMapperPairs)
void                                                        setJobInitialiser(JobInitialiser jobInitialiser)
void                                                        setMaxMapTasks(Integer maxMapTasks)
void                                                        setMaxReduceTasks(Integer maxReduceTasks)
void                                                        setMinMapTasks(Integer minMapTasks)
void                                                        setMinReduceTasks(Integer minReduceTasks)
void                                                        setNumMapTasks(Integer numMapTasks)
void                                                        setOutputPath(String outputPath)
void                                                        setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
void                                                        setSplitsFilePath(String splitsFile)
void                                                        setUseProvidedSplits(boolean useProvidedSplits)
-
-
-
Method Detail
-
addInputMapperPair
default void addInputMapperPair(String inputPath, String mapperGeneratorClassName)
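The likely semantics of the default addInputMapperPair/addInputMapperPairs methods can be sketched with stand-in types. This is not the real Gaffer interface, and the generator class name is hypothetical; it only illustrates the idea that each HDFS input path maps to the name of the MapperGenerator class that should process it:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the MapReduce interface's input-mapper-pair methods:
// each input path is keyed to a mapper generator class name.
interface InputMapperPairs {
    Map<String, String> getInputMapperPairs();
    void setInputMapperPairs(Map<String, String> inputMapperPairs);

    default void addInputMapperPair(String inputPath, String mapperGeneratorClassName) {
        if (getInputMapperPairs() == null) {
            setInputMapperPairs(new LinkedHashMap<>());  // lazily create the map
        }
        getInputMapperPairs().put(inputPath, mapperGeneratorClassName);
    }

    default void addInputMapperPairs(Map<String, String> inputMapperPairs) {
        inputMapperPairs.forEach(this::addInputMapperPair);
    }
}

public class InputMapperPairsDemo implements InputMapperPairs {
    private Map<String, String> pairs;

    @Override public Map<String, String> getInputMapperPairs() { return pairs; }
    @Override public void setInputMapperPairs(Map<String, String> p) { this.pairs = p; }

    public static void main(String[] args) {
        InputMapperPairsDemo op = new InputMapperPairsDemo();
        // "com.example.MyMapperGenerator" is a hypothetical class name.
        op.addInputMapperPair("/data/input1", "com.example.MyMapperGenerator");
        System.out.println(op.getInputMapperPairs());
    }
}
```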
-
getOutputPath
String getOutputPath()
-
setOutputPath
void setOutputPath(String outputPath)
-
getJobInitialiser
JobInitialiser getJobInitialiser()
A job initialiser allows additional job initialisation to be carried out in addition to that done by the store. Most stores will probably require the job input to be configured in this initialiser, as this is specific to the type of data stored in HDFS. For Avro data see AvroJobInitialiser. For text data see TextJobInitialiser.

Returns:
the job initialiser
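The role of a job initialiser can be sketched with stand-in types. These are not the real Gaffer or Hadoop classes: the job is reduced to a property map so the sketch runs without Hadoop, and TextStyleInitialiser is a hypothetical analogue of TextJobInitialiser that registers a text input format:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Gaffer's JobInitialiser: performs extra configuration on a
// job beyond what the store itself sets up.
interface JobInitialiser {
    void initialiseJob(Map<String, String> jobConf);
}

// Analogue of a TextJobInitialiser: in a real Hadoop job this step would be
// job.setInputFormatClass(TextInputFormat.class); here it just records the
// equivalent configuration key.
class TextStyleInitialiser implements JobInitialiser {
    @Override
    public void initialiseJob(Map<String, String> jobConf) {
        jobConf.put("mapreduce.job.inputformat.class",
                "org.apache.hadoop.mapreduce.lib.input.TextInputFormat");
    }
}

public class JobInitialiserDemo {
    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        new TextStyleInitialiser().initialiseJob(jobConf);
        System.out.println(jobConf.get("mapreduce.job.inputformat.class"));
    }
}
```

An Avro-flavoured initialiser would follow the same pattern but configure an Avro input format and reader schema instead.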
-
setJobInitialiser
void setJobInitialiser(JobInitialiser jobInitialiser)
-
getNumMapTasks
Integer getNumMapTasks()
-
setNumMapTasks
void setNumMapTasks(Integer numMapTasks)
-
getMinMapTasks
Integer getMinMapTasks()
-
setMinMapTasks
void setMinMapTasks(Integer minMapTasks)
-
getMaxMapTasks
Integer getMaxMapTasks()
-
setMaxMapTasks
void setMaxMapTasks(Integer maxMapTasks)
-
getMinReduceTasks
Integer getMinReduceTasks()
-
setMinReduceTasks
void setMinReduceTasks(Integer minReduceTasks)
-
getMaxReduceTasks
Integer getMaxReduceTasks()
-
setMaxReduceTasks
void setMaxReduceTasks(Integer maxReduceTasks)
-
isUseProvidedSplits
boolean isUseProvidedSplits()
-
setUseProvidedSplits
void setUseProvidedSplits(boolean useProvidedSplits)
-
getSplitsFilePath
String getSplitsFilePath()
-
setSplitsFilePath
void setSplitsFilePath(String splitsFile)
-
getPartitioner
Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
-
setPartitioner
void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
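To show why an operation might supply a custom Partitioner alongside split points, here is a hedged, self-contained sketch. It is not Gaffer's partitioner and does not extend the real org.apache.hadoop.mapreduce.Partitioner (to stay runnable without Hadoop); it mirrors the getPartition contract, routing each key to a reducer based on sorted split points so reducer output aligns with the store's region boundaries:

```java
import java.util.Arrays;

// Illustrative split-point partitioner: keys below the first split go to
// partition 0, keys between splits i-1 and i go to partition i, and so on.
public class SplitPointPartitioner {
    private final String[] splits;  // sorted split points

    public SplitPointPartitioner(String[] splits) {
        this.splits = splits;
    }

    // Mirrors Partitioner.getPartition(key, value, numPartitions): binary-search
    // the splits and clamp to the available number of partitions.
    public int getPartition(String key, int numPartitions) {
        int idx = Arrays.binarySearch(splits, key);
        int partition = idx >= 0 ? idx + 1 : -(idx + 1);
        return Math.min(partition, numPartitions - 1);
    }

    public static void main(String[] args) {
        SplitPointPartitioner p = new SplitPointPartitioner(new String[]{"g", "p"});
        System.out.println(p.getPartition("a", 3)); // before "g" -> partition 0
        System.out.println(p.getPartition("k", 3)); // between "g" and "p" -> 1
        System.out.println(p.getPartition("z", 3)); // after "p" -> 2
    }
}
```

When isUseProvidedSplits() is true, the split points for such a partitioner would be read from the file given by getSplitsFilePath() rather than computed by the store.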
-
getCommandLineArgs
String[] getCommandLineArgs()
-
setCommandLineArgs
void setCommandLineArgs(String... commandLineArgs)
-
-