Interface MapReduce

  • All Known Implementing Classes:
    AddElementsFromHdfs, SampleDataForSplitPoints

    public interface MapReduce
    This MapReduce interface should be implemented by any Operation that runs MapReduce jobs. See also JobInitialiser.

    NOTE - currently this job has to be run as a Hadoop job.

    If you want to specify the number of mappers and/or the number of reducers then either set the exact number or set a min and/or max value.
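
As a rough illustration of the "exact number versus min/max" choice, the sketch below uses a hypothetical stand-in class (TaskCountSketch, not part of the library). Its resolve method is only an assumption about how a store might combine the values returned by getNumMapTasks, getMinMapTasks and getMaxMapTasks with its own estimate; the actual behaviour depends on the store that runs the operation.

```java
// Hypothetical stand-in; not the library's implementation.
final class TaskCountSketch {
    private TaskCountSketch() { }

    /**
     * Picks a task count. An exact value wins if it is set; otherwise the
     * estimate is clamped into [min, max], where null means "no bound".
     */
    static int resolve(Integer exact, Integer min, Integer max, int estimate) {
        if (exact != null) {
            return exact;
        }
        int result = estimate;
        if (min != null && result < min) {
            result = min;
        }
        if (max != null && result > max) {
            result = max;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(resolve(10, null, null, 4));   // exact value set
        System.out.println(resolve(null, 5, 20, 4));      // estimate below min
        System.out.println(resolve(null, 5, 20, 50));     // estimate above max
        System.out.println(resolve(null, null, null, 4)); // nothing set
    }
}
```

Using Integer rather than int in the real getters and setters makes "not set" representable as null, which is what the sketch relies on.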

    See Also:
    MapReduce.Builder
    • Method Detail

      • setInputMapperPairs

        void setInputMapperPairs(Map<String,String> inputMapperPairs)
      • addInputMapperPairs

        default void addInputMapperPairs(Map<String,String> inputMapperPairs)
      • addInputMapperPair

        default void addInputMapperPair(String inputPath,
                                        String mapperGeneratorClassName)
      • getOutputPath

        String getOutputPath()
      • setOutputPath

        void setOutputPath(String outputPath)
      • getJobInitialiser

        JobInitialiser getJobInitialiser()
        A job initialiser carries out job initialisation beyond that done by the store. Most stores will probably require the job input to be configured in this initialiser, as this is specific to the type of data stored in HDFS. For Avro data see AvroJobInitialiser. For text data see TextJobInitialiser.
        Returns:
        the job initialiser
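
The division of work described for getJobInitialiser can be pictured with the following minimal sketch. Everything in it is a hypothetical stand-in: the nested Initialiser interface, the configure method and the setting keys are invented for illustration, and a plain map replaces the Hadoop job so that the example runs on its own. It only shows the ordering, with store-side setup first and the optional initialiser afterwards.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins only; a plain map of settings replaces the Hadoop job.
final class JobInitialiserSketch {
    private JobInitialiserSketch() { }

    /** Stand-in for a job initialiser: adds settings the store does not know about. */
    interface Initialiser {
        void initialise(Map<String, String> jobSettings);
    }

    /** Store-side setup runs first, then the optional initialiser. */
    static Map<String, String> configure(Initialiser initialiser) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("job.name", "example-job"); // set by the (stand-in) store
        if (initialiser != null) {
            initialiser.initialise(settings);    // input-specific configuration
        }
        return settings;
    }

    public static void main(String[] args) {
        // An initialiser for text input; the key and value are made up.
        Initialiser textInput = settings -> settings.put("input.format", "text");
        System.out.println(configure(textInput));
    }
}
```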
      • setJobInitialiser

        void setJobInitialiser(JobInitialiser jobInitialiser)
      • getNumMapTasks

        Integer getNumMapTasks()
      • setNumMapTasks

        void setNumMapTasks(Integer numMapTasks)
      • getMinMapTasks

        Integer getMinMapTasks()
      • setMinMapTasks

        void setMinMapTasks(Integer minMapTasks)
      • getMaxMapTasks

        Integer getMaxMapTasks()
      • setMaxMapTasks

        void setMaxMapTasks(Integer maxMapTasks)
      • getMinReduceTasks

        Integer getMinReduceTasks()
      • setMinReduceTasks

        void setMinReduceTasks(Integer minReduceTasks)
      • getMaxReduceTasks

        Integer getMaxReduceTasks()
      • setMaxReduceTasks

        void setMaxReduceTasks(Integer maxReduceTasks)
      • isUseProvidedSplits

        boolean isUseProvidedSplits()
      • setUseProvidedSplits

        void setUseProvidedSplits(boolean useProvidedSplits)
      • getSplitsFilePath

        String getSplitsFilePath()
      • setSplitsFilePath

        void setSplitsFilePath(String splitsFile)
      • getPartitioner

        Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
      • setPartitioner

        void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
      • getCommandLineArgs

        String[] getCommandLineArgs()
      • setCommandLineArgs

        void setCommandLineArgs(String... commandLineArgs)
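
The three input/mapper pair methods at the top of the method list (setInputMapperPairs, addInputMapperPairs, addInputMapperPair) map an input path to the class name of a mapper generator. The sketch below shows one plausible way the two default methods could be built on the setter. It is an assumption, not the library's code: the nested PairHolder and SimpleOperation types are hypothetical stand-ins, the getInputMapperPairs getter does not appear in the list above, and the paths and class names are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in types; not the library's implementation.
final class InputMapperPairsSketch {
    private InputMapperPairsSketch() { }

    interface PairHolder {
        // Assumed getter: it does not appear in the method list above.
        Map<String, String> getInputMapperPairs();

        void setInputMapperPairs(Map<String, String> inputMapperPairs);

        // Assumed semantics: merge into the existing map, creating it if absent.
        default void addInputMapperPairs(Map<String, String> inputMapperPairs) {
            Map<String, String> current = getInputMapperPairs();
            if (current == null) {
                setInputMapperPairs(new HashMap<>(inputMapperPairs));
            } else {
                current.putAll(inputMapperPairs);
            }
        }

        // Assumed semantics: a single pair is added through the bulk method.
        default void addInputMapperPair(String inputPath, String mapperGeneratorClassName) {
            Map<String, String> single = new HashMap<>();
            single.put(inputPath, mapperGeneratorClassName);
            addInputMapperPairs(single);
        }
    }

    /** Smallest possible implementer: only the getter and setter are written by hand. */
    static final class SimpleOperation implements PairHolder {
        private Map<String, String> pairs;

        @Override
        public Map<String, String> getInputMapperPairs() {
            return pairs;
        }

        @Override
        public void setInputMapperPairs(Map<String, String> inputMapperPairs) {
            this.pairs = inputMapperPairs;
        }
    }

    public static void main(String[] args) {
        SimpleOperation op = new SimpleOperation();
        op.addInputMapperPair("/example/input/a", "example.MapperGeneratorA");
        op.addInputMapperPair("/example/input/b", "example.MapperGeneratorB");
        System.out.println(op.getInputMapperPairs().size());
    }
}
```

Declaring the add methods as defaults means an implementing class only has to supply storage for the map, which matches the way the interface marks them as default.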