Class AddElementsFromHdfs

  • All Implemented Interfaces:
    Closeable, AutoCloseable, MapReduce, Operation

    public class AddElementsFromHdfs
    extends Object
    implements Operation, MapReduce
    An AddElementsFromHdfs operation is for adding Elements from HDFS. This operation requires an input path, an output path and a failure path. For each input file you must also provide a MapperGenerator class name as part of a pair (input, mapperGeneratorClassName). In order to be generic and deal with any type of input file you also need to provide a JobInitialiser. You will need to write your own MapperGenerator to convert the input data into Gaffer Elements. This can be as simple as delegating to your ElementGenerator class; however, it can also be more complex and make use of the configuration in MapContext. A construction sketch is given below.

    For normal operation handlers the operation View will be ignored.

    NOTE - currently this job has to be run as a Hadoop job.
    See Also:
    MapReduce, AddElementsFromHdfs.Builder
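
    As a sketch of how these pieces fit together, the following shows one way to construct the operation. The builder method names (addInputMapperPair, outputPath, failurePath, jobInitialiser, validate) reflect the usual AddElementsFromHdfs.Builder API, and the MapperGenerator class name used here is hypothetical; treat this as an illustration rather than a definitive recipe.

        import uk.gov.gchq.gaffer.hdfs.operation.AddElementsFromHdfs;
        import uk.gov.gchq.gaffer.hdfs.operation.handler.job.initialiser.TextJobInitialiser;

        public class AddElementsFromHdfsExample {
            public static AddElementsFromHdfs buildOperation() {
                return new AddElementsFromHdfs.Builder()
                        // Pair each input path with the MapperGenerator that parses it.
                        // com.example.ExampleCsvMapperGenerator is a hypothetical user-supplied class.
                        .addInputMapperPair("/data/input", "com.example.ExampleCsvMapperGenerator")
                        .outputPath("/data/output")       // job output location in HDFS
                        .failurePath("/data/failure")     // elements that fail are written here
                        .jobInitialiser(new TextJobInitialiser()) // configures the job for text input
                        .validate(true)
                        .build();
            }
        }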
    • Constructor Detail

      • AddElementsFromHdfs

        public AddElementsFromHdfs()
    • Method Detail

      • getFailurePath

        public String getFailurePath()
      • setFailurePath

        public void setFailurePath(String failurePath)
      • isValidate

        public boolean isValidate()
      • setValidate

        public void setValidate(boolean validate)
      • getJobInitialiser

        public JobInitialiser getJobInitialiser()
        Description copied from interface: MapReduce
        A job initialiser allows additional job initialisation to be carried out in addition to that done by the store. Most stores will require the job input to be configured in this initialiser, as this is specific to the type of data stored in HDFS. For Avro data see AvroJobInitialiser; for text data see TextJobInitialiser. A sketch of a custom initialiser is given below.
        Specified by:
        getJobInitialiser in interface MapReduce
        Returns:
        the job initialiser
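
        As an illustration, a custom initialiser for line-based text input might look like the sketch below. The initialiseJob signature and the getInputMapperPairs accessor are assumptions based on the MapReduce interface; for plain text input the provided TextJobInitialiser already covers this case.

            import java.io.IOException;

            import org.apache.hadoop.fs.Path;
            import org.apache.hadoop.mapreduce.Job;
            import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
            import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
            import uk.gov.gchq.gaffer.hdfs.operation.MapReduce;
            import uk.gov.gchq.gaffer.hdfs.operation.handler.job.initialiser.JobInitialiser;
            import uk.gov.gchq.gaffer.store.Store;

            public class LineTextJobInitialiser implements JobInitialiser {
                @Override
                public void initialiseJob(final Job job, final MapReduce operation, final Store store)
                        throws IOException {
                    // Configure the job input: read each input path as lines of text.
                    job.setInputFormatClass(TextInputFormat.class);
                    for (final String inputPath : operation.getInputMapperPairs().keySet()) {
                        FileInputFormat.addInputPath(job, new Path(inputPath));
                    }
                }
            }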
      • setUseProvidedSplits

        public void setUseProvidedSplits(boolean useProvidedSplits)
        Specified by:
        setUseProvidedSplits in interface MapReduce
      • getPartitioner

        public Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
        Specified by:
        getPartitioner in interface MapReduce
      • setPartitioner

        public void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
        Specified by:
        setPartitioner in interface MapReduce
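
        Any Hadoop Partitioner class can be supplied here, as the signature above shows. A hedged example using Hadoop's built-in HashPartitioner follows; note that many stores install their own partitioner, so overriding it is only needed for custom setups.

            import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

            // operation is an AddElementsFromHdfs instance built elsewhere.
            operation.setPartitioner(HashPartitioner.class);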
      • getWorkingPath

        public String getWorkingPath()
      • setWorkingPath

        public void setWorkingPath(String workingPath)
      • getOptions

        public Map<String, String> getOptions()
        Specified by:
        getOptions in interface Operation
        Returns:
        the operation options. This may contain store-specific options such as authorisation strings or other properties required for the operation to be executed. Note that these options may not be interpreted in the same way by every store implementation.
      • setOptions

        public void setOptions(Map<String, String> options)
        Specified by:
        setOptions in interface Operation
        Parameters:
        options - the operation options. This may contain store-specific options such as authorisation strings or other properties required for the operation to be executed. Note that these options may not be interpreted in the same way by every store implementation.
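
        For example, options can be attached as plain key/value strings before the operation is executed. The option key below is hypothetical; the keys a store actually recognises are store-specific.

            import java.util.HashMap;
            import java.util.Map;

            final Map<String, String> options = new HashMap<>();
            options.put("store.example.auths", "public"); // hypothetical store-specific key
            operation.setOptions(options);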
      • shallowClone

        public AddElementsFromHdfs shallowClone()
        Description copied from interface: Operation
        Operation implementations should ensure a shallowClone method is implemented. It performs a shallow clone: a new instance is created and the fields are copied across, but the fields themselves are not cloned. If the operation contains nested operations, these must also be cloned.
        Specified by:
        shallowClone in interface Operation
        Returns:
        shallow clone
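
        For illustration, a typical shallowClone implementation rebuilds the operation via its Builder, copying field references without cloning them. The field and builder names below are assumed for the sketch.

            @Override
            public AddElementsFromHdfs shallowClone() {
                // Copy field references into a new instance; nothing is deep-cloned.
                return new AddElementsFromHdfs.Builder()
                        .inputMapperPairs(inputMapperPairs)
                        .outputPath(outputPath)
                        .failurePath(failurePath)
                        .jobInitialiser(jobInitialiser)
                        .validate(validate)
                        .options(options)
                        .build();
            }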