Class AddElementsFromHdfs
- java.lang.Object
-
- uk.gov.gchq.gaffer.hdfs.operation.AddElementsFromHdfs
-
- All Implemented Interfaces:
Closeable, AutoCloseable, MapReduce, Operation
public class AddElementsFromHdfs extends Object implements Operation, MapReduce
An AddElementsFromHdfs operation is for adding Elements from HDFS. This operation requires an input, output and failure path. For each input file you must also provide a MapperGenerator class name as part of a pair (input, mapperGeneratorClassName). In order to be generic and deal with any type of input file you also need to provide a JobInitialiser. You will need to write your own MapperGenerator to convert the input data into Gaffer Elements. This can be as simple as delegating to your ElementGenerator class, however it can be more complex and make use of the configuration in MapContext.
NOTE - currently this job has to be run as a Hadoop job. For normal operation handlers the operation View will be ignored.
- See Also:
MapReduce, AddElementsFromHdfs.Builder
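As a hedged configuration sketch (it assumes the gaffer-hdfs module is on the classpath; the paths and the TextMapperGeneratorImpl generator class are illustrative placeholders, not part of this API):

```java
// Sketch only: requires the gaffer-hdfs module; will not compile standalone.
// Paths and TextMapperGeneratorImpl are hypothetical placeholders.
final AddElementsFromHdfs operation = new AddElementsFromHdfs.Builder()
        .addInputMapperPair("/data/input", TextMapperGeneratorImpl.class.getName())
        .outputPath("/data/output")
        .failurePath("/data/failure")
        .jobInitialiser(new TextJobInitialiser())
        .build();
```

The built operation would then typically be submitted to a Graph for execution, e.g. graph.execute(operation, user), against a store that supports HDFS ingest.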
-
-
Nested Class Summary
Nested Classes
static class AddElementsFromHdfs.Builder
Nested classes/interfaces inherited from interface uk.gov.gchq.gaffer.operation.Operation
Operation.BaseBuilder<OP extends Operation,B extends Operation.BaseBuilder<OP,?>>
-
-
Constructor Summary
Constructors
AddElementsFromHdfs()
-
Method Summary
All Methods / Instance Methods / Concrete Methods
String[] getCommandLineArgs()
String getFailurePath()
Map<String,String> getInputMapperPairs()
JobInitialiser getJobInitialiser()
  A job initialiser allows additional job initialisation to be carried out in addition to that done by the store.
Integer getMaxMapTasks()
Integer getMaxReduceTasks()
Integer getMinMapTasks()
Integer getMinReduceTasks()
Integer getNumMapTasks()
Map<String,String> getOptions()
String getOutputPath()
Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
String getSplitsFilePath()
String getWorkingPath()
boolean isUseProvidedSplits()
boolean isValidate()
void setCommandLineArgs(String[] commandLineArgs)
void setFailurePath(String failurePath)
void setInputMapperPairs(Map<String,String> inputMapperPairs)
void setJobInitialiser(JobInitialiser jobInitialiser)
void setMaxMapTasks(Integer maxMapTasks)
void setMaxReduceTasks(Integer maxReduceTasks)
void setMinMapTasks(Integer minMapTasks)
void setMinReduceTasks(Integer minReduceTasks)
void setNumMapTasks(Integer numMapTasks)
void setOptions(Map<String,String> options)
void setOutputPath(String outputPath)
void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
void setSplitsFilePath(String splitsFilePath)
void setUseProvidedSplits(boolean useProvidedSplits)
void setValidate(boolean validate)
void setWorkingPath(String workingPath)
AddElementsFromHdfs shallowClone()
  Operation implementations should ensure a shallowClone method is implemented.
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface uk.gov.gchq.gaffer.hdfs.operation.MapReduce
addInputMapperPair, addInputMapperPairs
-
Methods inherited from interface uk.gov.gchq.gaffer.operation.Operation
_getNullOrOptions, addOption, close, containsOption, getOption, getOption, validate, validateRequiredFieldPresent
-
-
-
-
Method Detail
-
getFailurePath
public String getFailurePath()
-
setFailurePath
public void setFailurePath(String failurePath)
-
isValidate
public boolean isValidate()
-
setValidate
public void setValidate(boolean validate)
-
getInputMapperPairs
public Map<String,String> getInputMapperPairs()
- Specified by:
getInputMapperPairs in interface MapReduce
-
setInputMapperPairs
public void setInputMapperPairs(Map<String,String> inputMapperPairs)
- Specified by:
setInputMapperPairs in interface MapReduce
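For illustration, the input mapper pairs are simply a map from an HDFS input path to the fully qualified class name of the MapperGenerator that should handle files under that path. A minimal, Gaffer-free sketch of building such a map (the paths and generator class names are hypothetical placeholders):

```java
import java.util.HashMap;
import java.util.Map;

public class InputMapperPairsExample {
    // Build the map that setInputMapperPairs(Map<String,String>) expects:
    // each HDFS input path maps to the MapperGenerator class name that parses it.
    // The paths and class names below are hypothetical placeholders.
    static Map<String, String> buildInputMapperPairs() {
        final Map<String, String> pairs = new HashMap<>();
        pairs.put("/data/edges", "com.example.generator.EdgeCsvMapperGenerator");
        pairs.put("/data/entities", "com.example.generator.EntityCsvMapperGenerator");
        return pairs;
    }

    public static void main(final String[] args) {
        System.out.println(buildInputMapperPairs().size()); // prints 2
    }
}
```

This lets a single operation ingest differently formatted inputs, since each path can be paired with a different generator.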
-
getOutputPath
public String getOutputPath()
- Specified by:
getOutputPath in interface MapReduce
-
setOutputPath
public void setOutputPath(String outputPath)
- Specified by:
setOutputPath in interface MapReduce
-
getJobInitialiser
public JobInitialiser getJobInitialiser()
Description copied from interface: MapReduce
A job initialiser allows additional job initialisation to be carried out in addition to that done by the store. Most stores will probably require the Job Input to be configured in this initialiser as this is specific to the type of data store in HDFS. For Avro data see AvroJobInitialiser. For Text data see TextJobInitialiser.
- Specified by:
getJobInitialiser in interface MapReduce
- Returns:
- the job initialiser
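As a hedged sketch (again assuming the gaffer-hdfs module; the schema path is a hypothetical placeholder), choosing an initialiser for the input format might look like:

```java
// Sketch only: TextJobInitialiser and AvroJobInitialiser come from the
// gaffer-hdfs module; this will not compile without it on the classpath.
operation.setJobInitialiser(new TextJobInitialiser()); // for text input files

// For Avro input, an AvroJobInitialiser would be used instead,
// configured with the path to the Avro schema (placeholder path):
// operation.setJobInitialiser(new AvroJobInitialiser("/data/schema.avsc"));
```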
-
setJobInitialiser
public void setJobInitialiser(JobInitialiser jobInitialiser)
- Specified by:
setJobInitialiser in interface MapReduce
-
getNumMapTasks
public Integer getNumMapTasks()
- Specified by:
getNumMapTasks in interface MapReduce
-
setNumMapTasks
public void setNumMapTasks(Integer numMapTasks)
- Specified by:
setNumMapTasks in interface MapReduce
-
getMinMapTasks
public Integer getMinMapTasks()
- Specified by:
getMinMapTasks in interface MapReduce
-
setMinMapTasks
public void setMinMapTasks(Integer minMapTasks)
- Specified by:
setMinMapTasks in interface MapReduce
-
getMaxMapTasks
public Integer getMaxMapTasks()
- Specified by:
getMaxMapTasks in interface MapReduce
-
setMaxMapTasks
public void setMaxMapTasks(Integer maxMapTasks)
- Specified by:
setMaxMapTasks in interface MapReduce
-
getMinReduceTasks
public Integer getMinReduceTasks()
- Specified by:
getMinReduceTasks in interface MapReduce
-
setMinReduceTasks
public void setMinReduceTasks(Integer minReduceTasks)
- Specified by:
setMinReduceTasks in interface MapReduce
-
getMaxReduceTasks
public Integer getMaxReduceTasks()
- Specified by:
getMaxReduceTasks in interface MapReduce
-
setMaxReduceTasks
public void setMaxReduceTasks(Integer maxReduceTasks)
- Specified by:
setMaxReduceTasks in interface MapReduce
-
isUseProvidedSplits
public boolean isUseProvidedSplits()
- Specified by:
isUseProvidedSplits in interface MapReduce
-
setUseProvidedSplits
public void setUseProvidedSplits(boolean useProvidedSplits)
- Specified by:
setUseProvidedSplits in interface MapReduce
-
getSplitsFilePath
public String getSplitsFilePath()
- Specified by:
getSplitsFilePath in interface MapReduce
-
setSplitsFilePath
public void setSplitsFilePath(String splitsFilePath)
- Specified by:
setSplitsFilePath in interface MapReduce
-
getPartitioner
public Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
- Specified by:
getPartitioner in interface MapReduce
-
setPartitioner
public void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
- Specified by:
setPartitioner in interface MapReduce
-
getCommandLineArgs
public String[] getCommandLineArgs()
- Specified by:
getCommandLineArgs in interface MapReduce
-
setCommandLineArgs
public void setCommandLineArgs(String[] commandLineArgs)
- Specified by:
setCommandLineArgs in interface MapReduce
-
getWorkingPath
public String getWorkingPath()
-
setWorkingPath
public void setWorkingPath(String workingPath)
-
getOptions
public Map<String,String> getOptions()
- Specified by:
getOptions in interface Operation
- Returns:
- the operation options. This may contain store-specific options such as authorisation strings or other properties required for the operation to be executed. Note these options will probably not be interpreted in the same way by every store implementation.
-
setOptions
public void setOptions(Map<String,String> options)
- Specified by:
setOptions in interface Operation
- Parameters:
options - the operation options. This may contain store-specific options such as authorisation strings or other properties required for the operation to be executed. Note these options will probably not be interpreted in the same way by every store implementation.
-
shallowClone
public AddElementsFromHdfs shallowClone()
Description copied from interface: Operation
Operation implementations should ensure a shallowClone method is implemented. Performs a shallow clone: creates a new instance and copies the fields across. It does not clone the fields. If the operation contains nested operations, these must also be cloned.
- Specified by:
shallowClone in interface Operation
- Returns:
- shallow clone
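The shallow-clone contract can be illustrated with a plain-Java sketch. The SimpleOp class below is a hypothetical stand-in, not part of Gaffer: the point is that the clone is a new instance, but its fields are the same references as the original's.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in illustrating the shallowClone() contract:
// a new instance is created and field references are copied, not deep-copied.
public class SimpleOp {
    Map<String, String> options = new HashMap<>();
    String outputPath;

    SimpleOp shallowClone() {
        final SimpleOp clone = new SimpleOp();
        clone.options = this.options;       // same Map reference, not a copy
        clone.outputPath = this.outputPath; // String field copied across
        return clone;
    }

    public static void main(final String[] args) {
        final SimpleOp op = new SimpleOp();
        op.outputPath = "/data/output";
        final SimpleOp clone = op.shallowClone();
        System.out.println(clone != op);                  // prints true: distinct instance
        System.out.println(clone.options == op.options);  // prints true: shared field reference
    }
}
```

Because fields are shared, mutating the clone's options map also mutates the original's, which is why nested operations must be cloned explicitly per the contract above.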
-
-