Class SampleDataForSplitPoints
- java.lang.Object
-
- uk.gov.gchq.gaffer.hdfs.operation.SampleDataForSplitPoints
-
- All Implemented Interfaces:
Closeable, AutoCloseable, MapReduce, Operation
public class SampleDataForSplitPoints extends Object implements Operation, MapReduce
The SampleDataForSplitPoints operation is for creating a splits file, either for use in a SplitStoreFromFile operation or an AddElementsFromHdfs operation. This operation requires an input and output path, as well as a path to a file to use as the resulting splits file. For each input file you must also provide a MapperGenerator class name as part of a pair (input, mapperGeneratorClassName). In order to be generic and deal with any type of input file you also need to provide a JobInitialiser. NOTE: currently this job has to be run as a Hadoop job.
- See Also:
SampleDataForSplitPoints.Builder
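For example, the operation can be constructed with its Builder. This is a minimal sketch, assuming the fluent builder methods mirror the setters documented below (exact builder method names may vary between Gaffer versions); the input paths and the MapperGenerator class name are illustrative:

```java
import uk.gov.gchq.gaffer.hdfs.operation.SampleDataForSplitPoints;
import uk.gov.gchq.gaffer.hdfs.operation.handler.job.initialiser.TextJobInitialiser;

// Sketch only: builder method names are assumed to mirror the setters
// below; "com.example.MyTextMapperGenerator" is a hypothetical
// user-supplied MapperGenerator implementation.
final SampleDataForSplitPoints operation = new SampleDataForSplitPoints.Builder()
        .addInputMapperPair("/data/input", "com.example.MyTextMapperGenerator")
        .outputPath("/data/output")
        .splitsFilePath("/data/splits/sample.splits")
        .jobInitialiser(new TextJobInitialiser())
        .proportionToSample(0.01f)
        .validate(true)
        .build();
```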
-
-
Nested Class Summary
static class SampleDataForSplitPoints.Builder
-
Nested classes/interfaces inherited from interface uk.gov.gchq.gaffer.operation.Operation
Operation.BaseBuilder<OP extends Operation,B extends Operation.BaseBuilder<OP,?>>
-
-
Constructor Summary
SampleDataForSplitPoints()
-
Method Summary
String[] getCommandLineArgs()
Class<? extends org.apache.hadoop.io.compress.CompressionCodec> getCompressionCodec()
Map<String,String> getInputMapperPairs()
JobInitialiser getJobInitialiser() - A job initialiser allows additional job initialisation to be carried out in addition to that done by the store.
Integer getMaxMapTasks()
Integer getMaxReduceTasks()
Integer getMinMapTasks()
Integer getMinReduceTasks()
Integer getNumMapTasks()
Integer getNumSplits()
Map<String,String> getOptions()
String getOutputPath()
Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
float getProportionToSample()
String getSplitsFilePath()
boolean isUseProvidedSplits()
boolean isValidate()
void setCommandLineArgs(String[] commandLineArgs)
void setCompressionCodec(Class<? extends org.apache.hadoop.io.compress.CompressionCodec> compressionCodec)
void setInputMapperPairs(Map<String,String> inputMapperPairs)
void setJobInitialiser(JobInitialiser jobInitialiser)
void setMaxMapTasks(Integer maxMapTasks)
void setMaxReduceTasks(Integer maxReduceTasks)
void setMinMapTasks(Integer minMapTasks)
void setMinReduceTasks(Integer minReduceTasks)
void setNumMapTasks(Integer numMapTasks)
void setNumSplits(Integer numSplits)
void setOptions(Map<String,String> options)
void setOutputPath(String outputPath)
void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
void setProportionToSample(float proportionToSample)
void setSplitsFilePath(String splitsFilePath)
void setUseProvidedSplits(boolean useProvidedSplits)
void setValidate(boolean validate)
SampleDataForSplitPoints shallowClone() - Operation implementations should ensure a shallowClone method is implemented.
uk.gov.gchq.koryphe.ValidationResult validate() - Validates an operation.
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface uk.gov.gchq.gaffer.hdfs.operation.MapReduce
addInputMapperPair, addInputMapperPairs
-
Methods inherited from interface uk.gov.gchq.gaffer.operation.Operation
_getNullOrOptions, addOption, close, containsOption, getOption, getOption, validateRequiredFieldPresent
-
-
-
-
Method Detail
-
validate
public uk.gov.gchq.koryphe.ValidationResult validate()
Description copied from interface: Operation
Validates an operation. This should be used to validate that fields have been configured correctly. By default no validation is applied. Override this method to implement validation.
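As an illustrative sketch, a caller can inspect the result before executing the operation; this assumes the koryphe ValidationResult accessors isValid() and getErrorString():

```java
import uk.gov.gchq.koryphe.ValidationResult;

final ValidationResult result = operation.validate();
if (!result.isValid()) {
    // getErrorString() aggregates the individual validation failures.
    throw new IllegalArgumentException("Invalid operation: " + result.getErrorString());
}
```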
-
isValidate
public boolean isValidate()
-
setValidate
public void setValidate(boolean validate)
-
getSplitsFilePath
public String getSplitsFilePath()
- Specified by:
getSplitsFilePath in interface MapReduce
-
setSplitsFilePath
public void setSplitsFilePath(String splitsFilePath)
- Specified by:
setSplitsFilePath in interface MapReduce
-
getNumSplits
public Integer getNumSplits()
-
setNumSplits
public void setNumSplits(Integer numSplits)
-
getProportionToSample
public float getProportionToSample()
-
setProportionToSample
public void setProportionToSample(float proportionToSample)
-
getInputMapperPairs
public Map<String,String> getInputMapperPairs()
- Specified by:
getInputMapperPairs in interface MapReduce
-
setInputMapperPairs
public void setInputMapperPairs(Map<String,String> inputMapperPairs)
- Specified by:
setInputMapperPairs in interface MapReduce
-
getOutputPath
public String getOutputPath()
- Specified by:
getOutputPath in interface MapReduce
-
setOutputPath
public void setOutputPath(String outputPath)
- Specified by:
setOutputPath in interface MapReduce
-
getJobInitialiser
public JobInitialiser getJobInitialiser()
Description copied from interface: MapReduce
A job initialiser allows additional job initialisation to be carried out in addition to that done by the store. Most stores will probably require the Job Input to be configured in this initialiser as this is specific to the type of data store in HDFS. For Avro data see AvroJobInitialiser. For Text data see TextJobInitialiser.
- Specified by:
getJobInitialiser in interface MapReduce
- Returns:
- the job initialiser
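For instance, a job reading plain text input files might be configured as below. This is a sketch assuming TextJobInitialiser provides a no-argument constructor; AvroJobInitialiser would be the analogue for Avro data:

```java
import uk.gov.gchq.gaffer.hdfs.operation.handler.job.initialiser.TextJobInitialiser;

// Assumption: TextJobInitialiser has a no-argument constructor that
// configures the job's input format for text files.
operation.setJobInitialiser(new TextJobInitialiser());
```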
-
setJobInitialiser
public void setJobInitialiser(JobInitialiser jobInitialiser)
- Specified by:
setJobInitialiser in interface MapReduce
-
getNumMapTasks
public Integer getNumMapTasks()
- Specified by:
getNumMapTasks in interface MapReduce
-
setNumMapTasks
public void setNumMapTasks(Integer numMapTasks)
- Specified by:
setNumMapTasks in interface MapReduce
-
getMinMapTasks
public Integer getMinMapTasks()
- Specified by:
getMinMapTasks in interface MapReduce
-
setMinMapTasks
public void setMinMapTasks(Integer minMapTasks)
- Specified by:
setMinMapTasks in interface MapReduce
-
getMaxMapTasks
public Integer getMaxMapTasks()
- Specified by:
getMaxMapTasks in interface MapReduce
-
setMaxMapTasks
public void setMaxMapTasks(Integer maxMapTasks)
- Specified by:
setMaxMapTasks in interface MapReduce
-
getMinReduceTasks
public Integer getMinReduceTasks()
- Specified by:
getMinReduceTasks in interface MapReduce
-
setMinReduceTasks
public void setMinReduceTasks(Integer minReduceTasks)
- Specified by:
setMinReduceTasks in interface MapReduce
-
getMaxReduceTasks
public Integer getMaxReduceTasks()
- Specified by:
getMaxReduceTasks in interface MapReduce
-
setMaxReduceTasks
public void setMaxReduceTasks(Integer maxReduceTasks)
- Specified by:
setMaxReduceTasks in interface MapReduce
-
isUseProvidedSplits
public boolean isUseProvidedSplits()
- Specified by:
isUseProvidedSplits in interface MapReduce
-
setUseProvidedSplits
public void setUseProvidedSplits(boolean useProvidedSplits)
- Specified by:
setUseProvidedSplits in interface MapReduce
-
getPartitioner
public Class<? extends org.apache.hadoop.mapreduce.Partitioner> getPartitioner()
- Specified by:
getPartitioner in interface MapReduce
-
setPartitioner
public void setPartitioner(Class<? extends org.apache.hadoop.mapreduce.Partitioner> partitioner)
- Specified by:
setPartitioner in interface MapReduce
-
getCommandLineArgs
public String[] getCommandLineArgs()
- Specified by:
getCommandLineArgs in interface MapReduce
-
setCommandLineArgs
public void setCommandLineArgs(String[] commandLineArgs)
- Specified by:
setCommandLineArgs in interface MapReduce
-
getCompressionCodec
public Class<? extends org.apache.hadoop.io.compress.CompressionCodec> getCompressionCodec()
-
setCompressionCodec
public void setCompressionCodec(Class<? extends org.apache.hadoop.io.compress.CompressionCodec> compressionCodec)
-
getOptions
public Map<String,String> getOptions()
- Specified by:
getOptions in interface Operation
- Returns:
- the operation options. This may contain store specific options such as authorisation strings or other properties required for the operation to be executed. Note these options will probably not be interpreted in the same way by every store implementation.
-
setOptions
public void setOptions(Map<String,String> options)
- Specified by:
setOptions in interface Operation
- Parameters:
options - the operation options. This may contain store specific options such as authorisation strings or other properties required for the operation to be executed. Note these options will probably not be interpreted in the same way by every store implementation.
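A short sketch of supplying options; the option key used here is purely illustrative, not a documented Gaffer key:

```java
import java.util.HashMap;
import java.util.Map;

final Map<String, String> options = new HashMap<>();
// "hypothetical.visibility" is an illustrative key, not a real option.
options.put("hypothetical.visibility", "public");
operation.setOptions(options);
```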
-
shallowClone
public SampleDataForSplitPoints shallowClone()
Description copied from interface: Operation
Operation implementations should ensure a shallowClone method is implemented. Performs a shallow clone. Creates a new instance and copies the fields across. It does not clone the fields. If the operation contains nested operations, these must also be cloned.
- Specified by:
shallowClone in interface Operation
- Returns:
- shallow clone
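To illustrate the shallow-copy semantics described above (a sketch using fields documented on this page):

```java
final SampleDataForSplitPoints clone = operation.shallowClone();

// Primitive fields are copied by value, so this does not affect the
// original operation.
clone.setProportionToSample(0.5f);

// The fields themselves are not cloned, so mutable values such as the
// input mapper pairs map may still be shared with the original instance.
```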
-
-