Class GafferRangePartitioner

  • All Implemented Interfaces:
    org.apache.hadoop.conf.Configurable

    public class GafferRangePartitioner
    extends org.apache.hadoop.mapreduce.Partitioner<org.apache.hadoop.io.Text,org.apache.hadoop.io.Writable>
    implements org.apache.hadoop.conf.Configurable
    A copy of Accumulo's RangePartitioner, with a fix for opening the cut points file.
    • Constructor Detail

      • GafferRangePartitioner

        public GafferRangePartitioner()
    • Method Detail

      • getPartition

        public int getPartition(org.apache.hadoop.io.Text key,
                                org.apache.hadoop.io.Writable value,
                                int numPartitions)
        Specified by:
        getPartition in class org.apache.hadoop.mapreduce.Partitioner<org.apache.hadoop.io.Text,org.apache.hadoop.io.Writable>
      • getConf

        public org.apache.hadoop.conf.Configuration getConf()
        Specified by:
        getConf in interface org.apache.hadoop.conf.Configurable
      • setConf

        public void setConf(org.apache.hadoop.conf.Configuration conf)
        Specified by:
        setConf in interface org.apache.hadoop.conf.Configurable
      • setSplitFile

        public static void setSplitFile(org.apache.hadoop.mapreduce.Job job,
                                        String file)
        Sets the name of the HDFS file to use. The file contains a newline-separated list of Base64-encoded split points that define the ranges used for partitioning.
        Parameters:
        job - the job
        file - the splits file
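        The file layout described above (one Base64-encoded split point per line) can be illustrated with a minimal sketch. It uses only the JDK and does not touch HDFS or this class; SplitFileSketch, encodeSplits and decodeSplits are hypothetical names, and the UTF-8 encoding of the split points is an assumption made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class SplitFileSketch {

    // Encodes each raw split point as Base64 and ends each one with a newline,
    // giving the "newline-separated list of Base64-encoded split points" layout.
    // Assumption: split points are treated as UTF-8 strings.
    static String encodeSplits(List<String> splitPoints) {
        StringBuilder sb = new StringBuilder();
        for (String split : splitPoints) {
            byte[] raw = split.getBytes(StandardCharsets.UTF_8);
            sb.append(Base64.getEncoder().encodeToString(raw));
            sb.append('\n');
        }
        return sb.toString();
    }

    // Reverses encodeSplits: reads one Base64 line per split point,
    // skipping empty lines.
    static List<String> decodeSplits(String fileContents) {
        List<String> result = new ArrayList<>();
        for (String line : fileContents.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            byte[] raw = Base64.getDecoder().decode(line);
            result.add(new String(raw, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        String contents = encodeSplits(List.of("g", "n", "t"));
        System.out.print(contents);                  // Zw==, bg==, dA== on separate lines
        System.out.println(decodeSplits(contents));  // [g, n, t]
    }
}
```

        Text produced this way would then be written to HDFS, and its path passed as the file argument.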
      • setNumSubBins

        public static void setNumSubBins(org.apache.hadoop.mapreduce.Job job,
                                         int num)
        Sets the number of random sub-bins per range.
        Parameters:
        job - the job
        num - the number of sub-bins
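        To show how split points and sub-bins could combine into a partition number, here is a minimal sketch. It is not this class's code: SubBinSketch and findPartition are hypothetical names, plain String keys stand in for org.apache.hadoop.io.Text, and a hash of the key is used as a deterministic stand-in for the "random" sub-bin choice. With s split points there are s + 1 ranges, so the sketch yields (s + 1) * numSubBins partitions.

```java
import java.util.Arrays;

public class SubBinSketch {

    // Finds the range a key falls into by binary search over the sorted
    // split points, then spreads keys within that range over numSubBins
    // partitions. Hypothetical helper, not the library's implementation.
    static int findPartition(String key, String[] sortedSplits, int numSubBins) {
        int index = Arrays.binarySearch(sortedSplits, key);
        // A negative result encodes the insertion point as -(point) - 1.
        int range = index < 0 ? -(index + 1) : index;
        if (numSubBins < 2) {
            return range;
        }
        // Assumption: a non-negative hash of the key selects the sub-bin.
        int subBin = (key.hashCode() & Integer.MAX_VALUE) % numSubBins;
        return range * numSubBins + subBin;
    }

    public static void main(String[] args) {
        String[] splits = {"g", "n", "t"};
        System.out.println(findPartition("apple", splits, 1)); // 0: sorts before "g"
        System.out.println(findPartition("zebra", splits, 1)); // 3: sorts after "t"
        // With 4 sub-bins, "zebra" lands somewhere in partitions 12 to 15.
        System.out.println(findPartition("zebra", splits, 4));
    }
}
```

        Under these assumptions, more sub-bins let several reducers share one range, at the cost of each range's output being spread over several partitions.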