Class FiltersToOperationConverter


  • public class FiltersToOperationConverter
    extends Object
    Converts a given View and array of Spark Filters to an operation that returns data, with as many of the filters as possible converted to Gaffer filters and added to the view. This ensures that as much data as possible is filtered out by the store.
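
    A minimal usage sketch is given below. The Gaffer package paths in the imports, the schema resource, the group name "BasicEdge" and the column name "src" are assumptions made for illustration; only the constructor and getOperation() signatures are taken from this page.

        import org.apache.spark.sql.sources.EqualTo;
        import org.apache.spark.sql.sources.Filter;

        import uk.gov.gchq.gaffer.data.elementdefinition.view.View; // assumed package
        import uk.gov.gchq.gaffer.operation.io.Output;              // assumed package
        import uk.gov.gchq.gaffer.store.schema.Schema;              // assumed package
        // FiltersToOperationConverter itself is assumed to be imported from its Gaffer package.

        public class ConverterExample {
            public static void main(final String[] args) {
                // View limited to the single group of interest (group name assumed).
                final View view = new View.Builder()
                        .edge("BasicEdge")
                        .build();

                // Graph schema, loaded here from a JSON resource (path is illustrative).
                final Schema schema = Schema.fromJson(
                        ConverterExample.class.getResourceAsStream("/schema.json"));

                // Spark filters pushed down from a DataFrame query, e.g. src = "A"
                // ("src" is an assumed column name in the DataFrame schema).
                final Filter[] filters = {new EqualTo("src", "A")};

                // Convert the view and filters into a single Gaffer operation.
                final FiltersToOperationConverter converter =
                        new FiltersToOperationConverter(view, schema, filters);

                // Because the source vertex is fixed by the filter, the result should
                // be a GetRDDOfElements seeded with "A" rather than a
                // GetRDDOfAllElements, with the remaining filters added to the view.
                final Output<?> operation = converter.getOperation();
            }
        }
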
    • Constructor Detail

      • FiltersToOperationConverter

        public FiltersToOperationConverter​(View view,
                                           Schema schema,
                                           org.apache.spark.sql.sources.Filter... filters)
    • Method Detail

      • getOperation

        public Output<org.apache.spark.rdd.RDD<Element>> getOperation()
        Creates an operation to return an RDD in which as much filtering as possible has been carried out by Gaffer in Accumulo's tablet servers before the data is sent to a Spark executor.

        Note that when this is used within an operation to return a DataFrame, Spark will also carry out the filtering itself, so it is not essential for all the filters to be applied here. Nevertheless, as many as possible should be applied to reduce the amount of data sent from the data store to Spark's executors.

        The following logic is used to create an operation and a view which removes as much data as possible as early as possible:
        - If the filters specify that a particular group or groups are required, then the view should only contain those groups.
        - If the filters specify a particular value for the vertex, source or destination, then an operation is created that returns those elements directly (i.e. a GetRDDOfElements operation rather than a GetRDDOfAllElements operation). In this case the view is created to ensure that only entities or only edges are returned, as appropriate.
        - Other filters are converted to Gaffer filters which are applied to the view.

        A sketch illustrating this logic is given after the return description below.

        Returns:
        an operation to return the required data.
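
        As a sketch of the group-restriction case described above (group names, column names and package paths are assumptions; the filters fix no vertex, source or destination value, so the full-scan operation with a narrowed view is expected):

            import org.apache.spark.sql.sources.EqualTo;
            import org.apache.spark.sql.sources.Filter;
            import org.apache.spark.sql.sources.GreaterThan;

            import uk.gov.gchq.gaffer.data.elementdefinition.view.View; // assumed package
            import uk.gov.gchq.gaffer.operation.io.Output;              // assumed package
            import uk.gov.gchq.gaffer.store.schema.Schema;              // assumed package
            // FiltersToOperationConverter itself is assumed to be imported from its Gaffer package.

            public class GroupPushdownExample {
                public static void main(final String[] args) {
                    // View containing several groups (group names assumed).
                    final View view = new View.Builder()
                            .entity("BasicEntity")
                            .edge("BasicEdge")
                            .build();

                    // Graph schema (resource path is illustrative).
                    final Schema schema = Schema.fromJson(
                            GroupPushdownExample.class.getResourceAsStream("/schema.json"));

                    // A group filter plus a property filter ("group" and "count" are
                    // assumed column names). No vertex, source or destination value is
                    // fixed, so a GetRDDOfAllElements is expected; the view should be
                    // narrowed to "BasicEdge" and the count filter converted to a
                    // Gaffer filter on that view.
                    final Filter[] filters = {
                            new EqualTo("group", "BasicEdge"),
                            new GreaterThan("count", 10)
                    };

                    final Output<?> operation =
                            new FiltersToOperationConverter(view, schema, filters)
                                    .getOperation();
                }
            }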