Add extension hook on ExecutionPlan for custom operators to participate in sort requirement pushdown #23276

Description

@zhuqi-lucas

Motivation

sort_pushdown.rs::pushdown_requirement_to_children in EnforceSorting walks down from a SortExec and pushes the sort requirement through built-in operators (UnionExec, FilterExec, projection, etc.). It has no extension point for custom operators.

Custom operators that could benefit from receiving pushed sort requirements currently cannot participate in this walk. One example is OneOfExec in datafusion-contrib/datafusion-materialized-views, which represents a set of materialized view candidates. It needs the requirement stamped as required_input_ordering so that cost-based selection can differentiate MV candidates that natively satisfy the sort from those that would need a sort wrap. Today, downstream projects have to write their own optimizer rule that duplicates most of upstream's logic.

Proposal

Add an opt-in method on ExecutionPlan:

/// If this operator can accept a sort requirement from its parent,
/// return the (possibly mutated) operator and the requirement that
/// should be pushed to its children.
///
/// Returning `None` stops the pushdown at this node (equivalent to
/// today's default behavior for non-built-in operators).
///
/// Default returns `None`.
fn try_accept_sort_requirement(
    &self,
    _requirement: &LexRequirement,
) -> Result<Option<(Arc<dyn ExecutionPlan>, LexRequirement)>> {
    Ok(None)
}

pushdown_requirement_to_children checks this method after the built-in cases and before falling through to a jump. A OneOfExec-style operator returns Some((self.with_required_input_ordering(req), req)) to record the requirement and keep descending.
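
To make the intended control flow concrete, here is a minimal, dependency-free sketch of how an operator might opt in. It is not DataFusion code: `LexRequirement` is modelled as a plain `Vec<String>`, the `ExecutionPlan` trait is reduced to the proposed hook plus a `name` method, the error type is a `String`, and `OpaqueExec`, the toy `OneOfExec`, and `push_through_custom` are hypothetical stand-ins introduced only for illustration.

```rust
use std::sync::Arc;

/// Stand-in for DataFusion's `LexRequirement`: an ordered list of sort columns.
type LexRequirement = Vec<String>;

/// Simplified stand-in for the `ExecutionPlan` trait (hypothetical, not the real API).
trait ExecutionPlan {
    fn name(&self) -> String;

    /// The hook proposed in this issue; the default opts out of the pushdown.
    fn try_accept_sort_requirement(
        &self,
        _requirement: &LexRequirement,
    ) -> Result<Option<(Arc<dyn ExecutionPlan>, LexRequirement)>, String> {
        Ok(None)
    }
}

/// A custom operator that does not override the hook.
struct OpaqueExec;

impl ExecutionPlan for OpaqueExec {
    fn name(&self) -> String {
        "OpaqueExec".to_string()
    }
}

/// Toy `OneOfExec`: it only records the ordering requested by its parent.
struct OneOfExec {
    required_input_ordering: Option<LexRequirement>,
}

impl OneOfExec {
    /// Returns a copy of this node with the requirement recorded on it.
    fn with_required_input_ordering(&self, req: LexRequirement) -> Arc<dyn ExecutionPlan> {
        Arc::new(OneOfExec {
            required_input_ordering: Some(req),
        })
    }
}

impl ExecutionPlan for OneOfExec {
    fn name(&self) -> String {
        match &self.required_input_ordering {
            Some(req) => format!("OneOfExec(required=[{}])", req.join(", ")),
            None => "OneOfExec".to_string(),
        }
    }

    fn try_accept_sort_requirement(
        &self,
        requirement: &LexRequirement,
    ) -> Result<Option<(Arc<dyn ExecutionPlan>, LexRequirement)>, String> {
        // Record the requirement on a copy of the node and forward it unchanged
        // so the walk keeps descending into the children.
        let stamped = self.with_required_input_ordering(requirement.clone());
        Ok(Some((stamped, requirement.clone())))
    }
}

/// Hypothetical stand-in for the step that would run after the built-in cases:
/// ask the operator whether it accepts the requirement.
fn push_through_custom(
    plan: &Arc<dyn ExecutionPlan>,
    requirement: &LexRequirement,
) -> Result<Option<(Arc<dyn ExecutionPlan>, LexRequirement)>, String> {
    plan.try_accept_sort_requirement(requirement)
}
```

In this sketch, calling `push_through_custom` on the toy `OneOfExec` yields a stamped copy of the node together with the requirement to push further down, while `OpaqueExec` yields `None` and the pushdown stops there, matching the default described above.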

Impact

  • Custom operators can join the sort-pushdown walk with ~10 LOC per operator instead of a full custom optimizer rule.
  • Downstream projects can delete their local sort-pushdown rules once they implement the trait method.
  • No behavior change for existing operators: the default implementation returns None, so participation is opt-in.