Add an async API for spill file writing #23247

Description

@alamb

Is your feature request related to a problem or challenge?

Noticed while reviewing this PR from @pantShrey

SpillFile currently has an async-style API for reading spill data:

fn read_stream(&self) -> Result<Pin<Box<dyn Stream<Item = Result<Bytes>> + Send>>>;

but writing spill data is synchronous:

fn open_writer(&self) -> Result<Box<dyn SpillWriter>>;

pub trait SpillWriter: std::io::Write + Send {
    fn finish(&mut self) -> Result<()>;
}

This is somewhat awkward for spill backends that are naturally async, such as remote object stores. For example, when writing spill files to S3, GCS, Azure, or another object_store implementation, uploads are async operations, but the current SpillWriter API requires adapting them to a blocking std::io::Write interface.
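To make the mismatch concrete, here is a rough sketch of the buffering workaround a remote backend might fall back on today. It is an illustration only: `SpillWriter` is re-declared locally with `std::io::Result` instead of DataFusion's `Result`, and `BufferingRemoteWriter`, its `uploaded` field (standing in for the remote store), and `peak_buffered_before_upload` are hypothetical names.

```rust
use std::io::{Result, Write};

/// Local copy of the synchronous trait shape quoted above
/// (assumption: `std::io::Result` replaces DataFusion's `Result`).
pub trait SpillWriter: Write + Send {
    fn finish(&mut self) -> Result<()>;
}

/// Hypothetical remote-backed writer that can only upload once, at `finish`.
pub struct BufferingRemoteWriter {
    buffered: Vec<u8>,
    /// Stand-in for the remote object; set when the "upload" happens.
    uploaded: Option<Vec<u8>>,
}

impl Write for BufferingRemoteWriter {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {
        // Nothing can be sent yet, so every byte is kept in memory.
        self.buffered.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> Result<()> {
        Ok(())
    }
}

impl SpillWriter for BufferingRemoteWriter {
    fn finish(&mut self) -> Result<()> {
        // The whole spill file is "uploaded" in one shot.
        self.uploaded = Some(std::mem::take(&mut self.buffered));
        Ok(())
    }
}

/// Returns (peak bytes held in memory, bytes uploaded).
pub fn peak_buffered_before_upload(chunks: &[Vec<u8>]) -> Result<(usize, usize)> {
    let mut writer = BufferingRemoteWriter { buffered: Vec::new(), uploaded: None };
    let mut peak = 0;
    for chunk in chunks {
        writer.write_all(chunk)?;
        peak = peak.max(writer.buffered.len());
    }
    writer.finish()?;
    Ok((peak, writer.uploaded.map_or(0, |u| u.len())))
}
```

In this sketch the peak buffered size equals the full spill size, which is exactly the memory pressure spilling is meant to relieve.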

I found this while working on an example showing how to write spill files to an object store:

Describe the solution you'd like

Add an async spill writing API so custom spill backends can write data without forcing async storage systems through a synchronous Write abstraction.

For example, DataFusion could introduce an async writer trait like

#[async_trait]
pub trait AsyncSpillWriter: Send {
    async fn write_all(&mut self, bytes: Bytes) -> Result<()>;
    async fn finish(&mut self) -> Result<()>;
}

or use an existing async I/O trait if there is a good fit.
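As a minimal, std-only sketch of how such a trait might be implemented and driven: the version below uses native `async fn` in traits rather than `#[async_trait]`, `Vec<u8>` rather than `Bytes`, and `std::io::Result` rather than DataFusion's `Result`. `InMemorySpillWriter`, `spill_all`, `spill_to_memory`, and the toy `block_on` executor are hypothetical helpers for illustration, not part of DataFusion.

```rust
use std::future::Future;
use std::io::{Error, ErrorKind, Result};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Same shape as the proposed trait, with std-only types (assumption).
#[allow(async_fn_in_trait)]
pub trait AsyncSpillWriter: Send {
    async fn write_all(&mut self, bytes: Vec<u8>) -> Result<()>;
    async fn finish(&mut self) -> Result<()>;
}

/// Hypothetical backend that keeps spilled bytes in memory.
#[derive(Default)]
pub struct InMemorySpillWriter {
    buf: Vec<u8>,
    finished: bool,
}

impl InMemorySpillWriter {
    pub fn into_inner(self) -> Vec<u8> {
        self.buf
    }
}

impl AsyncSpillWriter for InMemorySpillWriter {
    async fn write_all(&mut self, bytes: Vec<u8>) -> Result<()> {
        if self.finished {
            return Err(Error::new(ErrorKind::Other, "write after finish"));
        }
        self.buf.extend_from_slice(&bytes);
        Ok(())
    }

    async fn finish(&mut self) -> Result<()> {
        self.finished = true;
        Ok(())
    }
}

/// Writes every batch, then finishes; generic over any backend.
pub async fn spill_all<W: AsyncSpillWriter>(writer: &mut W, batches: Vec<Vec<u8>>) -> Result<()> {
    for batch in batches {
        writer.write_all(batch).await?;
    }
    writer.finish().await
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop until the future is ready.
/// Only suitable for demos; a real caller would use its async runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

pub fn spill_to_memory(batches: Vec<Vec<u8>>) -> Result<Vec<u8>> {
    let mut writer = InMemorySpillWriter::default();
    block_on(spill_all(&mut writer, batches))?;
    Ok(writer.into_inner())
}
```

Taking an owned buffer in `write_all` lets a backend hand the bytes to an upload without copying, which is presumably why the proposal uses `Bytes`.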

Suggested steps:

  • Add an async spill writer abstraction to datafusion-execution
  • Add an async open-writer method to SpillFile
  • Update spill writing paths to use the async API where possible
  • Keep compatibility for existing local-file spill implementations, likely by adapting local file writes to the async API
  • Update the object store spill example in Add ObjectStore-backed TempFileFactor / spill example #23170 to use the new async API
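The compatibility step for local files could look roughly like the adapter below, which wraps any `std::io::Write` behind the async interface. This is a sketch under assumptions: the trait is re-declared with std-only types, and `SyncWriteAdapter`, `spill_via_adapter`, and the toy `block_on` executor are hypothetical names. The blocking write is performed inline here; a real implementation might prefer to offload it to a blocking thread pool.

```rust
use std::future::Future;
use std::io::{Result, Write};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Std-only stand-in for the proposed async trait (assumption).
#[allow(async_fn_in_trait)]
pub trait AsyncSpillWriter: Send {
    async fn write_all(&mut self, bytes: Vec<u8>) -> Result<()>;
    async fn finish(&mut self) -> Result<()>;
}

/// Hypothetical adapter: exposes a synchronous writer (for example a
/// local `File`) through the async interface.
pub struct SyncWriteAdapter<W: Write + Send> {
    inner: W,
}

impl<W: Write + Send> SyncWriteAdapter<W> {
    pub fn new(inner: W) -> Self {
        Self { inner }
    }

    pub fn into_inner(self) -> W {
        self.inner
    }
}

impl<W: Write + Send> AsyncSpillWriter for SyncWriteAdapter<W> {
    async fn write_all(&mut self, bytes: Vec<u8>) -> Result<()> {
        // Blocking call made inline; completes before the future resolves.
        self.inner.write_all(&bytes)
    }

    async fn finish(&mut self) -> Result<()> {
        self.inner.flush()
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for the demo only: polls until the future is ready.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

async fn write_chunks<W: AsyncSpillWriter>(writer: &mut W, chunks: Vec<Vec<u8>>) -> Result<()> {
    for chunk in chunks {
        writer.write_all(chunk).await?;
    }
    writer.finish().await
}

/// Demo: a `Vec<u8>` plays the role of the local spill file.
pub fn spill_via_adapter(chunks: Vec<Vec<u8>>) -> Result<Vec<u8>> {
    let mut writer = SyncWriteAdapter::new(Vec::new());
    block_on(write_chunks(&mut writer, chunks))?;
    Ok(writer.into_inner())
}
```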

Acceptance criteria:

  • Remote/object-store spill implementations do not need to buffer all spill bytes in memory just to bridge from synchronous Write to async upload APIs
  • Existing local disk spilling continues to work
  • The public API change is documented in the upgrading guide if needed
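The first criterion can be illustrated with a writer that uploads fixed-size parts as soon as enough bytes are available, in the spirit of a multipart upload. This is a hedged sketch, not a proposed implementation: `ChunkedUploadWriter`, `upload_part`, `upload_in_parts`, and the toy `block_on` are hypothetical, and the "remote store" is just a `Vec` of parts.

```rust
use std::future::Future;
use std::io::Result;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical writer that buffers at most one part plus one incoming write.
pub struct ChunkedUploadWriter {
    part_size: usize,
    pending: Vec<u8>,
    /// Stand-in for the remote store: each entry is one uploaded part.
    uploaded_parts: Vec<Vec<u8>>,
    peak_buffered: usize,
}

impl ChunkedUploadWriter {
    pub fn new(part_size: usize) -> Self {
        assert!(part_size > 0, "part_size must be non-zero");
        Self { part_size, pending: Vec::new(), uploaded_parts: Vec::new(), peak_buffered: 0 }
    }

    /// Placeholder for an async upload call to a remote store.
    async fn upload_part(&mut self, part: Vec<u8>) -> Result<()> {
        self.uploaded_parts.push(part);
        Ok(())
    }

    pub async fn write_all(&mut self, bytes: Vec<u8>) -> Result<()> {
        self.pending.extend_from_slice(&bytes);
        self.peak_buffered = self.peak_buffered.max(self.pending.len());
        // Drain full parts immediately so memory use stays bounded.
        while self.pending.len() >= self.part_size {
            let rest = self.pending.split_off(self.part_size);
            let part = std::mem::replace(&mut self.pending, rest);
            self.upload_part(part).await?;
        }
        Ok(())
    }

    pub async fn finish(&mut self) -> Result<()> {
        // Upload the final, possibly short, part.
        if !self.pending.is_empty() {
            let part = std::mem::take(&mut self.pending);
            self.upload_part(part).await?;
        }
        Ok(())
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for the demo only: polls until the future is ready.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

/// Returns (uploaded parts, peak bytes buffered in memory).
pub fn upload_in_parts(part_size: usize, writes: Vec<Vec<u8>>) -> Result<(Vec<Vec<u8>>, usize)> {
    let mut writer = ChunkedUploadWriter::new(part_size);
    block_on(async {
        for w in writes {
            writer.write_all(w).await?;
        }
        writer.finish().await
    })?;
    Ok((writer.uploaded_parts, writer.peak_buffered))
}
```

Here the peak buffered size is bounded by roughly one part plus one incoming write, independent of the total spill size.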

Describe alternatives you've considered

Keep the current synchronous SpillWriter API. This works well for local files, but makes remote spill backends harder to implement efficiently because they must either block on async uploads or buffer data and upload it later.

Add only an object-store-specific spill implementation. That would help one backend, but the underlying mismatch is in the spill file abstraction itself.

Additional context

Related PRs / comments:

Labels: enhancement
