Search before asking
Description
LogScanner.to_arrow_batch_reader() currently returns a synchronous pyarrow.RecordBatchReader that blocks the calling thread on each __next__() call. This may be acceptable for Arrow interop (e.g., feeding into DuckDB or Polars), but it is not suitable for asyncio-native Python code, where a blocking call stalls the event loop.
Ideally we should add an async counterpart, e.g., async for batch in scanner.read_batches(), which yields RecordBatch objects without blocking the event loop.
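
For illustration, here is a minimal sketch of what callers can do today as a workaround: wrap the synchronous reader in an async generator that runs each blocking next() call in a worker thread. The helper names aiter_batches and collect are hypothetical and not part of the current API, and this is not a proposal for how the native implementation should work.

```python
import asyncio
from typing import AsyncIterator, Iterator, TypeVar

T = TypeVar("T")

# Sentinel returned by next() once the synchronous iterator is exhausted.
_DONE = object()


async def aiter_batches(reader: Iterator[T]) -> AsyncIterator[T]:
    """Hypothetical helper: yield items from a blocking iterator
    without blocking the event loop."""
    it = iter(reader)
    while True:
        # next() runs in a worker thread (Python 3.9+), so the event loop
        # stays free to run other tasks while the batch is being fetched.
        item = await asyncio.to_thread(next, it, _DONE)
        if item is _DONE:
            return
        yield item


async def collect(reader: Iterator[T]) -> list:
    """Hypothetical helper: drain the async generator into a list."""
    return [batch async for batch in aiter_batches(reader)]
```

With the existing API this would presumably be used as async for batch in aiter_batches(scanner.to_arrow_batch_reader()). It still occupies one thread-pool worker per pending batch, which is part of why a native read_batches() would be preferable.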
Willingness to contribute