Bug: Replicas low on memory on workerpool0 during data ingestion pipeline #43

Description

@GaetanSoulas

### What happened?

Problem Encountered

When running the ingestion command:

agents-cli data-ingestion --remote

Several errors occurred on the Agent Platform pipelines with the following message:
Replicas low on memory: workerpool0. Specify a machine with larger memory and try again.

Description

The workerpool0 replicas are running out of memory specifically during the process-data step of the data ingestion task, causing pipeline failures.

Proposed Solution

Allow users to specify a machine type with more memory directly via a new CLI option in agents-cli data-ingestion.

Example:

agents-cli data-ingestion --remote --machine-type <machine_type>
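As a rough illustration of the proposal, the option could be parsed and forwarded to the worker pool configuration along the following lines. This is a minimal sketch, not the actual agents-cli implementation: the default machine type, the `worker_pool_spec` helper, and the field names in the returned dictionary are all assumptions made for illustration.

```python
import argparse

# Hypothetical default; the default actually used by agents-cli is not stated in this issue.
DEFAULT_MACHINE_TYPE = "e2-standard-4"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the proposed command-line surface."""
    parser = argparse.ArgumentParser(prog="agents-cli data-ingestion")
    parser.add_argument("--remote", action="store_true",
                        help="Run the ingestion pipeline remotely.")
    parser.add_argument("--machine-type", default=DEFAULT_MACHINE_TYPE,
                        help="Machine type for the worker pool (proposed option).")
    return parser


def worker_pool_spec(machine_type: str) -> dict:
    """Return an illustrative worker pool spec; the field names are assumptions."""
    return {"machine_spec": {"machine_type": machine_type}, "replica_count": 1}


# Example: request a high-memory machine for the failing step.
args = build_parser().parse_args(["--remote", "--machine-type", "e2-highmem-8"])
spec = worker_pool_spec(args.machine_type)
```

Keeping a default means existing invocations without the new option would behave as before, while users who hit the memory error can opt in to a larger machine.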

### Steps to Reproduce

1. Initiate the data ingestion pipeline with agents-cli data-ingestion --remote.
2. Ingest a large GCS bucket of files.
3. Monitor the pipeline progress and observe the failure during the process-data step.
4. The pipeline fails with the error: Replicas low on memory: workerpool0. Specify a machine with larger memory and try again.
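To confirm the failure when reproducing, saved pipeline logs can be scanned for the error message. The snippet below is only a sketch: the function name is hypothetical, and it assumes the logs are available as plain text containing the message exactly as quoted above.

```python
import re
from typing import Optional

# Matches the error text quoted in this issue and captures the worker pool name.
LOW_MEMORY_PATTERN = re.compile(r"Replicas low on memory: (\w+)\.")


def find_low_memory_pool(log_text: str) -> Optional[str]:
    """Return the worker pool named in the low-memory error, or None if absent."""
    match = LOW_MEMORY_PATTERN.search(log_text)
    return match.group(1) if match else None
```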

### What did you expect to happen?

- No issues during pipeline execution


### Client information

```Bash
agents-cli info

CLI version: 0.5.1
OS info: Linux-6.18.33.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Installed skills: none
```

### Command Output / Logs

Image

### Anything else we need to know?

No response
