31 commits
7818ddf
Initial DAG for LINC+delay
tikk3r May 23, 2026
da28933
Add download step
tikk3r May 28, 2026
e3c5ae1
Fix small errors
tikk3r May 28, 2026
e9612d6
delay cal, rundir and outdir
tikk3r May 31, 2026
5bd78a3
dd cal dag entry
tikk3r Jun 5, 2026
9910c17
Tweak dd cal
tikk3r Jun 5, 2026
834d888
Return field from linc target validation
tikk3r Jun 9, 2026
0b698a5
Get dd status column
tikk3r Jun 9, 2026
125de55
Fix target retrieval in ddcal
tikk3r Jun 9, 2026
abd97e8
Set whole field to finished once ddcal finishes successfully
tikk3r Jun 9, 2026
245b360
One target list per field
tikk3r Jun 9, 2026
137c456
Sort priority descending
tikk3r Jun 9, 2026
f1966ca
Empty explicit paths
tikk3r Jun 10, 2026
cc5421b
Fix delay sols search path
tikk3r Jun 10, 2026
b72eec1
Fix suffix search
tikk3r Jun 10, 2026
ee81117
Tweak most recent dir
tikk3r Jun 10, 2026
e36d327
Fix target search path in delay
tikk3r Jun 15, 2026
28037f2
Fix suffix for delay calibration
tikk3r Jun 15, 2026
16781d2
More power: restarts and bad nodes
tikk3r Jun 15, 2026
2135952
Remove outdated LINC dag
tikk3r Jun 15, 2026
e9741f4
Attempt to retrieve rundir on the fly
tikk3r Jun 16, 2026
0644f6e
Small updates
tikk3r Jun 16, 2026
74dd7ef
Remove explicit paths
tikk3r Jun 18, 2026
b761683
Fix dd cal and update success check
tikk3r Jun 21, 2026
f59f56f
Add some documentation
tikk3r Jun 21, 2026
062e8bc
Implement calibrator2 and homogenise success triggers
tikk3r Jun 21, 2026
3e181ca
Fix adding a field
tikk3r Jun 24, 2026
f99d192
Draft widefield dag
tikk3r Jun 24, 2026
52a36d4
Update readme
tikk3r Jun 24, 2026
354b154
Expand readme with expected folder setup
tikk3r Jun 24, 2026
e50a28d
Allow manual approval of delay solutions for widefield
tikk3r Jun 24, 2026
29 changes: 29 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# End-to-end processing of ILT HBA data with flocs

This package aims to provide relatively simple end-to-end automatic processing of ILT HBA data. Where `flocs-runners` provides the interface for running pipelines, `flocs-processing` is the scaffolding that ties them together. Data reduction is coordinated via a dedicated SQLite database that holds information on which observations to process, which pipelines to run for them, and all of the related statuses. Orchestration of all the pipelines is handled by Airflow through a DAG.

## Folder setup
`flocs-processing` requires three folders to be set up:

* A processing folder -- this is where data is stored while processing
* A data folder -- this is where the input data is found
* An output folder -- this is where finished pipeline outputs are copied to, and where later steps that depend on them look for those outputs

The expected directory structure for input data is `<data folder>/<field name>/{calibrator,target}`. Inside the calibrator and target folders, the observations should follow the usual `LXXXXXX` naming scheme. These **must** match the SAS IDs in the database for flocs to be able to find them.
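
As a rough illustration of the layout described above, the following sketch creates the calibrator and target folders for one field and lists any observation folders found beneath them. The `DATA_DIR` and `FIELD` variables, the `example_field` name and the `list_observations` helper are assumptions made for this example; they are not part of `flocs-processing`.

```shell
# Hypothetical locations; replace with your own data folder and field name.
DATA_DIR="${DATA_DIR:-$PWD/data}"
FIELD="${FIELD:-example_field}"

# Create <data folder>/<field name>/{calibrator,target}.
mkdir -p "${DATA_DIR}/${FIELD}/calibrator" "${DATA_DIR}/${FIELD}/target"

# List observation folders named L<digits> that sit exactly at
# <data folder>/<field name>/<calibrator|target>/LXXXXXX.
list_observations() {
    find "$1" -mindepth 3 -maxdepth 3 -type d -name 'L[0-9]*' | sort
}

list_observations "${DATA_DIR}"
```

Comparing this listing against the SAS IDs stored in the database is a quick way to spot folders that flocs would not be able to find.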

## Database setup
A database for processing is created via `flocs-processing create-database`. This will create an empty database with the necessary columns. Datasets to process can be added via `flocs-processing add-field`.

## Processing data
To start processing data, Airflow needs to be running. This will be delegated to `flocs-processing process-from-database` in the future, but for now requires running Airflow manually. For setup do the following:

1. Install Airflow: `uv pip install apache-airflow`
2. Set up a folder that will contain all of Airflow's own files and assign it to the `AIRFLOW_HOME` environment variable.
3. Run `airflow config list --defaults > "${AIRFLOW_HOME}/airflow.cfg"`
4. Define `AIRFLOW__CORE__DAGS_FOLDER` as `${AIRFLOW_HOME}/dags` and create the folder. Copy the DAGs inside `flocs_processing/dags` to this folder.
5. Define `AIRFLOW__CORE__LOAD_EXAMPLES` as `False`
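
Steps 2 to 5 can be collected into a small shell snippet, sketched here under a few assumptions: the default `AIRFLOW_HOME` location and the `FLOCS_REPO` variable (the path to a checkout of this repository) are placeholders chosen for the example, and the Airflow-dependent step is skipped if the `airflow` command is not yet installed.

```shell
# Step 1 (run once in your Python environment): uv pip install apache-airflow

# Step 2: folder for Airflow's own files (assumed default; any writable path works).
export AIRFLOW_HOME="${AIRFLOW_HOME:-$PWD/airflow_home}"

# Steps 4 and 5: environment overrides for the DAG folder and the example DAGs.
export AIRFLOW__CORE__DAGS_FOLDER="${AIRFLOW_HOME}/dags"
export AIRFLOW__CORE__LOAD_EXAMPLES=False
mkdir -p "${AIRFLOW__CORE__DAGS_FOLDER}"

# Step 3: write the default configuration, only if Airflow is available.
if command -v airflow >/dev/null 2>&1; then
    airflow config list --defaults > "${AIRFLOW_HOME}/airflow.cfg"
fi

# Step 4 (continued): copy the flocs DAGs. FLOCS_REPO is a hypothetical
# variable pointing at a checkout of this repository.
FLOCS_REPO="${FLOCS_REPO:-$PWD}"
if [ -d "${FLOCS_REPO}/flocs_processing/dags" ]; then
    cp -r "${FLOCS_REPO}/flocs_processing/dags/." "${AIRFLOW__CORE__DAGS_FOLDER}/"
fi
```

The exports only last for the current shell session, so they need to be set again (or placed in a file you source) in the screen or tmux session where Airflow is started.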

Next, start a screen or tmux session and run `airflow standalone` to start the Airflow instance. It will echo a user name and password on first start, and will also store the credentials in `${AIRFLOW_HOME}/simple_auth_manager_passwords.json.generated`. The Airflow instance will start on port 8080. You can access it via `localhost:8080` in your browser. If it is running on a remote cluster, you can set up a tunnel via e.g. `ssh -N -L 8080:localhost:8080 <remote>` to forward it to your local machine.

Once `flocs-processing` is complete, the processing loop will be automatic, but for now the user must trigger the DAG manually. On the "Dags" tab you should now see the available flocs DAGs. To trigger one manually, click on its name and, on the subsequent page, use the "Trigger" button in the top right.
