Before changing the codebase, start with the root agent/developer contract and the active directive:
- AGENTS.md
- foundation_progress/FUNGMOD_CENTRAL_GOAL_VIRTUAL_EXPERIMENTS.md
- foundation_progress/FUNGMOD_NEXT_PHASES_ROADMAP.md
Historical foundation-first plans are archived under old_progress/. They are
useful context only and are non-binding. The active goal is now the
virtual-experiment engine: simulate degradation dynamics over time without
hiding assumptions, uncertainty, provenance, missing inputs, or unsupported
biology.
FungMod is a scientific Python codebase for building a physically grounded fungal- and enzyme-mediated substrate-degradation virtual-experiment engine. The long-term target is a modular API that can simulate a fungus or enzyme source, substrate, environment, and parameter set without inventing biological facts.
This repository currently implements the validated foundation plus the first basic kinetics layer:
- unit-aware parameters and parameter sets,
- explicit assumptions and simulation records,
- a generic deterministic ODE reaction engine,
- non-negativity, mass-balance, and limiting-case validation helpers,
- homogeneous dissolved-substrate Michaelis-Menten rate laws,
- PET substrate metadata with explicit unknown physical parameters,
- a minimal heterogeneous PET surface-hydrolysis rate law,
- Arrhenius temperature scaling with validity-range warnings,
- Gaussian pH activity scaling with validity-range warnings,
- minimal fungal metadata, enzyme secretion, enzyme decay, maintenance, and product-coupled biomass growth,
- stoichiometric and thermodynamic metadata interfaces,
- carbon conservation, oxygen limitation, and biomass-yield validation checks,
- explicit reaction-quotient Gibbs feasibility and entropy-production diagnostics for caller-supplied dimensionless Q and entropy-rate metadata,
- configured thermodynamic JSON/CSV summary outputs for explicit Gibbs/entropy validation diagnostics,
- 1D finite-volume reaction-diffusion with explicit boundary conditions,
- universal substrate metadata interfaces with PET, cellulose, lignin, starch, and chitin substrate classes,
- least-squares calibration utilities with train/validation residual reporting,
- Monte Carlo uncertainty propagation and local sensitivity analysis,
- process-centered assembly scaffolding with structured missing-process, missing-parameter, and incompatible-unit reports,
- a standardized result/export object that writes reports, CSV tables, logs, and figures,
- generic homogeneous process classes for first-order, mass-action, and Michaelis-Menten benchmark models,
- generic surface adsorption/catalysis process components that can run with PET or a dummy non-PET substrate,
- explicit Environment, Geometry, and Enzyme entities for process-centered assembly,
- environment-driven temperature, pH, water-activity, oxygen, and product inhibition modifiers,
- compatibility checks for enzyme/substrate/bond/fungus pairings during model assembly,
- a top-level notebook set that imports package code rather than redefining core model logic,
- a researcher-facing product-tour notebook for the public virtual_experiment(...) API and standard output tables,
- a researcher-facing screen-comparison notebook that writes report artifacts and inspects comparison_summary.csv guardrails without ranking metadata-only environment grids,
- human-editable YAML config folders for fungi, substrates, enzymes, environments, geometries, parameters, and experiments,
- schema-checked config loaders with explicit unknown-value handling,
- an optional PET plugin convenience helper that delegates to the generic configured workflow with an explicit plugin registry,
- software-test benchmark configs that are explicitly non-scientific;
- a registry-backed exploratory virtual-experiment API for Reaction 618 and the controlled BIO-001 surface-degradation pilot;
- schema-versioned virtual-experiment output tables with provenance, mechanism-summary, limitations, missing-parameter, suggested-experiment, and range-use fields;
- mechanism summaries that can expose active configured rate modifiers such as explicit reversible product inhibition when registry-backed case templates provide product-state and positive unit-compatible K_i records;
- an offline-first SABIO-RK source adapter that loads frozen kinetic-law snapshots and writes review-only proposed records without mutating the simulation registry;
- a registry-backed BIO-002 extracellular enzyme-chain assembler whose stoichiometry, conserved quantities, entities, and output labels come from template data rather than mechanism-code biological names;
- a scoped CASE-001 researcher-facing path that runs the existing BIO-002 cellulose-equivalent enzyme-chain virtual experiment from names and aliases through the top-level virtual_experiment(...) API.
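
Several of the kinetics layers listed above are simple closed-form rate laws. As a rough illustration only, and not the package's own implementation, the sketch below combines a homogeneous Michaelis-Menten rate with an Arrhenius temperature factor and a Gaussian pH factor. Every function and parameter name is hypothetical, the numbers are unsourced placeholders, units are omitted for brevity (the package itself carries units), and the reference-temperature form of the Arrhenius factor is an assumption.

```python
import math
import warnings

R_GAS = 8.314462618  # J / (mol K), molar gas constant


def michaelis_menten_rate(v_max, k_m, substrate):
    """Homogeneous dissolved-substrate rate: v = v_max * S / (K_m + S)."""
    if substrate < 0 or k_m <= 0:
        raise ValueError("substrate must be >= 0 and k_m must be > 0")
    return v_max * substrate / (k_m + substrate)


def arrhenius_factor(temp_k, ref_temp_k, activation_energy, valid_range_k):
    """Rate multiplier relative to ref_temp_k; warns outside the validity range."""
    low, high = valid_range_k
    if not low <= temp_k <= high:
        warnings.warn(f"temperature {temp_k} K is outside [{low}, {high}] K")
    return math.exp(-activation_energy / R_GAS * (1.0 / temp_k - 1.0 / ref_temp_k))


def gaussian_ph_factor(ph, ph_opt, ph_width):
    """Empirical activity profile: 1 at ph_opt, decaying away from it."""
    return math.exp(-((ph - ph_opt) ** 2) / (2.0 * ph_width ** 2))


# Placeholder numbers for illustration only (hypothetical, unsourced).
base = michaelis_menten_rate(v_max=2.0, k_m=0.5, substrate=0.5)
rate = (
    base
    * arrhenius_factor(308.15, 298.15, 50_000.0, (283.15, 323.15))
    * gaussian_ph_factor(ph=5.0, ph_opt=5.0, ph_width=1.0)
)
```

With the substrate concentration equal to K_m the base rate is half of v_max, and at the optimum pH the Gaussian factor is exactly 1, so the only scaling left in this example comes from the temperature term.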
It does not yet implement full thermodynamic flux analysis, resolved intracellular
metabolism, 2D/3D spatial models, publication-grade calibration against
curated biological datasets, or global uncertainty analysis. Those stages are
documented in progress.md and should be added only after the current virtual
experiment layer has tests, provenance, and validation.
The model is designed to fail honestly. Physical quantities carry units. Parameters require provenance before a scientific simulation can run, unless a test explicitly sets allow_unsourced_for_testing=True. Missing values are represented as missing values rather than guessed numbers. Validation failures are returned as results, not hidden.
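
The fail-honestly rules above can be pictured with a small, self-contained sketch. It is not the package's Parameter class: SketchParameter, MissingProvenanceError, and check_parameters are hypothetical names, and only the allow_unsourced_for_testing flag is taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


class MissingProvenanceError(ValueError):
    """Raised when a parameter has no source in a scientific run."""


@dataclass(frozen=True)
class SketchParameter:
    name: str
    value: Optional[float]  # None means "explicitly unknown", never a guess
    units: str
    source: Optional[str] = None


def check_parameters(params, allow_unsourced_for_testing=False):
    """Return problem reports as data instead of hiding them.

    Unsourced parameters raise unless the testing escape hatch is set;
    missing values are reported, not replaced with defaults.
    """
    problems = []
    for p in params:
        if p.source is None and not allow_unsourced_for_testing:
            raise MissingProvenanceError(f"{p.name} has no provenance")
        if p.value is None:
            problems.append(f"{p.name}: value is unknown ({p.units})")
    return problems


# Hypothetical records; the source strings are placeholders, not citations.
params = [
    SketchParameter("k_cat", 1.2, "1/s", source="placeholder source"),
    SketchParameter("crystallinity", None, "dimensionless", source="placeholder source"),
]
report = check_parameters(params)
```

The unknown value survives as a reported gap, while a parameter without a source stops the run unless the test-only flag is passed.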
Biology may be added only when the mechanism is explicitly implemented, provenance-backed, maturity-labelled, covered by tests, and honest about assumptions and limitations. Unsupported, invented, silently guessed, or falsely validated biology is forbidden.

    python3 -m pip install -e ".[dev]"
    pytest

CI is required before merging. The CI workflow installs .[dev] and runs the
package-quality gates below:

    python -m ruff check src tests
    python -m pyright --pythonpath "$(python -c 'import sys; print(sys.executable)')"
    python -m pytest --cov=fungal_model --cov-report=term-missing --cov-report=xml

The current Pyright gate resolves imports from the active Python interpreter
and enables the main argument, assignment, return, operator, call, attribute,
and type-form diagnostics. Optional member access remains the active typing
ratchet documented in ARCHITECTURE_DEBT.md.
Coverage currently has an 80% minimum gate.
Branch protection expectations are documented in .github/BRANCH_PROTECTION.md.
The protected default branch should require pull requests, passing CI,
up-to-date branches, no force pushes, and no unaudited direct bypass.

    from fungal_model import VirtualExperiment

    study = VirtualExperiment.from_registry(
        fungi=["sabiork_beta_glucosidase_source"],
        substrates=["cellobiose"],
        environments=["sabiork_reaction_618_selected_conditions"],
    )
    result = study.simulate(mode="exploratory", n_samples=128)

Blocked or scientific-mode cases can be inspected without running a model:

    study.write_preflight_report(mode="scientific")

The standard output folder includes long-format time series, final states,
final metrics, threshold times, sampled parameters, summary metrics,
guarded screen-comparison summaries, modelability item reports, assumption
summaries, mechanism summaries, provenance, limitations, missing-parameter and
suggested-experiment tables, and a versioned data dictionary/schema.
Preflight tables include machine-readable simulation policy columns such as
simulation_allowed_for_mode, blocking_reason, and
recommended_next_action.
Exploratory priors remain allowed, but the tables mark them as assumptions
rather than literature-curated values.
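
As a hedged illustration of how the policy columns might be consumed, the sketch below filters a preflight-style table with the standard csv module. Only the three policy column names come from the text above; the case_id column, the true/false cell format, and every row are hypothetical placeholders, not real FungMod output.

```python
import csv
import io

# Hypothetical preflight rows. Only the three policy column names are taken
# from the README; "case_id" and all cell values are assumptions.
PREFLIGHT_CSV = """case_id,simulation_allowed_for_mode,blocking_reason,recommended_next_action
case_a,true,,none
case_b,false,missing sourced parameter,add a provenance-backed record
"""


def blocked_cases(csv_text):
    """Return (case_id, reason, next action) for rows that may not be simulated."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["case_id"], row["blocking_reason"], row["recommended_next_action"])
        for row in rows
        if row["simulation_allowed_for_mode"].strip().lower() != "true"
    ]


blocked = blocked_cases(PREFLIGHT_CSV)
```

Reading the policy columns directly keeps the decision machine-checkable: a blocked case carries its own reason and suggested next step.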
comparison_summary.csv indexes existing final-metric and threshold rows for
researcher-facing side-by-side inspection while preserving standard
environment comparison/ranking guardrails; metadata-only runtime environment
grids remain explicitly blocked from ranking or response-plot interpretation.
Results can also render a deterministic Markdown report from those standard
tables without adding validation or calibration claims. Optional HTML artifacts
can be written beside the Markdown report for browser viewing: an HTML sidecar
over the same report and an index page that links existing report, table,
manifest, and quicklook files without reinterpreting scientific values:

    result.write_report("outputs/report/", include_html=True, include_index=True)

The scoped CASE-001 BIO-002 chain can also be run from researcher-facing names and a runtime environment grid:

    from fungal_model import environment_grid, virtual_experiment

    study = virtual_experiment(
        fungi="generic cellulase source",
        substrates="cellulose film",
        environments=environment_grid(
            temperature_C=[25, 30, 35],
            ph=[4.5, 5.0, 5.5],
            oxygen="aerobic",
        ),
    )
    result = study.simulate(mode="exploratory", n_samples=1)

This CASE-001 path is exploratory and enzyme-chain/cellulose-equivalent only. It is not whole-fungus growth, secretion, uptake, biomass, PET, lignin, full lignocellulose, organism-specific physiology, or empirical validation. Runtime environment-grid values are metadata unless an explicit response law or condition-specific parameter record is active.
For a complete public-API walkthrough of guarded screen comparisons, report
artifacts, and metadata-only environment-grid limitations, see
notebooks/examples/13_screen_comparison_summary_example.ipynb.
The stable API is intentionally generic-first. These names are supported from
top-level fungal_model for researcher-facing virtual experiments, config
loading, model assembly, execution, and result inspection:
- VirtualExperiment
- virtual_experiment
- EnvironmentGrid
- environment_grid
- EnvironmentCase
- DegradationScreenResult
- VirtualExperimentError
- run_configured_model
- load_model_config
- load_substrate
- load_geometry
- load_product_map
- load_parameter_set
- ModelBuilder
- AssembledModel
- ProcessLibrary
- ProcessRegistry
- ProcessODESolver
- RunRequest
- SimulationResult
- Parameter
- ParameterSet
The same virtual-experiment names are also available from fungal_model.api.
PET-specific convenience helpers are not part of the top-level public API.
They live under fungal_model.plugins.pet, where plugin users can explicitly
import pet_substrate_loader_registry, PETSurfaceWorkflowConfig, and
run_pet_surface_integration.
Former process.as_reaction() users should now choose one explicit path:
build process-centered models through ModelBuilder, AssembledModel.run(),
or run_configured_model; or construct a low-level
fungal_model.chemistry.reactions.Reaction directly when using
SimulationEngine or ReactionDiffusionEngine1D. Concrete Process classes
no longer provide a process-to-Reaction adapter bridge.
The notebooks/examples/ folder contains software-test notebooks for
configured workflow plumbing and configured thermodynamic-output inspection,
plus one researcher-facing exploratory product tour. The product tour is not
empirical validation:
- 00_quickstart.ipynb
- 01_config_entity_inspection.ipynb
- 02_failure_report.ipynb
- 03_configured_outputs.ipynb
- 10_virtual_experiment_product_tour.ipynb
- 11_thermodynamics_entropy_diagnostics.ipynb
- 12_reversible_product_inhibition_example.ipynb
- 13_screen_comparison_summary_example.ipynb
Notebook tests check that notebooks import fungal_model, avoid defining core
rate laws/classes or low-level solvers inline, and execute every foundation,
product-tour, and thermodynamic-output smoke path. The thermodynamics notebook
uses configured explicit-Q Gibbs and entropy-production-rate metadata only; it
does not infer activities, reaction quotients, concentrations, or redox
potentials, and it does not enforce thermodynamic constraints at solver time.
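
For orientation, an explicit-Q Gibbs diagnostic can be as small as the textbook relation ΔG = ΔG° + RT ln Q. The sketch below is a generic illustration under that assumption, with hypothetical function names and placeholder inputs; it does not reproduce the package's diagnostic interface.

```python
import math

R_GAS = 8.314462618  # J / (mol K), molar gas constant


def reaction_gibbs_energy(delta_g_standard, temp_k, reaction_quotient):
    """Delta G = Delta G_standard + R T ln(Q), in J/mol, for a caller-supplied Q."""
    if temp_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    if reaction_quotient <= 0:
        raise ValueError("Q must be a positive dimensionless number")
    return delta_g_standard + R_GAS * temp_k * math.log(reaction_quotient)


def forward_feasible(delta_g):
    """A negative Delta G means the forward direction is thermodynamically allowed."""
    return delta_g < 0.0


# Placeholder inputs (hypothetical): Delta G_standard = -10 kJ/mol, Q = 1.
dg = reaction_gibbs_energy(-10_000.0, 298.15, 1.0)
```

Because Q is passed in explicitly, nothing here infers activities or concentrations; a caller who cannot justify Q simply cannot run the diagnostic.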
The product-inhibition notebook demonstrates the generic reversible
1 / (1 + P / K_i) modifier through the public virtual-experiment API and
standard output tables with an explicit exploratory example K_i; it is not
validation, calibration, toxicity, uptake, secretion, biomass, whole-fungus
physiology, or multi-product inhibition evidence.
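
The modifier itself is a one-line expression. A minimal sketch follows, assuming P and K_i share units and using a hypothetical function name rather than the package's own:

```python
def product_inhibition_factor(product, k_i):
    """Reversible product-inhibition multiplier 1 / (1 + P / K_i).

    P and K_i must share units; K_i must be positive.
    """
    if k_i <= 0:
        raise ValueError("K_i must be positive")
    if product < 0:
        raise ValueError("product concentration must be non-negative")
    return 1.0 / (1.0 + product / k_i)


# With P equal to K_i the uninhibited rate is halved.
half = product_inhibition_factor(product=2.0, k_i=2.0)
```

The factor is 1 when no product is present and falls toward 0 as P grows, which is why a positive, unit-compatible K_i record is needed before the modifier can be activated.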
Top-level YAML configs live under data/model_configs/, data/fungi/,
data/substrates/, data/enzymes/, data/environments/, data/geometries/,
data/parameters/, and data/experiments/. Loaders are exposed from
fungal_model as load_fungus, load_substrate, load_enzyme,
load_environment, load_geometry, and load_parameter_set.
Toy and synthetic assets in data/ are software-test or example fixtures, not
scientific records. They remain available for tests and configured-workflow
examples, but researcher-facing work should start from the registry-backed
virtual-experiment API and inspect table provenance before interpreting any
output.
Internal software-test model-config shells include:
- data/model_configs/toy_homogeneous_ab.yml
- data/model_configs/toy_surface_pet_plugin.yml
- data/model_configs/toy_surface_dummy_non_pet.yml
All three load through load_model_config. They are framework benchmarks, not
scientific biology.
Product maps live under data/product_maps/ and are loaded through
load_product_map. They carry configured state names and benchmark maturity
metadata, so product release mappings do not have to be embedded in process
code or a substrate-specific workflow.
Source discovery is intentionally separate from simulation. SabioRKSource
loads frozen SABIO-RK kinetic-law snapshots by default and can refresh only
through an explicit live-fetch hook. Proposed product maps, parameter records,
and process-compatibility records are written for human review under a proposal
bundle; they are not silently committed into the simulation registry. Use
scripts/fetch_sabiork_kinlaw_entries.py to freeze raw SABIO-RK exports and
scripts/propose_sabiork_source_records.py to create review-only proposal
artifacts from a frozen snapshot.
Foundation process configs can be built through ProcessLibrary.default_foundation().
The current library provides factories for first-order, mass-action,
homogeneous Michaelis-Menten, and generic surface-catalysis benchmark
processes. These are framework mechanisms, not organism- or substrate-specific
biology.
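
To make the benchmark character of these mechanisms concrete, the sketch below integrates a first-order A -> B conversion and can be checked against its analytic limiting case. It is an assumption-laden stand-in: a hand-written fixed-step RK4 integrator replaces whatever solver the package uses, all names are hypothetical, and the rate constant is a unitless placeholder.

```python
import math


def first_order_rhs(state, k):
    """A -> B with rate k * A; returns (dA/dt, dB/dt)."""
    a, _b = state
    rate = k * a
    return (-rate, rate)


def rk4_step(state, k, dt):
    """One classical fourth-order Runge-Kutta step for the two-species system."""
    def shifted(base, slope, h):
        return tuple(x + h * s for x, s in zip(base, slope))

    k1 = first_order_rhs(state, k)
    k2 = first_order_rhs(shifted(state, k1, dt / 2), k)
    k3 = first_order_rhs(shifted(state, k2, dt / 2), k)
    k4 = first_order_rhs(shifted(state, k3, dt), k)
    return tuple(
        x + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        for x, s1, s2, s3, s4 in zip(state, k1, k2, k3, k4)
    )


def integrate(state, k, dt, n_steps):
    """Advance the state by n_steps fixed steps of size dt."""
    for _ in range(n_steps):
        state = rk4_step(state, k, dt)
    return state


# Placeholder rate constant and time grid (hypothetical, unitless here).
a_end, b_end = integrate((1.0, 0.0), k=0.3, dt=0.01, n_steps=1000)
```

Two cheap checks follow from this setup: A should track exp(-k t), and A + B should stay at its initial total, which mirrors the limiting-case and mass-balance helpers described earlier.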
Assembled process models now support native well-mixed execution through
AssembledModel.run(). The method delegates to ProcessODESolver, returns a
standard SimulationResult, records process-rate trajectories, runs supplied
validators, and rejects unsupported geometry instead of silently switching
execution paths.
Substrate, geometry, product-map, and validator loading now goes through
registries. The default substrate registry is generic-first and supports
foundation benchmark substrates such as generic_solid and
generic_dissolved. PET substrate loading is available only through the
explicit PET plugin registry:

    from fungal_model import load_substrate
    from fungal_model.plugins.pet import pet_substrate_loader_registry

    substrate = load_substrate(
        "data/substrates/pet_film.yml",
        registry=pet_substrate_loader_registry(),
    )

Configs are intentionally provenance-heavy. Top-level records and parameter
entries must include source, measurement method, confidence, notes, validity
range, units, and value fields. Unknown scientific values should be written as
value: null; loaders preserve them as explicit unknown parameters.
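
The required-field rule can be sketched as a plain dictionary check. The key spellings below are hypothetical renderings of the fields named above, not the package's schema, and the record is a placeholder of the kind a YAML parser would produce for value: null.

```python
# Hypothetical key spellings for the fields named in the README.
REQUIRED_FIELDS = (
    "source", "measurement_method", "confidence", "notes",
    "validity_range", "units", "value",
)


def validate_record(record):
    """Return the missing field names; a present-but-None value is allowed."""
    return [name for name in REQUIRED_FIELDS if name not in record]


def is_unknown(record):
    """True when the record declares its value as explicitly unknown (null)."""
    return "value" in record and record["value"] is None


# Placeholder record, as it might look after parsing YAML with `value: null`.
record = {
    "source": "not yet measured",
    "measurement_method": "unknown",
    "confidence": "none",
    "notes": "placeholder entry for illustration",
    "validity_range": None,
    "units": "m^2/g",
    "value": None,
}
missing = validate_record(record)
```

The distinction that matters is between a key that is absent, which is a schema error, and a key that is present with a null value, which is an honest unknown.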
The generic configured-model API is the public workflow entry point:

    from fungal_model import load_model_config, run_configured_model

    config = load_model_config("path/to/model_config.yml")
    result = run_configured_model("path/to/model_config.yml")

run_configured_model now loads entities through registries, merges configured
and entity parameter sets, builds processes through ProcessLibrary, assembles
a ModelBuilder, executes through AssembledModel.run(), validates the
SimulationResult, and saves the standard output bundle when an output
directory is configured.
Configured output folders include the core SimulationResult files plus
configuration-facing artifacts:
- input_model_config.json
- configured_model_run.json
- configured_metadata.json
- process_build_decisions.json
- initial_state.json
- time_grid.json
- validators.json
- merged_parameters.json
- entity_snapshots/
- output_manifest.json
Plugin-backed configured runs use explicit registry injection. The bundled PET plugin config is an internal software-test fixture, not a scientific PET degradation record.
The older PET convenience entry point now lives under
fungal_model.plugins.pet and delegates to run_configured_model; the generic
fungal_model.workflows package stays substrate-neutral.
Current capability labels mean:
- implemented: code exists for the stated scope.
- technically verified: repository tests prove the software contract, not the empirical biology.
- exploratory: outputs are provenance-labelled assumptions, priors, ranges, or controlled pilots.
- scientifically validated: empirical validation data support the prediction claim. FungMod does not currently bundle publication-grade validation for arbitrary fungus/substrate/environment predictions.
- unsupported: the mechanism or workflow should fail explicitly or remain documented as future work.

- Well-mixed ODE systems and an initial 1D reaction-diffusion engine are supported.
- Michaelis-Menten kinetics currently means homogeneous dissolved-substrate kinetics only.
- PET surface hydrolysis currently uses a minimal equilibrium Langmuir coverage model with constant accessible surface area.
- PET product release is represented as a lumped mass-equivalent hydrolysate in the Stage 4 example, not resolved MHET/BHET/TPA/EG chemistry.
- Temperature scaling currently uses Arrhenius acceleration only; enzyme thermal deactivation is recorded as a limitation and is not implemented.
- pH activity currently uses an empirical Gaussian profile; mechanistic ionization chemistry is not implemented.
- Fungal growth currently uses a simple assimilable-product uptake law; oxygen, transporters, toxicity, regulation, and intracellular metabolism are not modelled.
- Enzyme production has an explicit active-biomass cost, but the cost parameter is lumped and must be sourced before scientific use.
- Stage 7 oxygen handling is currently a validation check against available oxygen, not a coupled oxygen state in the ODE model.
- Gibbs free energy values are metadata with provenance; full thermodynamic feasibility constraints are not yet enforced by the solver.
- Spatial modelling is currently 1D finite-volume method-of-lines only.
- Stage 8 diffusion fields are unit-aware, but geometry is a simple uniform 1D grid; 2D, variable geometry, and true volume/area coupling are not implemented.
- PET is marked partial. Cellulose has narrow registry-backed exploratory BIO-001/BIO-002 surface and enzyme-chain paths, but the generic CelluloseSubstrate class remains Stage 9 placeholder metadata and is not a validated default cellulose-degradation model. Lignin, starch, and chitin remain Stage 9 placeholder metadata classes with unknown physical parameters and no default degradation model.
- Universal substrate modules record bond classes, required enzyme classes, and product classes, but they do not implement substrate-specific kinetics, accessibility models, thermodynamic constraints, or assimilation evidence.
- Calibration utilities are generic least-squares tools; no literature data are bundled and no parameters are calibrated by default.
- Monte Carlo and local sensitivity utilities require explicit uncertainty/perturbation specifications; Bayesian calibration and global sensitivity are not implemented.
- AssembledModel.run() currently supports well-mixed process ODE execution; unsupported geometry fails before simulation.
- The generic configured workflow currently supports foundation process factories and well-mixed execution; unsupported process types and geometry fail before simulation.
- The standardized results.SimulationResult is native output for AssembledModel.run() and is also produced by explicit low-level reaction and reaction-diffusion APIs when those APIs are invoked directly.
- Generic surface catalysis now exists, and PET composes it through a PET accessibility adapter, but resolved PET product chemistry and dynamic morphology remain future work.
- Geometry abstractions currently wrap well-mixed and 1D film cases; particle, slab, and porous-medium geometries are honest metadata placeholders.
- Enzyme/fungus compatibility matching checks declared capabilities, but it does not yet auto-build full living-fungus ODE systems from entities.
- The PET plugin convenience helper is a deprecated compatibility slice; the generic configured workflow is the main foundation path.
- PET must not be treated with the homogeneous Michaelis-Menten layer except as an explicitly labelled artificial benchmark.
- The reaction engine assumes each reaction rate can be converted into every affected species unit per simulation time unit.
- Mass-balance validation requires the caller to provide conserved weights when species do not share directly compatible units.
- Solver tolerances are numerical settings, not physical parameters, and are recorded in the simulation record.
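
The conserved-weights requirement in the mass-balance item above can be illustrated with a small sketch. The function names, species names, and numbers are hypothetical, and the example assumes carbon atoms per molecule as the caller-supplied weights; it is not the package's validator.

```python
def weighted_total(state, weights):
    """Sum of weight * amount over every species that carries a weight."""
    return sum(weights[name] * amount for name, amount in state.items() if name in weights)


def mass_balance_ok(initial, final, weights, rel_tol=1e-9):
    """Compare the conserved weighted total before and after a run."""
    before = weighted_total(initial, weights)
    after = weighted_total(final, weights)
    scale = max(abs(before), abs(after), 1.0)
    return abs(after - before) <= rel_tol * scale


# Hypothetical example: carbon atoms per molecule as conserved weights, so a
# 12-carbon dimer splitting into two 6-carbon monomers conserves carbon.
weights = {"dimer": 12.0, "monomer": 6.0}
initial = {"dimer": 1.0, "monomer": 0.0}
final = {"dimer": 0.25, "monomer": 1.5}
ok = mass_balance_ok(initial, final, weights)
```

Without the weights, amounts of dimer and monomer cannot be added meaningfully; with them, the check reduces to comparing two scalars and returning the outcome as a result.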