+ );
+};
diff --git a/fern/versions/latest.yml b/fern/versions/latest.yml
index fcfbb980e..834495433 100644
--- a/fern/versions/latest.yml
+++ b/fern/versions/latest.yml
@@ -81,6 +81,26 @@ navigation:
contents:
- page: Recipe Cards
path: ./latest/pages/recipes/cards.mdx
+ - section: Image Generation
+ contents:
+ - page: Rich Document Image Generation
+ path: ./latest/pages/recipes/image_generation/rich_document_images.mdx
+ - page: Product Image Variations
+ path: ./latest/pages/recipes/image_generation/product_image_variations.mdx
+ - page: Funny Pet Image Edits
+ path: ./latest/pages/recipes/image_generation/funny_pet_image_edits.mdx
+ - page: Autonomous Vehicle Traffic Scenarios
+ path: ./latest/pages/recipes/image_generation/traffic_scenarios.mdx
+ - page: Synthetic Extremity X-rays
+ path: ./latest/pages/recipes/image_generation/medical_extremity_xrays.mdx
+ - page: Airport Baggage Screening Scans
+ path: ./latest/pages/recipes/image_generation/airport_security_scans.mdx
+ - page: Humanoid Robot Scene Understanding
+ path: ./latest/pages/recipes/image_generation/humanoid_robot_scene_understanding.mdx
+ - page: Crop Disease Detection Images
+ path: ./latest/pages/recipes/image_generation/agriculture_crop_imagery.mdx
+ - page: Drone Aerial Inspection
+ path: ./latest/pages/recipes/image_generation/drone_aerial_inspection.mdx
- section: Code Generation
contents:
- page: Text to Python
@@ -145,6 +165,8 @@ navigation:
contents:
- page: Overview
path: ./latest/pages/devnotes/index.mdx
+ - page: Image Generation for Multimodal Data Pipelines
+ path: ./latest/pages/devnotes/posts/image-generation-for-multimodal-data-pipelines.mdx
- page: Designing Nemotron-Personas
path: ./latest/pages/devnotes/posts/nemotron-personas.mdx
- page: Prompt Sensitivity
diff --git a/fern/versions/latest/pages/devnotes/index.mdx b/fern/versions/latest/pages/devnotes/index.mdx
index c56ab2c77..5db4f15d2 100644
--- a/fern/versions/latest/pages/devnotes/index.mdx
+++ b/fern/versions/latest/pages/devnotes/index.mdx
@@ -9,6 +9,14 @@ import { BlogCard, BlogGrid } from "@/components/BlogCard";
Welcome to NeMo Data Designer Dev Notes — in-depth guides, benchmark write-ups, and insights from the team building NeMo Data Designer.
+ }
+ />
+
+At [PyData London 2026](https://www.youtube.com/watch?v=QSpVpNcDqso&list=PLGVZCDnMOq0rFQykYJg7t441AEpN4SszE&index=5), we ran a hands-on session on synthetic data pipelines with [NeMo Data Designer](https://github.com/NVIDIA-NeMo/DataDesigner/) and [NeMo Anonymizer](https://github.com/NVIDIA-NeMo/Anonymizer/). We put together a set of notebooks that ramped up gradually: text QA first, then multimodal visual QA, then privacy-safe feedback data.
+
+For the VQA notebook, it would have been more straightforward to start from an existing PDF dataset on Hugging Face. The catch is that we would still need to inspect and filter that dataset until it had the document types we wanted for the workshop: layouts, charts, tables, annotations, scan artifacts, and enough visual evidence to support real questions. It was more interesting, and honestly easier, to build that seed dataset with Data Designer and declare the distribution directly.
+
+It worked out better than I expected. Each generated document stayed tied to the sampler controls that produced it, which made the dataset easy to inspect, tweak, and reuse. Folding images into a synthetic data generation (SDG) workflow unlocks a lot of interesting use cases, and Data Designer makes the transition seamless: moving from text to images feels familiar, so you can focus on designing the dataset instead of wrestling with the plumbing of the pipeline.
+
+In this dev note, we'll briefly touch on the document-generation use case from PyData, then explore other image SDG use cases while having some fun along the way. Hint: funny pet images are involved. 😂
+
+{/* more */}
+
+
+
+## Not Just Model Training
+
+Training data is the obvious headline when it comes to SDG, but that framing is too narrow. Generated images are useful wherever you need controlled visual cases: evaluation datasets, benchmark slices, reviewer training, and privacy-safe demos. For many teams, those datasets become core development infrastructure.
+
+The common need is control. What you really want is to define the parameters that guide your dataset's distribution. That gives you ground-truth metadata for every image, which you can lean on later in your workflows: you know what is in each image, what change was applied, and what behavior the row is supposed to exercise. Did the VLM answer from evidence in the document, or from a plausible guess? Did a reviewer catch the dense electronics cluster in a baggage scan? Did an apparel edit preserve the garment after changing the scene? Did a prompt regress on rainy-night ego-camera frames after a model upgrade?
+
+There is also a very practical reason: image pipelines are fun to build. The effect of a sampler change is visible. If you change weather, camera angle, crop severity, scan quality, garment styling, or robot viewpoint, you can usually see the distribution move in the preview rows right away. Text pipelines can be harder to reason about because the differences are often subtle or buried inside longer generations. With images, the iteration loop has a nice immediacy: change a control, preview a few rows, and your eyes tell you whether the pipeline is moving in the right direction.
+
+That immediacy is not just a nicer authoring experience. It turns generated images into practical development assets across a few recurring workflows:
+
+| Workflow | Concrete example |
+| --- | --- |
+| Evaluation dataset creation | Build a 500-row document VQA set where only scan quality, chart type, and table density vary, then measure whether a VLM still answers evidence-grounded questions. |
+| Benchmark datasets | Publish a fixed slice of rainy-night traffic scenes with known road type, weather, lighting, and hazard metadata so model releases can be compared against the same visual conditions. |
+| Human training and calibration | Give reviewers airport-screening-style or crop-disease examples with known high-level labels so teams can practice a rubric before touching governed real data. |
+| Regression testing | Keep a small suite of product image edits that should preserve the product shape and color while changing the background, then rerun it after model, prompt, or provider changes. |
+| Failure analysis | Generate variants around a known weak spot, such as low-contrast charts with glare, cluttered robot workspaces, partially occluded products, or creased phone-photo scans. |
+| Labeling taxonomy design | Create synthetic agriculture, humanoid-robot, or airport-baggage examples to decide where labels such as "minor occlusion," "severe stress," or "unsafe obstruction" should begin and end. |
+| Product demos and sales engineering | Show invoice extraction, product QA, or drone damage-assessment workflows with realistic synthetic examples instead of customer documents, patient imagery, facility photos, or operational scans. |
+| Privacy-safe sharing | Reproduce a multimodal bug with synthetic medical-style, logistics, or security-review images so another team can debug the pipeline without receiving sensitive source imagery. |
+| Synthetic monitoring | Run a nightly canary set of generated documents, product variants, traffic scenes, and robot-view images to catch drift in deployed prompts, judges, or model endpoints. |
+
+Model training can still be the destination. But generated images are just as valuable as controlled visual test cases: small enough to inspect, repeatable enough to compare over time, and structured enough to become evaluation rows, review exercises, benchmark slices, monitoring canaries, or debugging cases.
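
To make the "controlled visual test case" idea concrete, here is a minimal plain-Python sketch of slicing results by sampler metadata. It does not use the Data Designer API. The row fields (`weather`, `time_of_day`, `passed`) and the `pass_rate_by_slice` helper are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical evaluation rows: each keeps the sampler metadata that produced
# the image plus a pass/fail result from some downstream check.
rows = [
    {"image": "images/a.png", "weather": "rain", "time_of_day": "night", "passed": False},
    {"image": "images/b.png", "weather": "rain", "time_of_day": "night", "passed": True},
    {"image": "images/c.png", "weather": "clear", "time_of_day": "day", "passed": True},
    {"image": "images/d.png", "weather": "clear", "time_of_day": "night", "passed": True},
]


def pass_rate_by_slice(rows, keys):
    """Group rows by the given metadata keys and return the pass rate per group."""
    totals = defaultdict(lambda: [0, 0])  # slice -> [passed, seen]
    for row in rows:
        slice_key = tuple(row[key] for key in keys)
        totals[slice_key][0] += int(row["passed"])
        totals[slice_key][1] += 1
    return {key: passed / seen for key, (passed, seen) in totals.items()}


rates = pass_rate_by_slice(rows, ["weather", "time_of_day"])
for slice_key, rate in sorted(rates.items()):
    print(slice_key, f"{rate:.2f}")
```

Because every image carries its controls, a regression on one slice (here, rainy-night rows) shows up as a number you can track instead of a vague impression.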
+
+## The Image Building Blocks
+
+Here is the part I like most: moving from text to images does not feel like switching frameworks.
+
+You still declare columns, dependencies, model aliases, samplers, structured outputs, validators, and processors. The engine still resolves the DAG. `preview()` still gives you a small thing to look at before you scale. `create()` still writes the artifact-backed dataset. The difference is that some columns now produce images, and other columns can edit, caption, inspect, extract, score, or transform those images.
+
+In `create()` mode, the image bytes are saved with the dataset, typically under `images//` with UUID-style filenames. The dataframe stores relative paths instead of giant image blobs, which keeps large image payloads out of memory-heavy row data as pipelines scale. That small detail matters: a generated image can be opened from disk, rendered in a preview table, or passed into the next image-aware column through `ImageContext`, while the same row keeps the sampler values that explain why the image exists.
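
The storage pattern described above can be sketched in a few lines of standard-library Python. This is not Data Designer's implementation. The `save_image` helper, the flat `images/` layout, and the row fields are assumptions made for the sketch.

```python
import tempfile
import uuid
from pathlib import Path


def save_image(dataset_dir: Path, image_bytes: bytes, suffix: str = ".png") -> str:
    """Write image bytes under dataset_dir/images and return the relative path."""
    images_dir = dataset_dir / "images"  # hypothetical layout for this sketch
    images_dir.mkdir(parents=True, exist_ok=True)
    relative_path = Path("images") / f"{uuid.uuid4().hex}{suffix}"
    (dataset_dir / relative_path).write_bytes(image_bytes)
    return relative_path.as_posix()


with tempfile.TemporaryDirectory() as tmp:
    dataset_dir = Path(tmp)
    fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16  # placeholder bytes, not a real image
    # The row keeps the sampler values next to a small path string, not the bytes.
    row = {
        "document_type": "invoice",
        "document_condition": "creased phone photo",
        "document_image": save_image(dataset_dir, fake_png),
    }
    stored_bytes = (dataset_dir / row["document_image"]).read_bytes()
    print(row["document_image"], len(stored_bytes))
```

The row stays light enough to filter and sort in memory, while the image itself is one file read away.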
+
+That gives you three building blocks that feel familiar once you have written any Data Designer pipeline. Text-to-image columns generate visual examples from controlled metadata and prompt templates. Image-to-image columns use an existing or generated image as context, then produce controlled variants. Image-to-text columns send images into VLM-backed columns for captions, QA pairs, labels, extraction, or judge decisions.
+
+That last part is the real developer-experience win. The image is not an orphaned file on disk with a mysterious filename. It remains part of the row. It can move into a VLM judge, become a benchmark example, feed an editing step, or sit in a preview gallery with the exact metadata that produced it.
+
+## A Quick Document QA Preview
+
+For the PyData workshop, the document seed dataset came from the optional [`bonus_generate_documents.ipynb`](https://github.com/nabinchha/pydata-london-2026-data-designer-anonymizer/blob/main/notebooks/bonus_generate_documents.ipynb) notebook. It is a small notebook with a useful job: make business-document pages where I can choose the document type, layout, visual content, annotation style, and scan condition instead of hoping those cases exist in a downloaded dataset.
+
+The full notebook includes the model setup, sampler columns, export step, and preview helpers. The core image column is just a normal Data Designer column:
+
+```python
+config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="document_image",
+ prompt="""
+Create a realistic single-page business document image with rich visual information.
+
+Document requirements:
+- Document type: {{ document_type }}
+- Layout style: {{ layout_style }}
+- Physical/rendering condition: {{ document_condition }}
+- Annotation layer: {{ annotation_layer }}
+
+Required visual content:
+- Primary visual: {{ primary_visual }}
+- Secondary visual: {{ secondary_visual }}
+- At least one readable table with row and column labels
+- At least one chart, timeline, heatmap, diagram, or KPI-card cluster
+- Enough visual evidence to ask questions about values, trends, labels, dates, and relationships
+""",
+ model_alias="document-generation-model",
+ )
+)
+```
+
+Those generated pages became the visual seed data for the document VQA notebook. We do not need to unpack the full VQA pipeline here. The useful bit is simpler: during preview, each image sits beside the metadata that shaped it. You can immediately see whether the run is giving you the document types and visual conditions you intended.
+
+Here are a few rows from that generated dataset. The image is the fun part. The sampler values are what make the image useful.
+
+
+
+
+
+
+
+
+
+
+
+## Image Editing Is Just Another Dependency
+
+Image-to-image editing uses the same idea. Generate one image or use an existing one from a seed dataset, pass it forward as context, and ask the next column to make a controlled change. The dependency is declared with `ImageContext`:
+
+```python
+config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="animal_portrait",
+ prompt="A close-up portrait photograph of a {{ animal }} looking at the camera, studio lighting, high quality.",
+ model_alias="image-model",
+ )
+)
+
+config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="edited_portrait",
+ prompt=(
+ "Edit this {{ animal }} portrait photo. "
+ "Add {{ accessory }} on the animal. "
+ "Place the {{ animal }} in {{ setting }}. "
+ "Render the result in {{ art_style }}. "
+ "Keep the animal's face, expression, and features faithful to the original photo."
+ ),
+ model_alias="image-model",
+ multi_modal_context=[dd.ImageContext(column_name="animal_portrait")],
+ )
+)
+```
+
+There is a little bit of plumbing under the hood. In preview mode, the first image is base64 and can be passed directly to the second column. In create mode, the first image is stored as a relative path under the output artifact directory; Data Designer resolves it back to base64 before sending it to a remote model endpoint.
+
+As a user, you do not have to write that glue code. You declare the dependency. The engine handles the mechanics.
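
If you are curious what that glue might look like, here is a rough standard-library sketch. It is an illustration of the idea, not the engine's actual code. The `to_base64` helper and the suffix-based check are assumptions. The check relies on the fact that the base64 alphabet never contains a dot, so a value ending in an image extension is treated as a relative path.

```python
import base64
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def to_base64(value: str, artifact_dir: Path) -> str:
    """Return base64 image data for either a relative path or a base64 string."""
    if Path(value).suffix.lower() in IMAGE_SUFFIXES:
        # Create-style value: a path relative to the artifact directory.
        image_bytes = (artifact_dir / value).read_bytes()
        return base64.b64encode(image_bytes).decode("ascii")
    # Preview-style value: assume it is already base64 and pass it through.
    return value


with tempfile.TemporaryDirectory() as tmp:
    artifact_dir = Path(tmp)
    raw = b"not-a-real-image"
    (artifact_dir / "images").mkdir()
    (artifact_dir / "images" / "portrait.png").write_bytes(raw)

    from_path = to_base64("images/portrait.png", artifact_dir)
    from_preview = to_base64(base64.b64encode(raw).decode("ascii"), artifact_dir)

print(from_path == from_preview)
```

Either way, the downstream column receives the same payload, which is why the column definition does not change between preview and create.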
+
+That is enough to build some very practical loops:
+
+- Generate a clean document, then edit it into a low-quality scan.
+- Generate a product image, then produce regional packaging variants.
+- Generate a base traffic scene, then add weather, lighting, or hazard conditions.
+- Generate an image, caption it with a VLM, then judge whether the caption matches.
+
+The important constraint is model capability. Diffusion-style image routes are excellent for text-to-image generation, but they generally do not consume image context. Autoregressive multimodal image models can support image-to-image workflows through the chat-completions route. Data Designer exposes both through the same column API while leaving model choice explicit.
+
+
+The examples in this dev note use `google/gemini-3.1-flash-image-preview`. Data Designer makes the workflow consistent across models, but the selected model still sets the ceiling for visual quality, text rendering, instruction following, editing fidelity, domain realism, and how closely the output follows your sampler controls.
+
+
+## A Tour Across Domains
+
+Instead of walking through one giant example line by line, let's look at a few use cases. Image generation gets more interesting when you see it jump domains: documents, apparel, ego-camera traffic scenes, robot viewpoints, baggage scans, X-rays, crops, and drone frames. The code shape stays familiar, but the distributions change completely. Each use case also has an accompanying recipe linked at the bottom if you want to explore it on your own.
+
+For each one, the image is paired with the sampler controls that produced it. That pairing matters. A pretty image is nice. A pretty image with the exact knobs that created it is something you can debug, filter, rerun, judge, or hand to a reviewer.
+
+
+Some examples in this section touch specialized or safety-sensitive domains: medical imaging, airport screening, autonomous vehicles, robotics, agriculture, and infrastructure inspection. Treat them as synthetic examples for prototyping, evaluation, curriculum design, and review workflows. Domain experts should define the distributions, review the outputs, and decide what is appropriate before anything operational.
+
+
+### Product image variations
+
+Product imagery is a nice place to start because the pain is easy to understand. A brand may have one clean catalog image of an adult model wearing a garment, but the downstream needs multiply quickly: white-background thumbnails, lifestyle shots, fit-guide photos, adaptive-fashion examples, seasonal campaign images, and inclusive catalog coverage across adult age groups, ethnicities, body types, poses, and accessibility contexts.
+
+In Data Designer, the reference product image can come from a seed dataset or from an upstream `ImageColumnConfig`. The edit column receives that image through `ImageContext` and changes the scene while trying to preserve the product's identity:
+
+```python
+config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="product_variant_image",
+ prompt=PRODUCT_VARIATION_PROMPT,
+ model_alias="product-image-model",
+ multi_modal_context=[dd.ImageContext(column_name="base_product_image")],
+ )
+)
+```
+
+The important thing is that the creative brief is not trapped inside one long prompt. Apparel item, colorway, variation goal, adult model age group, ethnicity, body type, accessibility context, styling context, composition, and lighting are all columns. That means you can inspect the distribution, filter it, rerun slices, and add a VLM judge to check whether the garment stayed consistent or whether the edit introduced unwanted logos, text, minors, or unsafe styling.
+
+
+
+
+
+
+
+
+
+
+
+### Autonomous vehicles
+
+Autonomous-vehicle teams care about the long tail: the thing that happens rarely, in bad weather, at an awkward intersection, with just enough weirdness to matter. Those cases are hard to collect deliberately, and many are not safe to stage.
+
+The recipe generates ego-camera views from the self-driving vehicle itself. Each row also carries the scenario metadata that explains what the frame is trying to test.
+
+
+Synthetic traffic scenes are useful for scenario exploration, prompt development, visual QA, and review workflows. They do not replace simulator testing, closed-course validation, real sensor logs, or safety analysis by autonomous-vehicle experts.
+
+
+The controls cover geographic region, road type, weather, time of day, traffic density, vehicle mix, road surface, traffic control, and the scenario element.
+
+The `scenario_element` sampler is where the recipe earns its keep: a child chasing a ball, a school bus with a flashing stop sign, a jaywalking pedestrian at night, a cyclist riding against traffic, a fire truck in the oncoming lane, a malfunctioning traffic light, debris in the roadway. These images give you a fast way to build controlled perception and review sets around situations you want to discuss before you go hunting for them in real data.
+
+
+
+
+
+
+
+
+
+
+
+### Humanoid robot scene understanding
+
+Humanoid robot examples get interesting when the camera has a body. The question is not just "what objects are in the room?" It is what the robot can see from its pose, what it can reach, whether the path is safe, what changed on the table, and whether a nearby adult matters to the task.
+
+Synthetic scenes let you vary the room, camera pose, object set, hazard, clutter, lighting, and adult human presence without collecting imagery from real homes, labs, hospitals, or workplaces.
+
+For robotics, the fun part is also the hard part: the frame only matters if it resembles what the robot could actually see. Human factors, safety, environment design, and robotics experts should review any dataset meant to influence embodied behavior.
+
+
+
+
+
+
+
+
+
+
+
+### Airport security screening
+
+Airport baggage screening is a domain where realistic-looking examples are valuable but real data is hard to move around. Scans are sensitive, operationally constrained, and not something you casually paste into a notebook.
+
+A synthetic pipeline can vary scanner style, bag type, packing density, benign clutter, material mix, image quality, and high-level `threat_type` labels.
+
+The point is not to describe bypass tactics. The useful workflow is defensive: scanner-like images for model evaluation, human-review training, curriculum data, and visual QA around whether a bag should receive secondary review. The same generate-inspect-judge loop applies, with extra care around what the prompts should and should not encode.
+
+
+
+
+
+
+
+
+
+
+
+### Medical imaging
+
+This is where we slow down. The X-ray example uses the same Data Designer shape, but with extra caution. It samples a synthetic patient persona, anatomical region, view, equipment type, imaging context, exposure quality, positioning, primary finding, secondary findings, and image quality. The outputs are for AI research, education, data-pipeline prototyping, and evaluation workflows. They are not diagnostic images.
+
+
+These are synthetic radiograph-style images. They are not real medical images, are not diagnostic, and must not be used for clinical decision-making. Medical domain experts should review any medical-imaging distribution, prompt, label schema, and generated output before it is used in a serious workflow.
+
+
+The appeal here is distribution design. Real medical data is often scarce, private, and skewed toward common findings. A generated dataset can deliberately oversample rare fracture patterns, technical artifacts, positioning issues, or demographic slices, then attach ground-truth metadata because the pipeline produced the image in the first place.
+
+The next obvious columns are not image columns at all: radiology reports, severity scores, anatomical-consistency validators, and judge columns. That is the point. Once the image is a row, the rest of the Data Designer stack can work around it.
+
+
+
+
+
+
+
+
+
+
+
+### Agriculture
+
+Agriculture became more compelling once we framed it as crop disease detection instead of generic crop imagery. Now the controls have a job: vary crop type, growth stage, viewpoint, disease or confounding condition, severity, field condition, and weather. The output can help teams evaluate prompts, calibrate reviewers, or design a disease-labeling taxonomy before moving to governed field imagery.
+
+Crop pathology is its own discipline, so generated crop-disease images should be reviewed by agronomists or crop-health experts before they shape field decisions, label taxonomies, or benchmark claims.
+
+
+
+
+
+
+
+
+
+
+
+### Drone aerial inspection
+
+Drone aerial inspection gives you controlled, asset-level views that are wider than ground photos but still specific enough to ask about roof condition, bridge decks, culverts, pipelines, construction progress, storm damage, debris, and access paths.
+
+For the examples here, the prompt asks for clean raw drone frames: no HUDs, no boxes, no callouts, no labels, and no inspection-report overlays. That makes the images better suited for VLM evaluation because the model has to rely on the scene instead of helpful text stamped on top of it.
+
+Inspection imagery can drift into real-world safety, insurance, and compliance decisions quickly. Synthetic drone examples are best treated as review and evaluation fixtures unless qualified inspectors or domain teams approve the task design.
+
+
+
+
+
+
+
+
+
+
+
+### Funny pet image edits 🐾
+
+And then there is the important scientific frontier of making pets slightly more ridiculous.
+
+This one is intentionally playful, but the pattern is the same as the product-editing examples: generate a controlled reference image, pass it forward through `ImageContext`, and sample edit conditions that change the scene while preserving the subject.
+
+The controls are not "make funny image" as one giant prompt. They are ordinary columns: pet type, breed, age, activity, setting, expression, base photo style, comedy goal, prop, scene escalation, and humor style. The breed and age columns are conditioned on `pet_type` with `SubcategorySamplerParams`, so dog rows sample dog breeds and dog age bands, while cat rows sample cat breeds and cat age bands.
+
+It is useful for creative-review workflows, image-editing demos, identity-preservation evaluation, safety checks around benign edits, and human preference exercises. It also makes the dev note less solemn, which feels only fair after X-rays, baggage scans, crop disease, and drone inspection.
+
+```python
+config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="funny_pet_image",
+ prompt=FUNNY_PET_EDIT_PROMPT,
+ model_alias="funny-pet-image-model",
+ multi_modal_context=[dd.ImageContext(column_name="base_pet_image")],
+ )
+)
+```
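
The conditional breed and age sampling mentioned above is easy to picture in plain Python. The sketch below is not `SubcategorySamplerParams` itself. It only mimics the behavior with the standard library, and the category values and the `sample_pet_row` helper are made up for illustration.

```python
import random

# Hypothetical category values for illustration only.
BREEDS_BY_PET_TYPE = {
    "dog": ["corgi", "golden retriever", "dachshund"],
    "cat": ["maine coon", "siamese", "british shorthair"],
}
AGE_BANDS_BY_PET_TYPE = {
    "dog": ["puppy", "adult dog", "senior dog"],
    "cat": ["kitten", "adult cat", "senior cat"],
}


def sample_pet_row(rng: random.Random) -> dict[str, str]:
    """Sample pet_type first, then draw breed and age from that type's lists."""
    pet_type = rng.choice(sorted(BREEDS_BY_PET_TYPE))
    return {
        "pet_type": pet_type,
        "breed": rng.choice(BREEDS_BY_PET_TYPE[pet_type]),
        "age": rng.choice(AGE_BANDS_BY_PET_TYPE[pet_type]),
    }


rng = random.Random(0)  # fixed seed so the preview is repeatable
pet_rows = [sample_pet_row(rng) for _ in range(5)]
for pet_row in pet_rows:
    print(pet_row)
```

The point of conditioning is that impossible combinations, such as a cat row with a dog breed, never reach the prompt in the first place.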
+
+
+
+
+
+
+
+
+
+
+
+## Other Use Cases Worth Building
+
+There are plenty of other production-relevant lanes to build out: manufacturing inspection, insurance claims, enterprise UI screenshots, packaging and compliance labels, industrial safety, education and science diagrams, e-commerce imagery, and accessibility datasets. The world really is your oyster!
+
+Those are better as focused follow-up recipes. Each one needs a real distribution to control, a failure mode to isolate, and a reviewer or model behavior to measure.
+
+Some domains need subject-matter review, policy constraints, or explicit disclaimers. That does not make them off-limits for synthetic data work. It means the bar for distribution design and review is higher.
+
+## The Practical Loop
+
+The workflow I keep coming back to is small and repeatable: declare the distribution you want, generate a tiny preview, inspect images beside their metadata, add VLM checks or human review notes, revise the prompts, then scale with `create()`.
+
+`preview()` is where you find out whether your idea survives contact with the model. `create()` writes the artifact-backed dataset: image files live with the dataset, dataframe rows keep references and metadata, downstream columns can use generated images as context, and judges or validators can score the outputs. The result is not just "more images." It is a dataset you can reason about.
+
+## Try It
+
+Start with the official image tutorials, then work through the recipes:
+
+- [Generating Images](/tutorials/generating-images)
+- [Providing Images as Context](/tutorials/providing-images-as-context)
+- [Image-to-Image Editing](/tutorials/image-to-image-editing)
+- [Rich Document Image Generation](/recipes/image-generation/rich-document-image-generation)
+- [Product Image Variations](/recipes/image-generation/product-image-variations)
+- [Autonomous Vehicle Traffic Scenarios](/recipes/image-generation/autonomous-vehicle-traffic-scenarios)
+- [Synthetic Extremity X-rays](/recipes/image-generation/synthetic-extremity-x-rays)
+- [Airport Baggage Screening Scans](/recipes/image-generation/airport-baggage-screening-scans)
+- [Humanoid Robot Scene Understanding](/recipes/image-generation/humanoid-robot-scene-understanding)
+- [Crop Disease Detection Images](/recipes/image-generation/crop-disease-detection-images)
+- [Drone Aerial Inspection](/recipes/image-generation/drone-aerial-inspection)
+- [Funny Pet Image Edits](/recipes/image-generation/funny-pet-image-edits)
+
+For the full PyData workshop arc, see the [workshop repository](https://github.com/nabinchha/pydata-london-2026-data-designer-anonymizer) and the [session recording](https://www.youtube.com/watch?v=QSpVpNcDqso&list=PLGVZCDnMOq0rFQykYJg7t441AEpN4SszE&index=5).
diff --git a/fern/versions/latest/pages/recipes/cards.mdx b/fern/versions/latest/pages/recipes/cards.mdx
index 10308f7f4..9f33e1b49 100644
--- a/fern/versions/latest/pages/recipes/cards.mdx
+++ b/fern/versions/latest/pages/recipes/cards.mdx
@@ -14,6 +14,56 @@ Recipes are a collection of code examples that demonstrate how to leverage Data
These recipes use the OpenAI model provider by default. Ensure your OpenAI provider is set up via the Data Designer CLI before running a recipe.
+## Image Generation
+
+
+
+ Generate synthetic business-document page images with controlled metadata for VQA, OCR, multimodal judging, and document-understanding workflows.
+
+ *Image generation · visual seed data · VQA-ready parquet export*
+
+
+ Use image-to-image generation to create inclusive adult apparel catalog variants across age groups, ethnicities, body types, poses, and styling contexts.
+
+ *Image-to-image · apparel catalog · inclusive representation*
+
+
+ Generate synthetic dog and cat photos, then use image-to-image generation to make the same pet scene funnier while preserving identity.
+
+ *Image-to-image · creative review · identity preservation*
+
+
+ Generate self-driving car ego-camera scenes with controlled road, weather, lighting, traffic, and long-tail hazard variation.
+
+ *AV ego camera · edge cases · visual review sets*
+
+
+ Generate research-only extremity X-ray style images with controlled anatomy, acquisition, finding, and quality metadata.
+
+ *Research only · visual QA · report generation*
+
+
+ Generate defensive baggage-screening style images with controlled clutter, material mix, scanner style, and review labels.
+
+ *Defensive evaluation · human review · scanner-like images*
+
+
+ Generate egocentric humanoid robot scenes with controlled environment, viewpoint, task, object, safety, lighting, and human-presence metadata.
+
+ *Embodied AI · scene understanding · safety review*
+
+
+ Generate crop disease detection images with controlled crop, growth stage, viewpoint, condition, severity, and field context.
+
+ *Crop disease detection · healthy negatives · reviewer calibration*
+
+
+ Generate low-altitude drone inspection images for infrastructure, property, construction, disaster-response, and industrial review workflows.
+
+ *Drone inspection · infrastructure QA · reviewer calibration*
+
+
+
## Code Generation
diff --git a/fern/versions/latest/pages/recipes/image_generation/agriculture_crop_imagery.mdx b/fern/versions/latest/pages/recipes/image_generation/agriculture_crop_imagery.mdx
new file mode 100644
index 000000000..daead9d20
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/agriculture_crop_imagery.mdx
@@ -0,0 +1,298 @@
+---
+title: "Crop Disease Detection Images"
+description: "Generate synthetic crop disease detection images with controlled crop, growth stage, viewpoint, disease condition, severity, and field-condition metadata."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/agriculture_crop_imagery.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Agriculture Crop Disease Detection Image Recipe
+
+Generate synthetic crop disease detection images with controlled variation over
+crop type, growth stage, viewpoint, disease or confounding condition, severity,
+weather, irrigation, and field condition. The objective is to create examples
+where the expected crop-health label is known, including healthy negatives and
+hard confounders, so teams can evaluate detection prompts, build labeling
+rubrics, calibrate reviewers, and prototype crop-disease workflows before using
+governed field imagery.
+
+Prerequisites:
+ - An image-generation provider key for the selected model. The defaults use
+ OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run agriculture_crop_imagery.py --num-records 10
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "agriculture-image-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="crop-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "4:3",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "crop_type",
+ [
+ "corn",
+ "soybean",
+ "wheat",
+ "rice",
+ "tomato",
+ "grape vineyard",
+ "apple orchard",
+ "lettuce",
+ "potato",
+ "strawberry",
+ ],
+ )
+ add_category(
+ config_builder,
+ "growth_stage",
+ [
+ "seedling",
+ "vegetative growth",
+ "flowering",
+ "fruiting",
+ "grain fill",
+ "near harvest",
+ ],
+ )
+ add_category(
+ config_builder,
+ "viewpoint",
+ [
+ "close-up leaf-level scouting photo",
+ "row-level field photo",
+ "drone oblique field view",
+ "top-down drone crop-row view",
+ "greenhouse bench view",
+ "orchard row view",
+ ],
+ )
+ add_category(
+ config_builder,
+ "disease_or_condition",
+ [
+ "healthy foliage with no visible disease symptoms",
+ "powdery mildew on leaves",
+ "rust-colored fungal pustules on leaf surfaces",
+ "early blight with concentric brown leaf spots",
+ "late blight with irregular dark lesions",
+ "bacterial leaf spot with small dark speckles",
+ "downy mildew patches on leaf undersides",
+ "leaf curl with mosaic discoloration",
+ "insect feeding damage as a disease confounder",
+ "nutrient deficiency yellowing as a disease confounder",
+ ],
+ )
+ add_category(
+ config_builder,
+ "severity",
+ [
+ "low severity affecting isolated plants",
+ "moderate severity affecting patches",
+ "high severity affecting large field sections",
+ ],
+ )
+ add_category(
+ config_builder,
+ "field_condition",
+ [
+ "uniform crop stand",
+ "patchy emergence",
+ "uneven row spacing",
+ "visible irrigation lines",
+ "muddy soil after rain",
+ "dry cracked soil",
+ "mulched bed system",
+ ],
+ )
+ add_category(
+ config_builder,
+ "weather_lighting",
+ [
+ "bright midday sun",
+ "soft overcast light",
+ "golden hour light",
+ "after-rain humid conditions",
+ "hazy smoky sky",
+ "greenhouse diffuse lighting",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="crop_image",
+ prompt=AGRICULTURE_IMAGE_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+AGRICULTURE_IMAGE_PROMPT = """\
+Create a realistic crop disease detection image.
+
+Scene requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Crop type: {{ crop_type }}
+- Growth stage: {{ growth_stage }}
+- Viewpoint: {{ viewpoint }}
+- Disease or condition: {{ disease_or_condition }}
+- Severity: {{ severity }}
+- Field condition: {{ field_condition }}
+- Weather and lighting: {{ weather_lighting }}
+
+Make the image useful for crop disease detection, visual QA, reviewer
+calibration, and data-labeling experiments. The requested crop, condition,
+severity, and field context should be visually inspectable. Show realistic
+plant structure, leaves, rows, soil, and disease symptoms when requested. For
+healthy examples, show clear healthy leaves or canopy with no visible disease.
+For confounders, make the non-disease condition plausible enough to test a
+classifier or VLM prompt. Do not include real farm names, readable license
+plates, watermarks, or people as the primary subject. Generate exactly one
+final crop image for this row. Do not return alternate versions, a grid, a pair
+of examples, before/after panels, or multiple frames. Use the visual variation
+ID only as an internal diversity key; never render it as text.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate synthetic crop disease detection imagery.")
+ parser.add_argument("--num-records", type=int, default=10, help="Number of crop images to generate.")
+ parser.add_argument("--dataset-name", default="crop-disease-detection-images", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="4:3", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} crop disease detection image rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
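The recipe docstring says the expected crop-health label is known for every row. A minimal sketch of how that label could be derived from the sampled `disease_or_condition` text is shown below. The function name `expected_label` and the three label strings are hypothetical and not part of the recipe. The `healthy` branch assumes the sampler list contains a value whose text starts with "healthy".

```python
def expected_label(condition: str) -> str:
    """Map a sampled `disease_or_condition` string to a coarse label.

    Hypothetical helper: the label names are illustrative only.
    """
    text = condition.strip().lower()
    # Confounder values in the recipe end with this exact phrase.
    if text.endswith("as a disease confounder"):
        return "confounder"
    # Assumption: healthy negatives are phrased starting with "healthy".
    if text.startswith("healthy"):
        return "healthy"
    # Everything else in the list describes a disease symptom.
    return "disease"


if __name__ == "__main__":
    print(expected_label("powdery mildew on leaves"))  # disease
    print(expected_label("insect feeding damage as a disease confounder"))  # confounder
```

Keeping the mapping next to the sampler values makes it easier to notice when a new value does not fit any of the coarse labels.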
diff --git a/fern/versions/latest/pages/recipes/image_generation/airport_security_scans.mdx b/fern/versions/latest/pages/recipes/image_generation/airport_security_scans.mdx
new file mode 100644
index 000000000..96b2b4864
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/airport_security_scans.mdx
@@ -0,0 +1,309 @@
+---
+title: "Airport Baggage Screening Scans"
+description: "Generate defensive synthetic baggage-screening style images with controlled clutter, material mix, scanner style, and threat-type labels."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/airport_security_scans.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Airport Baggage Screening Image Generation Recipe
+
+Generate synthetic airport baggage-screening style images with controlled
+variation over scanner style, bag type, bag density, benign contents, material
+mix, image quality, and high-level defensive threat-type labels.
+
+Security note:
+ This recipe is intended for defensive model development, evaluation,
+ curriculum data, and human-review tooling. Do not use it to plan, optimize,
+ or describe ways to bypass real screening systems. The prompts avoid
+ operational bypass details and use high-level threat types rather than
+ concealment instructions.
+
+Prerequisites:
+ - An image-generation provider key for the selected model. The defaults use
+ OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run airport_security_scans.py --num-records 10
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "baggage-screening-image-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ """Build a provider-agnostic image-generation model config."""
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+ """Add a categorical sampler column."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="scan-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "4:3",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ """Build an airport baggage-screening image-generation pipeline."""
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "scanner_style",
+ [
+ "dual-energy X-ray baggage scan with pseudo-color material mapping",
+ "computed tomography baggage scan slice rendered as pseudo-color X-ray",
+ "top-down carry-on baggage screening view",
+ "side-view checked-bag screening image",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "bag_type",
+ [
+ "small carry-on roller bag",
+ "soft backpack",
+ "messenger bag",
+ "hard-shell suitcase",
+ "duffel bag",
+ "camera equipment case",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "bag_density",
+ [
+ "sparse packing with many empty regions",
+ "moderate packing density",
+ "dense packing with overlapping objects",
+ "very dense packing with cluttered object boundaries",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "benign_contents",
+ [
+ "clothing, shoes, toiletries, and paperback books",
+ "laptop, chargers, headphones, notebooks, and snacks",
+ "camera body, lenses, batteries, cables, and clothing",
+ "children's toys, folded clothing, tablet, and water bottle",
+ "sports gear, towel, shoes, and plastic accessories",
+ "business travel items, documents, laptop, and power adapters",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "material_mix",
+ [
+ "mostly fabric and plastic with a few small metal objects",
+ "electronics-heavy bag with cables and batteries",
+ "mixed organic, plastic, and metal materials",
+ "mostly low-density organic material with scattered dense regions",
+ "many small overlapping metal and plastic objects",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "threat_type",
+ [
+ "none - clear benign bag with no threat-like visual pattern",
+ "dense electronics cluster requiring secondary review",
+ "oversized liquid-container-like region requiring secondary review",
+ "sharp-object-like silhouette requiring secondary review",
+ "unknown dense object requiring secondary review",
+ "clutter and overlapping objects preventing confident clearance",
+ "organic anomaly requiring secondary review",
+ "ambiguous tool-like silhouette requiring secondary review",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "image_quality",
+ [
+ "clean scanner output with crisp object boundaries",
+ "slightly noisy scanner output",
+ "low-contrast scan with compressed dynamic range",
+ "scan with mild motion blur",
+ "scan with color palette shifted toward orange and blue material classes",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="baggage_scan",
+ prompt=AIRPORT_SECURITY_SCAN_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+AIRPORT_SECURITY_SCAN_PROMPT = """\
+Create a synthetic airport baggage-screening training image that shows only the scan content.
+
+Image requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Scanner style: {{ scanner_style }}
+- Bag type: {{ bag_type }}
+- Bag density: {{ bag_density }}
+- Benign contents: {{ benign_contents }}
+- Material mix: {{ material_mix }}
+- Threat type metadata target, not text to render: {{ threat_type }}
+- Image quality: {{ image_quality }}
+
+Render the image as a realistic pseudo-color baggage scan, not a normal photo.
+Show overlapping objects, material-color variation, partial occlusion, and
+scanner-like attenuation. The image should be useful for defensive model
+development and human-review training.
+
+Generate exactly one final scan image for this row. Do not return alternate
+versions, a grid, a pair of examples, a before/after image, multiple scans, or
+multiple panels. Use the visual variation ID only as an internal diversity key
+for object placement, scanner angle, and material pattern; never render it as
+text.
+
+The output must be the scan image only. Do not add labels, legends, captions,
+classification text, bounding boxes, arrows, callouts, segmentation overlays,
+heatmaps, UI panels, scanner controls, watermarks, timestamps, filenames, row
+IDs, colored outlines, or any additional layer of text. Do not include
+operational airport details, real airport names, passenger names, barcodes,
+boarding passes, bypass instructions, or anything that describes how to hide or
+evade detection. Use the threat type only to shape the broad visual contents of
+the bag scan.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate synthetic airport baggage-screening images.")
+ parser.add_argument("--num-records", type=int, default=10, help="Number of baggage scan images to generate.")
+ parser.add_argument("--dataset-name", default="synthetic-baggage-screening-scans", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="4:3", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} synthetic baggage-screening rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
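Each prompt in these recipes refers to sampler columns through `{{ name }}` placeholders, so a misspelled placeholder would leave the prompt out of step with the columns. The sketch below is a stand-alone check that uses only the standard library. It does not use Data Designer's own template engine, and the recipe's `validate` call may already catch the same problem. The helper names `placeholders`, `unknown_placeholders`, and `unused_columns` are hypothetical.

```python
import re

# Matches simple `{{ name }}` placeholders; filters and expressions are not handled.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def placeholders(prompt: str) -> set[str]:
    """Return the set of placeholder names that appear in a prompt."""
    return set(PLACEHOLDER.findall(prompt))


def unknown_placeholders(prompt: str, column_names: set[str]) -> set[str]:
    """Placeholders that do not correspond to any defined column."""
    return placeholders(prompt) - column_names


def unused_columns(prompt: str, column_names: set[str]) -> set[str]:
    """Columns that are defined but never referenced by the prompt."""
    return column_names - placeholders(prompt)


if __name__ == "__main__":
    # Illustrative prompt fragment with a deliberate typo in `bag_density`.
    prompt = "- Bag type: {{ bag_type }}\n- Bag density: {{ bag_densty }}\n"
    columns = {"bag_type", "bag_density", "image_quality"}
    print(sorted(unknown_placeholders(prompt, columns)))
    print(sorted(unused_columns(prompt, columns)))
```

Running the check before generation avoids spending image requests on a prompt that cannot be filled in as intended.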
diff --git a/fern/versions/latest/pages/recipes/image_generation/drone_aerial_inspection.mdx b/fern/versions/latest/pages/recipes/image_generation/drone_aerial_inspection.mdx
new file mode 100644
index 000000000..147013eae
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/drone_aerial_inspection.mdx
@@ -0,0 +1,318 @@
+---
+title: "Drone Aerial Inspection"
+description: "Generate synthetic low-altitude drone inspection images with controlled site, target, altitude, camera angle, defect, severity, occlusion, and lighting metadata."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/drone_aerial_inspection.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Drone Aerial Inspection Image Generation Recipe
+
+Generate synthetic low-altitude drone inspection images with controlled
+variation over site type, inspection target, altitude, camera angle, defect or
+event, severity, occlusion, and weather and lighting.
+
+The recipe is intended for infrastructure inspection, property review,
+construction monitoring, disaster-response review, visual QA, reviewer
+calibration, and VLM evaluation. It avoids prompts about surveillance, military
+targeting, evasion, and sensitive facilities.
+
+Prerequisites:
+ - An image-generation provider key for the selected model. The defaults use
+ OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run drone_aerial_inspection.py --num-records 10
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "drone-aerial-inspection-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="drone-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "16:9",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "site_type",
+ [
+ "residential roof and yard",
+ "commercial flat roof",
+ "bridge deck and support structure",
+ "rail corridor and track bed",
+ "solar farm rows",
+ "wind turbine tower and blades",
+ "construction site",
+ "roadway and drainage culvert",
+ "utility pipeline corridor",
+ "storm-affected neighborhood street",
+ ],
+ )
+ add_category(
+ config_builder,
+ "inspection_target",
+ [
+ "roof covering condition",
+ "surface cracking",
+ "standing water",
+ "vegetation encroachment",
+ "panel or blade damage",
+ "debris blocking access",
+ "material staging progress",
+ "erosion around infrastructure",
+ "storm or hail damage",
+ "construction progress milestone",
+ ],
+ )
+ add_category(
+ config_builder,
+ "altitude",
+ [
+ "very low drone pass, about 10 meters above the target",
+ "low drone pass, about 25 meters above the target",
+ "medium drone pass, about 60 meters above the target",
+ "higher overview pass, about 100 meters above the target",
+ ],
+ )
+ add_category(
+ config_builder,
+ "camera_angle",
+ [
+ "straight-down nadir view",
+ "oblique 45-degree inspection angle",
+ "shallow side-looking pass",
+ "close detail view with wide-angle lens",
+ "overview frame with the target centered",
+ ],
+ )
+ add_category(
+ config_builder,
+ "defect_or_event",
+ [
+ "no visible issue, normal baseline condition",
+ "small crack or seam separation",
+ "moderate staining or water pooling",
+ "missing roof shingles or damaged surface panels",
+ "debris scattered across the inspection area",
+ "vegetation growth obscuring part of the asset",
+ "erosion or washout near an edge",
+ "construction material staged in the wrong zone",
+ "storm damage with displaced objects",
+ "surface discoloration that may be benign",
+ ],
+ )
+ add_category(
+ config_builder,
+ "severity",
+ [
+ "none",
+ "minor and easy to miss",
+ "moderate and localized",
+ "severe and clearly visible",
+ ],
+ )
+ add_category(
+ config_builder,
+ "occlusion",
+ [
+ "clear unobstructed view",
+ "partially occluded by tree branches",
+ "partially occluded by shadows",
+ "partially occluded by temporary equipment",
+ "motion blur from drone movement",
+ ],
+ )
+ add_category(
+ config_builder,
+ "weather_lighting",
+ [
+ "bright midday sun",
+ "soft overcast light",
+ "golden hour light with long shadows",
+ "after-rain wet surfaces",
+ "hazy light with reduced contrast",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="drone_inspection_image",
+ prompt=DRONE_AERIAL_INSPECTION_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+DRONE_AERIAL_INSPECTION_PROMPT = """\
+Create a realistic low-altitude drone inspection image.
+
+Image requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Site type: {{ site_type }}
+- Inspection target: {{ inspection_target }}
+- Altitude: {{ altitude }}
+- Camera angle: {{ camera_angle }}
+- Defect or event: {{ defect_or_event }}
+- Severity: {{ severity }}
+- Occlusion: {{ occlusion }}
+- Weather and lighting: {{ weather_lighting }}
+
+Render the image as if captured by a civilian inspection drone, not a satellite
+or normal ground camera. Make the inspection target and requested defect,
+event, baseline condition, or confounder visible enough for visual QA,
+reviewer calibration, or VLM evaluation. Show realistic materials, shadows,
+scale, surfaces, construction context, vegetation, drainage, roof texture,
+panels, tracks, roads, or structural elements when requested.
+
+Render this as a clean raw drone camera frame. Do not include surveillance UI,
+inspection report graphics, HUD elements, map overlays, crosshairs, targeting
+reticles, bounding boxes, segmentation masks, heatmap colors, arrows, callouts,
+measurement graphics, labels, timestamps, coordinates, real place names,
+readable license plates, identifiable people, faces, watermarks, or any text
+overlay. Do not frame it as a military, police, or sensitive-facility image.
+Generate exactly one final drone inspection image for this row. Do not return
+alternate versions, a grid, a pair of examples, before/after panels, or multiple
+frames. Use the visual variation ID only as an internal diversity key; never
+render it as text.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate synthetic drone aerial inspection imagery.")
+ parser.add_argument("--num-records", type=int, default=10, help="Number of drone inspection images to generate.")
+ parser.add_argument("--dataset-name", default="drone-aerial-inspection", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="16:9", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} drone aerial inspection image rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
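In the drone recipe, `defect_or_event` and `severity` are sampled by separate category columns, so a row can pair the "no visible issue" baseline with a non-"none" severity, or the reverse. A minimal sketch of a filter for such rows is shown below. It assumes each row is available as a plain dictionary keyed by column name, and the helper name `is_contradictory` is hypothetical.

```python
# Baseline value copied from the recipe's `defect_or_event` list.
NO_ISSUE = "no visible issue, normal baseline condition"


def is_contradictory(row: dict[str, str]) -> bool:
    """Flag rows where the defect and severity labels disagree.

    A row is consistent when both say "nothing wrong" or both say
    "something wrong"; any mix of the two is flagged.
    """
    no_issue = row["defect_or_event"] == NO_ISSUE
    no_severity = row["severity"] == "none"
    return no_issue != no_severity


if __name__ == "__main__":
    rows = [
        {"defect_or_event": NO_ISSUE, "severity": "none"},
        {"defect_or_event": NO_ISSUE, "severity": "severe and clearly visible"},
        {"defect_or_event": "small crack or seam separation", "severity": "none"},
    ]
    flagged = [row for row in rows if is_contradictory(row)]
    print(len(flagged))  # 2
```

Flagged rows can be dropped or relabeled before the images are used for reviewer calibration, where a mismatched label would be misleading.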
diff --git a/fern/versions/latest/pages/recipes/image_generation/funny_pet_image_edits.mdx b/fern/versions/latest/pages/recipes/image_generation/funny_pet_image_edits.mdx
new file mode 100644
index 000000000..df7734322
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/funny_pet_image_edits.mdx
@@ -0,0 +1,414 @@
+---
+title: "Funny Pet Image Edits"
+description: "Generate synthetic dog and cat photos, then use image-to-image generation to make the same pet scene funnier."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/funny_pet_image_edits.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Funny Pet Image Editing Recipe
+
+Generate a base synthetic dog or cat image, then use image-to-image generation
+to make the same pet scene funnier while preserving the pet's identity.
+
+Use this as a playful example of text-to-image followed by image-to-image:
+generate a controlled reference image, edit it with additional sampled
+conditions, then keep both images and their metadata for visual QA, judge
+development, creative review, demos, and model-capability exploration.
+
+Prerequisites:
+ - An image-generation provider key for a model that supports image-to-image
+ editing through the chat-completions route. The defaults use OpenRouter
+ and Gemini 3.1 Flash Image Preview, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run funny_pet_image_edits.py --num-records 10
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "funny-pet-image-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ """Build an image model config for text-to-image and image-to-image generation."""
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+ """Add a categorical sampler column."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="pet-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "4:3",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ """Build a funny pet text-to-image plus image-to-image pipeline."""
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "pet_type",
+ [
+ "dog",
+ "cat",
+ ],
+ )
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="pet_breed",
+ sampler_type=dd.SamplerType.SUBCATEGORY,
+ params=dd.SubcategorySamplerParams(
+ category="pet_type",
+ values={
+ "dog": [
+ "German shepherd",
+ "golden retriever",
+ "Pembroke Welsh corgi",
+ "French bulldog",
+ "Shih Tzu",
+ "beagle",
+ "border collie",
+ "mixed-breed terrier",
+ ],
+ "cat": [
+ "orange tabby",
+ "black-and-white tuxedo cat",
+ "gray tabby",
+ "calico",
+ "Siamese cat",
+ "Maine Coon",
+ "British shorthair",
+ "domestic longhair",
+ ],
+ },
+ ),
+ )
+ )
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="pet_age",
+ sampler_type=dd.SamplerType.SUBCATEGORY,
+ params=dd.SubcategorySamplerParams(
+ category="pet_type",
+ values={
+ "dog": [
+ "puppy, 6 to 18 months old",
+ "young adult dog, 1 to 4 years old",
+ "adult dog, 4 to 7 years old",
+ "senior dog, 8 years or older",
+ ],
+ "cat": [
+ "kitten, 4 to 12 months old",
+ "young adult cat, 1 to 4 years old",
+ "adult cat, 4 to 10 years old",
+ "senior cat, 11 years or older",
+ ],
+ },
+ ),
+ )
+ )
+ add_category(
+ config_builder,
+ "base_activity",
+ [
+ "sitting proudly at a small table",
+ "peeking over the edge of a sofa",
+ "standing on a kitchen chair",
+ "posing beside a cardboard box",
+ "lounging on a soft rug",
+ "looking directly at the camera with dramatic seriousness",
+ "balanced calmly beside a pile of toys",
+ ],
+ )
+ add_category(
+ config_builder,
+ "base_setting",
+ [
+ "sunny living room",
+ "cozy home office",
+ "tidy kitchen corner",
+ "soft studio backdrop",
+ "laundry room with folded towels",
+ "small apartment balcony with plants",
+ "quiet reading nook",
+ ],
+ )
+ add_category(
+ config_builder,
+ "pet_expression",
+ [
+ "deeply serious expression",
+ "wide-eyed confused expression",
+ "proud little smirk",
+ "sleepy but determined expression",
+ "mildly offended expression",
+ "curious head tilt",
+ ],
+ )
+ add_category(
+ config_builder,
+ "base_photo_style",
+ [
+ "natural phone photo with soft daylight",
+ "clean studio portrait with gentle shadows",
+ "warm editorial pet portrait",
+ "slightly low-angle comedic portrait",
+ "documentary-style candid photo",
+ ],
+ )
+ add_category(
+ config_builder,
+ "comedy_edit_goal",
+ [
+ "stage the pet as a tiny orchestra conductor for squeaky toys",
+ "stage the pet as a very serious chef inspecting a tiny bowl",
+ "stage the pet as a cardboard-spaceship pilot with abstract controls",
+ "stage the pet as a detective following a harmless trail of snack crumbs",
+ "stage the pet as a living-room sports champion with a tiny trophy",
+ "stage the pet as a tiny gardener supervising toy plants",
+ "stage the pet as a blanket-cape superhero in a cozy room",
+ "stage the pet as a toy stage performer under a tiny spotlight",
+ ],
+ )
+ add_category(
+ config_builder,
+ "funny_prop",
+ [
+ "tiny oversized glasses",
+ "miniature necktie",
+ "small chef hat",
+ "toy conductor baton",
+ "miniature trophy with no writing",
+ "blank tiny clipboard with no writing",
+ "cardboard rocket dashboard with colored circles only",
+ "toy magnifying glass",
+ "small paper crown",
+ "tiny blanket cape",
+ ],
+ )
+ add_category(
+ config_builder,
+ "scene_escalation",
+ [
+ "add a neatly arranged set of miniature props around the pet",
+ "add a playful spotlight and dramatic shadows",
+ "add a tiny stage setup made from household objects",
+ "add confetti-like paper shapes on the floor",
+ "add a pretend control panel made only of colored circles and blank buttons",
+ "add an audience of plush toys in the background",
+ "add a whimsical but tidy tabletop set",
+ "add toy vegetables, an empty bowl, and a tiny spoon",
+ "add squeaky toys arranged like an orchestra",
+ ],
+ )
+ add_category(
+ config_builder,
+ "humor_style",
+ [
+ "deadpan absurdity",
+ "cozy wholesome comedy",
+ "overly dramatic tiny-professional energy",
+ "gentle visual slapstick without distress",
+ "storybook-level silliness",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="base_pet_image",
+ prompt=BASE_PET_IMAGE_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="funny_pet_image",
+ prompt=FUNNY_PET_EDIT_PROMPT,
+ model_alias=model_alias,
+ multi_modal_context=[dd.ImageContext(column_name="base_pet_image")],  # edit the generated base image
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+BASE_PET_IMAGE_PROMPT = """\
+Create a realistic synthetic pet photo.
+
+Image requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Pet type: {{ pet_type }}
+- Pet breed: {{ pet_breed }}
+- Pet age: {{ pet_age }}
+- Base activity: {{ base_activity }}
+- Base setting: {{ base_setting }}
+- Pet expression: {{ pet_expression }}
+- Photo style: {{ base_photo_style }}
+
+Show exactly one healthy, comfortable {{ pet_breed }} ({{ pet_age }}). The pet
+should be the clear subject, fully visible enough to edit later, and safely
+posed in a harmless indoor or domestic setting. Use realistic fur, eyes,
+proportions, lighting, shadows, and background details appropriate for the pet
+type, breed, and age. Do not include text overlays, real brand logos,
+watermarks, captions, speech bubbles, unsafe handling, costumes that restrict
+movement, or distressed expressions. Generate exactly one final image for this
+row. Use the visual variation ID only as an internal diversity key; never
+render it as text.
+"""
+
+
+FUNNY_PET_EDIT_PROMPT = """\
+Edit the provided pet image to make the scene funnier while preserving the same pet.
+
+Edit requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Comedy edit goal: {{ comedy_edit_goal }}
+- Funny prop: {{ funny_prop }}
+- Scene escalation: {{ scene_escalation }}
+- Humor style: {{ humor_style }}
+
+Preserve the same pet identity from the reference image: same species, fur
+color, markings, face, body size, expression family, and core pose. Keep the
+pet safe, comfortable, healthy, and not distressed. Add playful props,
+background details, or scene context that make the image funnier, but keep the
+result as one coherent photo-like image rather than a collage.
+
+Do not change the pet into a different animal, add extra pets, add humans,
+add speech bubbles, add readable text, add letters, add numbers, add real brand
+logos, add watermarks, show unsafe handling, show distress, or make the prop
+appear tight, restrictive, or uncomfortable. If papers, signs, screens, labels,
+chalkboards, books, control panels, or trophies appear, they must be blank or
+use abstract colored shapes only. Generate exactly one final edited image for
+this row. Use the visual variation ID only as an internal diversity key; never
+render it as text.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate funny pet image edits.")
+ parser.add_argument("--num-records", type=int, default=10, help="Number of funny pet image rows to generate.")
+ parser.add_argument("--dataset-name", default="funny-pet-image-edits", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="4:3", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} funny pet image-edit rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
diff --git a/fern/versions/latest/pages/recipes/image_generation/humanoid_robot_scene_understanding.mdx b/fern/versions/latest/pages/recipes/image_generation/humanoid_robot_scene_understanding.mdx
new file mode 100644
index 000000000..cdf1e1e24
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/humanoid_robot_scene_understanding.mdx
@@ -0,0 +1,325 @@
+---
+title: "Humanoid Robot Scene Understanding"
+description: "Generate synthetic egocentric humanoid robot scenes with controlled environment, task, object, safety, viewpoint, lighting, and adult human-presence metadata."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/humanoid_robot_scene_understanding.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Humanoid Robot Scene Understanding Image Generation Recipe
+
+Generate synthetic egocentric humanoid robot images with controlled variation
+over indoor environment, robot viewpoint, task goal, object set, scene state,
+safety condition, lighting, and adult human presence.
+
+Use the generated images for embodied-AI scene understanding, visual QA,
+reviewer calibration, safety review, and robotics demos where the image should
+look like a frame captured from the robot's own camera in a controlled setting.
+
+Prerequisites:
+ - An image-generation provider key for the selected model. The defaults use
+ OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run humanoid_robot_scene_understanding.py --num-records 10
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "humanoid-scene-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="humanoid-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "16:9",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "environment",
+ [
+ "teaching kitchen with counters, cabinets, and everyday objects",
+ "mock apartment living room arranged for assistive robotics",
+ "assisted living bedroom with bedside table and mobility aids",
+ "robotics lab workbench with tools and calibration objects",
+ "retail stockroom with shelves, totes, and handheld items",
+ "hospital supply room with carts, bins, and sealed supplies",
+ "office break room with appliances, tableware, and waste bins",
+ "laundry room with baskets, detergent, shelves, and folded towels",
+ "tool bench training area with bins, fasteners, and hand tools",
+ "grocery practice aisle with shelves, baskets, and fallen items",
+ ],
+ )
+ add_category(
+ config_builder,
+ "robot_viewpoint",
+ [
+ "head-mounted camera at standing adult height",
+ "chest-mounted camera with both robot hands barely visible at the bottom edge",
+ "slightly downward gaze toward a tabletop work surface",
+ "close manipulation view with one robot hand near the target object",
+ "wide room scan from a doorway before entering the scene",
+ "low crouched inspection angle looking under a table or cart",
+ ],
+ )
+ add_category(
+ config_builder,
+ "task_goal",
+ [
+ "locate the requested object before reaching",
+ "judge whether the path is safe to walk through",
+ "identify which objects are reachable from the current pose",
+ "verify that a cleanup task is complete",
+ "prepare a clear handoff area for an adult user",
+ "find the missing tool or supply item",
+ "inspect a spill or obstacle before moving closer",
+ "decide whether fragile items are too close to an edge",
+ ],
+ )
+ add_category(
+ config_builder,
+ "object_set",
+ [
+ "mug, kettle, sponge, dish towel, and cereal bowl",
+ "water glass, medication organizer, tissue box, and walking cane",
+ "pipette rack, beaker, nitrile gloves, and small screwdriver",
+ "barcode scanner, tote, tape dispenser, folded shirt, and box cutter",
+ "laundry basket, detergent bottle, folded towels, and loose sock",
+ "pliers, hex keys, small bolts, tape measure, and plastic bins",
+ "shopping basket, cereal boxes, soup cans, and fallen fruit",
+ "meal tray, sealed supplies, clipboard, and rolling cart",
+ ],
+ )
+ add_category(
+ config_builder,
+ "scene_state",
+ [
+ "organized and ready for the task",
+ "moderately cluttered but navigable",
+ "target object partly occluded by other items",
+ "target object moved to an unexpected location",
+ "container open with mixed contents visible",
+ "fragile item near the table edge",
+ "object stack unstable but still standing",
+ "task area partly blocked by a chair or cart",
+ ],
+ )
+ add_category(
+ config_builder,
+ "safety_condition",
+ [
+ "no visible hazard",
+ "small liquid spill on the floor",
+ "power cable crossing the walking path",
+ "sharp tool exposed on the work surface",
+ "hot appliance indicator light visible",
+ "glass object on the floor near the path",
+ "drawer left open at knee height",
+ "rolling cart partially blocking the doorway",
+ ],
+ )
+ add_category(
+ config_builder,
+ "human_presence",
+ [
+ "no person visible",
+ "adult worker's gloved hands visible at a safe distance",
+ "adult caregiver standing in the background with face turned away",
+ "adult shopper passing through the background, not identifiable",
+ "adult lab worker partially visible from shoulders down",
+ "adult office worker's arm visible near the handoff area",
+ ],
+ )
+ add_category(
+ config_builder,
+ "lighting",
+ [
+ "bright even lab lighting",
+ "warm apartment lighting",
+ "overcast window light",
+ "mixed overhead and task lighting",
+ "dim hallway light with localized task lamp",
+ "high-contrast backlighting from a nearby window",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="humanoid_scene_image",
+ prompt=HUMANOID_SCENE_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+HUMANOID_SCENE_PROMPT = """\
+Create a realistic egocentric humanoid robot scene-understanding image.
+
+The frame must look like it was captured from the humanoid robot's own camera
+inside a controlled indoor environment. Show the robot's viewpoint clearly:
+camera height, reachable workspace, path geometry, task-relevant objects,
+obstacles, and safety condition should all be visible enough for visual QA or
+embodied-AI scene understanding. If the viewpoint mentions robot hands, show at
+most two simple robot hands at the image edge; do not make the robot the
+main subject.
+
+Scene requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Environment: {{ environment }}
+- Robot viewpoint: {{ robot_viewpoint }}
+- Task goal: {{ task_goal }}
+- Object set: {{ object_set }}
+- Scene state: {{ scene_state }}
+- Safety condition: {{ safety_condition }}
+- Human presence: {{ human_presence }}
+- Lighting: {{ lighting }}
+
+Make the requested task goal, object set, scene state, and safety condition
+visually legible without adding labels or annotation graphics. Use realistic
+materials, clutter, occlusion, reachability cues, shadows, and indoor scale.
+
+Do not include children, identifiable faces, readable personal names, real
+company logos, surveillance UI, bounding boxes, arrows, captions, labels,
+watermarks, subtitles, HUD overlays, or diagnostic text. Generate exactly one
+final camera frame for this row. Do not return alternate versions, a grid, a
+pair of examples, before/after panels, or multiple frames. Use the visual
+variation ID only as an internal diversity key; never render it as text.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate synthetic humanoid robot scene-understanding images.")
+ parser.add_argument("--num-records", type=int, default=10, help="Number of humanoid scene images to generate.")
+ parser.add_argument("--dataset-name", default="humanoid-robot-scene-understanding", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="16:9", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} humanoid robot scene-understanding rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
diff --git a/fern/versions/latest/pages/recipes/image_generation/medical_extremity_xrays.mdx b/fern/versions/latest/pages/recipes/image_generation/medical_extremity_xrays.mdx
new file mode 100644
index 000000000..f18b7b60c
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/medical_extremity_xrays.mdx
@@ -0,0 +1,392 @@
+---
+title: "Synthetic Extremity X-rays"
+description: "Generate research-only synthetic extremity X-ray style images with controlled anatomy, view, acquisition, and finding metadata."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/medical_extremity_xrays.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Synthetic Extremity X-ray Image Generation Recipe
+
+Generate synthetic extremity X-ray style images with controlled variation over
+anatomical region, view, imaging context, technical quality, and musculoskeletal
+findings.
+
+Medical disclaimer:
+ These generated images are synthetic and intended only for AI research,
+ education, data-pipeline prototyping, and evaluation workflows. They are not
+ real medical images and must not be used for diagnosis, treatment planning,
+ clinical decision-making, or as a substitute for real clinical validation.
+
+Prerequisites:
+ - An image-generation provider key for the selected model. The defaults use
+ OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run medical_extremity_xrays.py --num-records 5
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "medical-image-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ """Build the image-generation model config; extra_body carries provider-specific image settings."""
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+ """Add a categorical sampler column."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="xray-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "1:1",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ """Build a synthetic extremity X-ray image-generation pipeline."""
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "patient_age_group",
+ [
+ "young adult",
+ "adult",
+ "middle-aged adult",
+ "older adult",
+ "geriatric adult",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "patient_sex",
+ [
+ "female",
+ "male",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "body_habitus",
+ [
+ "thin build",
+ "athletic build",
+ "average build",
+ "overweight build",
+ "obese build",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "anatomical_region",
+ [
+ "right shoulder",
+ "left shoulder",
+ "right humerus",
+ "left humerus",
+ "right elbow",
+ "left elbow",
+ "right forearm with radius and ulna",
+ "left forearm with radius and ulna",
+ "right wrist",
+ "left wrist",
+ "right hand and fingers",
+ "left hand and fingers",
+ "right hip",
+ "left hip",
+ "right femur",
+ "left femur",
+ "right knee",
+ "left knee",
+ "right tibia and fibula",
+ "left tibia and fibula",
+ "right ankle",
+ "left ankle",
+ "right foot and toes",
+ "left foot and toes",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "equipment_type",
+ [
+ "fixed radiography unit",
+ "portable X-ray machine",
+ "digital radiography system",
+ "computed radiography system",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "imaging_context",
+ [
+ "emergency department acute trauma",
+ "emergency department fall injury",
+ "emergency department sports injury",
+ "orthopedic clinic routine follow-up",
+ "post-operative hardware check",
+ "pre-operative planning",
+ "urgent care pain evaluation",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "xray_view",
+ [
+ "anteroposterior (AP)",
+ "lateral",
+ "oblique internal rotation",
+ "oblique external rotation",
+ "weight-bearing AP",
+ "stress view",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "exposure_quality",
+ [
+ "underexposed with poorly defined cortical margins",
+ "optimal exposure with clear cortical and trabecular detail",
+ "overexposed with washed-out bone detail",
+ "low kVp technique with high bone contrast",
+ "high kVp technique with better soft tissue visualization",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "positioning",
+ [
+ "well-positioned true AP or lateral",
+ "slightly rotated",
+ "oblique positioning",
+ "splint or cast in place",
+ "traction device visible",
+ "suboptimal because the patient could not cooperate due to pain",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "primary_finding",
+ [
+ "normal with no acute osseous abnormality",
+ "nondisplaced fracture through the imaged bone",
+ "displaced fracture through the imaged bone",
+ "comminuted fracture involving the imaged bone",
+ "stress fracture line in the imaged bone",
+ "joint dislocation or subluxation in the imaged region",
+ "degenerative osteoarthritis in the imaged joint",
+ "suspected osteomyelitis with focal cortical destruction",
+ "soft tissue swelling with no acute fracture identified",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "secondary_findings",
+ [
+ "none",
+ "osteopenia",
+ "degenerative joint changes at adjacent joints",
+ "old healed fracture with callus formation",
+ "orthopedic plate and screws",
+ "intramedullary nail",
+ "joint effusion",
+ "soft tissue calcifications",
+ "vascular calcifications",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "image_quality",
+ [
+ "excellent, with sharp cortical margins and a clear trabecular pattern",
+ "good, with adequate visualization of all bony structures",
+ "fair with mild motion artifact",
+ "fair with mild noise or graininess",
+ "fair with cast or splint partially obscuring detail",
+ "limited by portable technique",
+ "limited by patient body habitus",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="extremity_xray",
+ prompt=EXTREMITY_XRAY_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+EXTREMITY_XRAY_PROMPT = """\
+Create a synthetic research-only grayscale X-ray style radiograph of the
+{{ anatomical_region }}, {{ xray_view }} view.
+
+Patient and acquisition context:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Patient age group: {{ patient_age_group }}
+- Patient sex: {{ patient_sex }}
+- Body habitus: {{ body_habitus }}
+- Equipment: {{ equipment_type }}
+- Context: {{ imaging_context }}
+- Technical quality: {{ exposure_quality }}
+- Positioning: {{ positioning }}
+- Image quality: {{ image_quality }}
+
+Findings to depict:
+- Primary finding: {{ primary_finding }}
+- Secondary findings: {{ secondary_findings }}
+
+Use a realistic educational radiograph style with visible bones, joints, cortex,
+trabecular pattern, and soft-tissue silhouette. Include standard left/right
+markers where appropriate. Make the image look synthetic but useful for AI
+research and data-pipeline prototyping. Do not include real patient names, real
+medical record numbers, hospital logos, or any real protected health information.
+Generate exactly one final radiograph for this row. Do not return alternate
+versions, a two-view panel, a grid, a before/after image, duplicated views, or
+multiple image candidates. Use the visual variation ID only as an internal
+diversity key for anatomy framing, rotation, exposure texture, and soft-tissue
+background; never render it as text. Do not add diagnostic captions or
+explanatory text overlays.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate synthetic extremity X-ray style images.")
+ parser.add_argument("--num-records", type=int, default=5, help="Number of synthetic X-ray images to generate.")
+ parser.add_argument("--dataset-name", default="synthetic-extremity-xrays", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="1:1", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} synthetic extremity X-ray rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
diff --git a/fern/versions/latest/pages/recipes/image_generation/product_image_variations.mdx b/fern/versions/latest/pages/recipes/image_generation/product_image_variations.mdx
new file mode 100644
index 000000000..d99dc1fba
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/product_image_variations.mdx
@@ -0,0 +1,462 @@
+---
+title: "Product Image Variations"
+description: "Use image-to-image generation to create inclusive apparel catalog variants while preserving garment identity."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/product_image_variations.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Product Image Variation Recipe
+
+Generate a base apparel-on-person catalog image, then create inclusive fashion
+catalog variations with an image-to-image model through ImageContext. Use this pattern
+for e-commerce apparel variants, fit and styling coverage, marketplace
+thumbnails, lookbook imagery, and creative QA workflows.
+
+For real product images, replace the `base_product_image` generation column with
+a seed dataset column containing your product image paths, URLs, or base64 data,
+then point `ImageContext(column_name=...)` at that seed column.
+
+Prerequisites:
+ - An image-generation provider key for a model that supports image-to-image
+ editing through the chat-completions route. The defaults use OpenRouter
+ and Gemini 3.1 Flash Image Preview, so set OPENROUTER_API_KEY before running.
+
+Run:
+ uv run product_image_variations.py --num-records 5
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "product-image-model"
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+ """Build an image model config for text-to-image and image-to-image generation."""
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(
+ config_builder: dd.DataDesignerConfigBuilder,
+ name: str,
+ values: list[str],
+ weights: list[float] | None = None,
+) -> None:
+ """Add a categorical sampler column."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values, weights=weights),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="apparel-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "3:4",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ """Build an apparel product image variation pipeline."""
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "apparel_item",
+ [
+ "organic cotton crewneck t-shirt",
+ "lightweight denim jacket",
+ "water-resistant rain jacket",
+ "relaxed-fit hoodie",
+ "wide-leg linen trousers",
+ "ribbed knit cardigan",
+ "quilted puffer vest",
+ "stretch woven workwear shirt",
+ "ankle-length everyday dress",
+ "adaptive zip-front jacket",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "base_colorway",
+ [
+ "matte black",
+ "warm white",
+ "sage green",
+ "deep navy",
+ "brushed silver",
+ "terracotta",
+ "soft lavender",
+ "charcoal gray",
+ "sunflower yellow",
+ "denim blue",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "base_view",
+ [
+ "front-facing standing full-body catalog photo with one synthetic adult model",
+ "three-quarter standing full-body catalog photo with one synthetic adult model",
+ "side-angle standing full-body catalog photo with one synthetic adult model",
+ "walking-pose full-body catalog photo with one synthetic adult model",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "base_model_profile",
+ [
+ "young adult Black model with an athletic build",
+ "middle-aged East Asian model with an average build",
+ "older Latine model with a petite build",
+ "young adult South Asian model with a plus-size build",
+ "middle-aged Middle Eastern model with a tall build",
+ "young adult Indigenous model with a broad-shouldered build",
+ "older White model with a curvy build",
+ "young adult multiracial model with a slender build",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "variation_goal",
+ [
+ "inclusive e-commerce catalog image on a clean white background",
+ "lifestyle lookbook image in an everyday urban setting",
+ "fit-guide image showing garment drape and silhouette",
+ "seasonal campaign image for cool-weather layering",
+ "adaptive-fashion catalog image emphasizing ease of wear",
+ "single-model adult age-inclusive catalog image",
+ "social media campaign image with bold colored backdrop",
+ "editorial fashion image with soft premium lighting",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "edit_scene_delta",
+ [
+ "move from a neutral studio catalog reference into an outdoor urban lifestyle scene",
+ "move from a neutral studio catalog reference into a warm home entryway lookbook scene",
+ "move from a neutral studio catalog reference into a bold color-block campaign set",
+ "move from a neutral studio catalog reference into a smart-casual workplace corridor scene",
+ "move from a neutral studio catalog reference into a weekend park lookbook scene",
+ "move from a neutral studio catalog reference into a premium editorial studio set with draped fabric",
+ "move from a neutral studio catalog reference into a clean fit-guide scene with a new full-body pose",
+ "move from a neutral studio catalog reference into a seasonal layering scene with visible outerwear styling",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "model_age_group",
+ [
+ "young adult model",
+ "middle-aged adult model",
+ "older adult model",
+ "senior adult model",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "model_ethnicity",
+ [
+ "Black or African diaspora model",
+ "East Asian model",
+ "South Asian model",
+ "Latine model",
+ "Middle Eastern or North African model",
+ "Indigenous model",
+ "Pacific Islander model",
+ "White or European model",
+ "multiracial model",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "body_type",
+ [
+ "petite build",
+ "tall build",
+ "plus-size build",
+ "athletic build",
+ "broad-shouldered build",
+ "curvy build",
+ "slender build",
+ "average build",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "accessibility_context",
+ [
+ "standing model without visible mobility aids",
+ "model with no specific accessibility cue",
+ "standing model in a relaxed catalog pose",
+ "model walking naturally without visible mobility aids",
+ "model seated on a simple studio stool",
+ "model leaning lightly against a studio block",
+ "model holding a small neutral accessory",
+ "seated model using a wheelchair",
+ "model with a visible prosthetic limb",
+ "model using forearm crutches",
+ ],
+ weights=[1.6, 1.6, 1.4, 1.4, 0.9, 0.9, 0.9, 0.2, 0.2, 0.2],
+ )
+
+ add_category(
+ config_builder,
+ "styling_context",
+ [
+ "pure white seamless catalog background",
+ "soft neutral studio backdrop",
+ "outdoor morning city street",
+ "modern home entryway",
+ "adult campus casual setting",
+ "workplace smart-casual setting",
+ "weekend park setting",
+ "minimal geometric studio set",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "composition",
+ [
+ "front-facing full-body catalog pose with the entire person visible",
+ "three-quarter full-body pose with the entire person visible",
+ "single seated full-body pose showing garment fit with the whole body visible",
+ "single walking full-body pose with natural garment movement",
+ "side-angle full-body pose with clear garment silhouette",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "lighting",
+ [
+ "softbox studio lighting",
+ "natural window light",
+ "bright catalog lighting",
+ "warm golden-hour lighting",
+ "soft overcast outdoor light",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="base_product_image",
+ prompt=BASE_PRODUCT_IMAGE_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="product_variant_image",
+ prompt=PRODUCT_VARIATION_PROMPT,
+ model_alias=model_alias,
+ multi_modal_context=[dd.ImageContext(column_name="base_product_image")],
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+BASE_PRODUCT_IMAGE_PROMPT = """\
+Create a synthetic apparel catalog reference photo of a person wearing a {{ base_colorway }} {{ apparel_item }}.
+
+Image requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- View: {{ base_view }}
+- Base model profile: {{ base_model_profile }}
+- Background: clean neutral studio background
+- Lighting: soft catalog lighting
+- Use exactly one synthetic adult model in neutral catalog styling.
+- Output must use vertical portrait framing, with a 3:4 portrait composition that is taller than it is wide.
+- The garment should be centered, worn naturally, fully visible, and isolated enough to edit later.
+- The frame must be a full-body image: show the model from head to toe with feet visible and comfortable margins around the body.
+- Show fabric texture, seams, silhouette, cuffs, closures, pockets, fit, drape, and other garment details when relevant.
+- Keep the model presentation neutral, fully clothed, non-sexualized, and commercially appropriate.
+- Do not include extra people, duplicate bodies, mannequins, cropped bodies, close-up crops, landscape frames, square frames, real brand logos, real trademarks, watermarks, price tags, text overlays, celebrity likenesses, or real people.
+- Use a plausible invented garment design with consistent shape, color, fit, and material details.
+- Generate exactly one final image for this row. Do not return alternate versions, a grid, a pair of examples, a before/after image, or multiple panels. Use the visual variation ID only as an internal diversity key; never render it as text.
+"""
+
+
+PRODUCT_VARIATION_PROMPT = """\
+Edit the provided apparel-on-person catalog image into a new inclusive commercial fashion image.
+
+Variation requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Variation goal: {{ variation_goal }}
+- Required edit delta: {{ edit_scene_delta }}
+- Model age group: {{ model_age_group }}
+- Model ethnicity: {{ model_ethnicity }}
+- Body type: {{ body_type }}
+- Accessibility context: {{ accessibility_context }}
+- Styling context: {{ styling_context }}
+- Composition: {{ composition }}
+- Lighting: {{ lighting }}
+
+Use the provided image as a garment reference, not as a full-shot template.
+Preserve the garment's core identity from the reference image: same apparel
+item, same primary colorway, same fabric cues, same silhouette, same fit
+behavior, and same distinctive design details. Do not preserve the original
+person, original face, original hair, original stance, original camera angle,
+or original plain studio background unless that exact choice is requested by
+the sampled controls.
+
+Create a visibly different commercial variant. The final image should make at
+least three of these obvious changes from the reference photo: a different synthetic
+adult model, a different full-body pose or body orientation, a different
+setting or background, and a different lighting or campaign style. Change the
+person wearing the garment to match the requested synthetic adult model
+ethnicity, age group, body type, and accessibility context. Represent the
+adult model respectfully and without stereotypes. Every generated person must
+clearly be 18 or older.
+
+The edited output must show exactly one person. The final image must be a
+vertical 3:4 portrait full-body catalog image that is taller than it is wide:
+show the full head-to-toe body with feet visible, or the full seated body and
+full mobility aid when the accessibility context calls for one. Do not crop at
+the face, waist, knees, ankles, hands, or garment hem. Do not create landscape
+frames, square frames, group shots, mirrored duplicates, before/after
+composites, multiple models, mannequins, or background bystanders.
+
+Follow the required edit delta when changing the surrounding scene, styling,
+pose, background, and lighting. Keep the result realistic and commercially
+usable. Do not add real brand logos, real trademarks, watermarks, price tags,
+text overlays, sexualized styling, swimwear, underwear, lingerie, sheer
+clothing, or revealing poses.
+Generate exactly one final edited image for this row. Do not return alternate
+versions, a grid, a pair of examples, a before/after image, or multiple panels.
+Use the visual variation ID only as an internal diversity key; never render it
+as text.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate product image variations with image-to-image editing.")
+ parser.add_argument("--num-records", type=int, default=5, help="Number of product variation rows to generate.")
+ parser.add_argument("--dataset-name", default="product-image-variations", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+    parser.add_argument("--aspect-ratio", default="3:4", help="Provider-specific aspect ratio; prompts assume 3:4.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} product variation rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+ main()
+```
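Reviewer sketch for `accessibility_context` in `product_image_variations.py`: the weights passed to `add_category` sum to 9.3 rather than 1, so the effective share of each option depends on normalization. The snippet below assumes the category sampler divides each weight by the total, which this diff does not show. `normalized_probabilities` is a hypothetical helper written for illustration, not part of Data Designer.

```python
# Sketch only: effective sampling shares for the weighted category in the recipe.
# Assumption: the sampler normalizes weights by dividing each one by their sum.
ACCESSIBILITY_VALUES = [
    "standing model without visible mobility aids",
    "model with no specific accessibility cue",
    "standing model in a relaxed catalog pose",
    "model walking naturally without visible mobility aids",
    "model seated on a simple studio stool",
    "model leaning lightly against a studio block",
    "model holding a small neutral accessory",
    "seated model using a wheelchair",
    "model with a visible prosthetic limb",
    "model using forearm crutches",
]
ACCESSIBILITY_WEIGHTS = [1.6, 1.6, 1.4, 1.4, 0.9, 0.9, 0.9, 0.2, 0.2, 0.2]


def normalized_probabilities(values: list[str], weights: list[float]) -> dict[str, float]:
    """Map each value to weight / sum(weights). Hypothetical helper."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {value: weight / total for value, weight in zip(values, weights)}


probabilities = normalized_probabilities(ACCESSIBILITY_VALUES, ACCESSIBILITY_WEIGHTS)
for value, probability in probabilities.items():
    print(f"{probability:6.1%}  {value}")
```

Under that assumption, each of the three mobility-aid options lands at roughly 2% of rows, so a default five-row run will often contain none of them. Raising `--num-records` or the three `0.2` weights would be the way to see those cases in a small review set.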
diff --git a/fern/versions/latest/pages/recipes/image_generation/rich_document_images.mdx b/fern/versions/latest/pages/recipes/image_generation/rich_document_images.mdx
new file mode 100644
index 000000000..94f8a58c6
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/rich_document_images.mdx
@@ -0,0 +1,456 @@
+---
+title: "Rich Document Image Generation"
+description: "Generate synthetic business-document page images with controlled visual variation for VQA, OCR, and multimodal evaluation workflows."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/rich_document_images.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# "pandas",
+# "pyarrow",
+# ]
+# ///
+"""Rich Document Image Generation Recipe
+
+Generate synthetic business-document page images with controlled variation.
+Each generated row pairs an image with the metadata that produced it, making
+the output useful as seed data for visual QA, OCR robustness, multimodal
+judging, and document-understanding experiments.
+
+Prerequisites:
+ - An image-generation provider key for the selected model. The defaults use
+ OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+ # Generate 5 rich document images with the default OpenRouter model.
+ uv run rich_document_images.py --num-records 5
+
+    # Export a VQA-ready seed parquet with base64 PNGs plus the seed metadata columns.
+ uv run rich_document_images.py --num-records 25 --export-seed rich_document_seed.parquet
+
+ # Use a different provider or image model.
+ uv run rich_document_images.py --model-provider openrouter --model-id google/gemini-3.1-flash-image-preview
+"""
+
+from __future__ import annotations
+
+import argparse
+import base64
+from collections.abc import Sequence
+from pathlib import Path
+
+import pandas as pd
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "document-generation-model"
+
+SEED_METADATA_COLUMNS = [
+ "document_type",
+ "primary_visual",
+ "secondary_visual",
+ "layout_style",
+ "document_condition",
+]
+
+
+def build_model_configs(
+ *,
+ model_provider: str,
+ model_id: str,
+ model_alias: str,
+ image_size: str,
+ aspect_ratio: str,
+ max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+    """Build an image-generation model config; extra_body carries provider-specific image settings."""
+ return [
+ dd.ModelConfig(
+ alias=model_alias,
+ model=model_id,
+ provider=model_provider,
+ inference_parameters=dd.ImageInferenceParams(
+ extra_body={
+ "n": 1,
+ "generationConfig": {
+ "imageConfig": {
+ "aspectRatio": aspect_ratio,
+ "imageSize": image_size,
+ }
+ },
+ },
+ max_parallel_requests=max_parallel_requests,
+ ),
+ skip_health_check=True,
+ )
+ ]
+
+
+def add_category(
+ config_builder: dd.DataDesignerConfigBuilder,
+ name: str,
+ values: list[str],
+ weights: list[float] | None = None,
+) -> None:
+ """Add a categorical sampler column."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name=name,
+ sampler_type=dd.SamplerType.CATEGORY,
+ params=dd.CategorySamplerParams(values=values, weights=weights),
+ )
+ )
+
+
+def add_visual_variation_id(config_builder: dd.DataDesignerConfigBuilder) -> None:
+ """Add a unique row-level key that discourages duplicate image generations."""
+ config_builder.add_column(
+ dd.SamplerColumnConfig(
+ name="visual_variation_id",
+ sampler_type=dd.SamplerType.UUID,
+ params=dd.UUIDSamplerParams(prefix="doc-", short_form=True),
+ )
+ )
+
+
+def build_config(
+ *,
+ model_provider: str = DEFAULT_MODEL_PROVIDER,
+ model_id: str = DEFAULT_MODEL_ID,
+ model_alias: str = DEFAULT_MODEL_ALIAS,
+ image_size: str = "1024",
+ aspect_ratio: str = "2:3",
+ max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+ """Build a rich document image-generation pipeline."""
+ model_configs = build_model_configs(
+ model_provider=model_provider,
+ model_id=model_id,
+ model_alias=model_alias,
+ image_size=image_size,
+ aspect_ratio=aspect_ratio,
+ max_parallel_requests=max_parallel_requests,
+ )
+ config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+ add_visual_variation_id(config_builder)
+
+ add_category(
+ config_builder,
+ "document_type",
+ [
+ "quarterly business review",
+ "market research brief",
+ "operations dashboard export",
+ "clinical trial status report",
+ "sustainability impact report",
+ "financial variance memo",
+ "customer support incident review",
+ "supply chain risk assessment",
+ "product launch readiness plan",
+ "employee engagement summary",
+ ],
+ weights=[0.12, 0.10, 0.14, 0.08, 0.08, 0.12, 0.12, 0.10, 0.12, 0.12],
+ )
+
+ add_category(
+ config_builder,
+ "organization_name",
+ [
+ "Aster Analytics",
+ "Blue Ridge Health",
+ "CedarWorks Manufacturing",
+ "DeltaGrid Energy",
+ "Evergreen Mobility",
+ "Harborlight Retail",
+ "Northstar Robotics",
+ "Redwood BioSystems",
+ "Summit Cloud Services",
+ "Valley Forge Logistics",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "document_owner",
+ [
+ "Maya Chen",
+ "Jonas Patel",
+ "Elena Garcia",
+ "Noah Williams",
+ "Amara Okafor",
+ "Theo Martin",
+ "Priya Raman",
+ "Sofia Rossi",
+ "Lena Fischer",
+ "Caleb Brooks",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "owner_role",
+ [
+ "VP Operations",
+ "Finance Director",
+ "Clinical Program Manager",
+ "Customer Success Lead",
+ "Risk Officer",
+ "Product Launch Owner",
+ "People Analytics Partner",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "audience",
+ [
+ "executive leadership",
+ "finance review committee",
+ "field operations managers",
+ "clinical program leads",
+ "board audit committee",
+ "customer success directors",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "content_theme",
+ [
+ "quarterly revenue performance and forecast variance",
+ "regional customer adoption and churn risk",
+ "service-level agreement compliance and incident aging",
+ "inventory throughput, backorders, and supplier delays",
+ "trial enrollment, site activation, and adverse event counts",
+ "energy consumption, emissions, and sustainability targets",
+ "hiring funnel conversion, offer acceptance, and attrition",
+ "product launch milestones, owners, and readiness status",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "primary_visual",
+ [
+ "clustered bar chart comparing three regions across four quarters",
+ "line chart with two series, annotated inflection points, and a target band",
+ "stacked area chart showing category mix over six months",
+ "waterfall chart showing contributors to budget variance",
+ "scatter plot with labeled outliers and a trend line",
+ "Gantt-style timeline with milestones and owner initials",
+ "heatmap matrix with risk severity by team and region",
+ "donut chart with callout labels and percentages",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "secondary_visual",
+ [
+ "dense financial table with subtotals and variance arrows",
+ "KPI card row with current value, target, delta, and traffic-light status",
+ "two-column risk register with owner, due date, and mitigation note",
+ "small process diagram with arrows between four labeled stages",
+ "ranked list table with sparklines in the final column",
+ "compact map inset with region labels and numeric badges",
+ "executive callout box with three bullet conclusions",
+ "signature block plus approval checklist",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "layout_style",
+ [
+ "clean consulting report page with narrow margins and section dividers",
+ "dashboard export with a top filter bar and grid of panels",
+ "formal memo with letterhead, dense paragraphs, and one embedded chart",
+ "board-pack page with title ribbon, footnotes, and small-print source notes",
+ "compliance form with checkboxes, tables, and stamped approval",
+ "research brief with abstract, sidebar definitions, and figure captions",
+ "operations one-pager with color-coded status chips and action table",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "document_condition",
+ [
+ "pristine exported PDF screenshot",
+ "high-resolution office scanner output",
+ "faded photocopy with mild paper texture",
+ "creased printout with a clipped corner",
+ "low-contrast scan with light shadow near the binding edge",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "annotation_layer",
+ [
+ "no manual annotations",
+ "yellow highlights over two key numbers",
+ "red pen circle around one chart outlier",
+ "blue sticky note partially covering the lower right table",
+ "handwritten margin note asking for follow-up",
+ "rubber stamp reading DRAFT across the header",
+ ],
+ )
+
+ add_category(
+ config_builder,
+ "numeric_context",
+ [
+ "include values in thousands with one decimal place",
+ "include percentages, basis-point deltas, and small footnotes",
+ "include dates across the next six months",
+ "include currency values, totals, and year-over-year deltas",
+ "include counts by region plus a total row",
+ ],
+ )
+
+ config_builder.add_column(
+ dd.ImageColumnConfig(
+ name="document_image",
+ prompt=RICH_DOCUMENT_IMAGE_PROMPT,
+ model_alias=model_alias,
+ )
+ )
+
+ return config_builder
+
+
+def create_dataset(
+ config_builder: dd.DataDesignerConfigBuilder,
+ *,
+ num_records: int,
+ dataset_name: str,
+ artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+ data_designer = DataDesigner(artifact_path=artifact_path)
+ data_designer.validate(config_builder)
+ return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+def export_seed_parquet(results: DatasetCreationResults, output_path: Path) -> None:
+ """Export generated images as base64 PNG seed rows for VLM pipelines."""
+ dataset = results.load_dataset()
+ base_path = results.artifact_storage.base_dataset_path
+ rows: list[dict[str, str]] = []
+
+ for _, row in dataset.iterrows():
+ image_ref = _first_image_ref(row["document_image"])
+ image_path = base_path / image_ref
+ output_row = {
+ "png_base64": base64.b64encode(image_path.read_bytes()).decode("utf-8"),
+ }
+ output_row.update({column: row[column] for column in SEED_METADATA_COLUMNS})
+ rows.append(output_row)
+
+ output_path.parent.mkdir(parents=True, exist_ok=True)
+ pd.DataFrame(rows).to_parquet(output_path, index=False)
+
+
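+def _first_image_ref(value: object) -> str:
+    if isinstance(value, str):
+        return value
+    # List columns read from parquet may arrive as numpy arrays, which are not Sequence instances.
+    items = value.tolist() if hasattr(value, "tolist") else value
+    if isinstance(items, Sequence) and len(items) > 0 and isinstance(items[0], str):
+        return items[0]
+    raise ValueError(f"Expected document_image to be a string path or non-empty sequence, got {type(value)!r}")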
+
+
+RICH_DOCUMENT_IMAGE_PROMPT = """\
+Create a realistic single-page business document image with rich visual information.
+
+Document requirements:
+- Visual variation ID, for internal diversity only: {{ visual_variation_id }}
+- Document type: {{ document_type }}
+- Organization: {{ organization_name }}
+- Document owner: {{ document_owner }}, {{ owner_role }}
+- Intended audience: {{ audience }}
+- Theme: {{ content_theme }}
+- Layout style: {{ layout_style }}
+- Physical/rendering condition: {{ document_condition }}
+- Annotation layer: {{ annotation_layer }}
+- Numeric style: {{ numeric_context }}
+
+Required visual content:
+- Primary visual: {{ primary_visual }}
+- Secondary visual: {{ secondary_visual }}
+- At least one readable table with row and column labels
+- At least one chart, timeline, heatmap, diagram, or KPI-card cluster
+- A clear title, date, organization name, document owner, section headings, and small source note
+- Enough readable text to ask visual QA questions about exact values, trends, labels, owners, dates, and relationships
+
+Make the page visually dense but professionally designed. Use realistic fonts,
+alignment, legends, axis labels, table borders, captions, and spacing. The text
+and numbers should be legible. Avoid blank areas, generic placeholder blocks,
+or lorem ipsum. Generate exactly one final document page for this row. Do not
+return alternate versions, a grid, a pair of examples, before/after panels, or
+multiple pages. Use the visual variation ID only as an internal diversity key;
+never render it as text. Do not include real company logos or real personal
+data.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Generate rich synthetic business-document images.")
+ parser.add_argument("--num-records", type=int, default=5, help="Number of document images to generate.")
+ parser.add_argument("--dataset-name", default="rich-document-images", help="Output dataset name.")
+ parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+ parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+ parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+ parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+ parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+ parser.add_argument("--aspect-ratio", default="2:3", help="Provider-specific aspect ratio value.")
+ parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+ parser.add_argument(
+ "--export-seed",
+ type=Path,
+ default=None,
+        help="Optional parquet path for a VQA-ready seed with base64 PNGs and the seed metadata columns.",
+ )
+ return parser.parse_args()
+
+
+def main() -> None:
+ args = parse_args()
+ config_builder = build_config(
+ model_provider=args.model_provider,
+ model_id=args.model_id,
+ model_alias=args.model_alias,
+ image_size=args.image_size,
+ aspect_ratio=args.aspect_ratio,
+ max_parallel_requests=args.max_parallel_requests,
+ )
+ results = create_dataset(
+ config_builder,
+ num_records=args.num_records,
+ dataset_name=args.dataset_name,
+ artifact_path=args.artifact_path,
+ )
+
+ dataset = results.load_dataset()
+ print(f"Generated {len(dataset)} rich document image rows.")
+ print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+ if args.export_seed is not None:
+ export_seed_parquet(results, args.export_seed)
+ print(f"Exported VQA seed parquet: {args.export_seed}")
+
+
+if __name__ == "__main__":
+ main()
+```
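Consumer-side sketch for the `--export-seed` parquet in `rich_document_images.py`: `export_seed_parquet` base64-encodes whatever bytes are on disk and names the column `png_base64` without checking the image format. A downstream reader might decode and verify a row as below. `decode_png_base64` and the stand-in row are hypothetical and exist only for illustration; a real row would come from reading the exported parquet.

```python
# Sketch only: decode one `png_base64` value and confirm it really is a PNG.
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_base64(encoded: str) -> bytes:
    """Decode a base64 string and reject payloads that are not PNG data."""
    data = base64.b64decode(encoded, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded bytes do not start with the PNG signature")
    return data


# Stand-in for one exported row; the payload is placeholder bytes, not a real image.
fake_png = PNG_SIGNATURE + b"placeholder-bytes"
row = {
    "png_base64": base64.b64encode(fake_png).decode("utf-8"),
    "document_type": "market research brief",
}

image_bytes = decode_png_base64(row["png_base64"])
print(f"Decoded {len(image_bytes)} bytes for a {row['document_type']}")
```

The signature check matters if the selected provider ever returns JPEG or WebP bytes, since the column name alone would then be misleading.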
diff --git a/fern/versions/latest/pages/recipes/image_generation/traffic_scenarios.mdx b/fern/versions/latest/pages/recipes/image_generation/traffic_scenarios.mdx
new file mode 100644
index 000000000..2d0db18cc
--- /dev/null
+++ b/fern/versions/latest/pages/recipes/image_generation/traffic_scenarios.mdx
@@ -0,0 +1,443 @@
+---
+title: "Autonomous Vehicle Traffic Scenarios"
+description: "Generate synthetic autonomous-vehicle ego-camera images with controlled road, weather, lighting, and long-tail hazard variation."
+---
+
+ [Download the complete recipe script](https://github.com/NVIDIA-NeMo/DataDesigner/blob/main/fern/assets/recipes/image_generation/traffic_scenarios.py)
+
+
+```python
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# /// script
+# requires-python = ">=3.10"
+# dependencies = [
+# "data-designer",
+# ]
+# ///
+"""Autonomous Vehicle Traffic Scenario Image Generation Recipe
+
+Generate synthetic autonomous-vehicle ego-camera images with controlled
+variation over region, road type, weather, time of day, traffic density,
+surface condition, traffic controls, and long-tail scenario elements. Use the
+generated images for perception review sets, visual QA, or
+simulator-validation prompts.
+
+Synthetic images are not a replacement for real sensor logs, simulator runs, or
+safety validation. They are useful for rapidly creating controlled visual
+examples around rare or hazardous conditions.
+
+Prerequisites:
+    - An image-generation provider key for the selected model. The defaults use
+      OpenRouter, so set OPENROUTER_API_KEY before running.
+
+Run:
+    uv run traffic_scenarios.py --num-records 10
+"""
+
+from __future__ import annotations
+
+import argparse
+from pathlib import Path
+
+import data_designer.config as dd
+from data_designer.interface import DataDesigner, DatasetCreationResults
+
+DEFAULT_MODEL_PROVIDER = "openrouter"
+DEFAULT_MODEL_ID = "google/gemini-3.1-flash-image-preview"
+DEFAULT_MODEL_ALIAS = "traffic-scene-model"
+
+
+def build_model_configs(
+    *,
+    model_provider: str,
+    model_id: str,
+    model_alias: str,
+    image_size: str,
+    aspect_ratio: str,
+    max_parallel_requests: int,
+) -> list[dd.ModelConfig]:
+    """Build a provider-agnostic image-generation model config."""
+    return [
+        dd.ModelConfig(
+            alias=model_alias,
+            model=model_id,
+            provider=model_provider,
+            inference_parameters=dd.ImageInferenceParams(
+                extra_body={
+                    "n": 1,
+                    "generationConfig": {
+                        "imageConfig": {
+                            "aspectRatio": aspect_ratio,
+                            "imageSize": image_size,
+                        }
+                    },
+                },
+                max_parallel_requests=max_parallel_requests,
+            ),
+            skip_health_check=True,
+        )
+    ]
+
+
+def add_category(config_builder: dd.DataDesignerConfigBuilder, name: str, values: list[str]) -> None:
+    """Add a categorical sampler column."""
+    config_builder.add_column(
+        dd.SamplerColumnConfig(
+            name=name,
+            sampler_type=dd.SamplerType.CATEGORY,
+            params=dd.CategorySamplerParams(values=values),
+        )
+    )
+
+
+def build_config(
+    *,
+    model_provider: str = DEFAULT_MODEL_PROVIDER,
+    model_id: str = DEFAULT_MODEL_ID,
+    model_alias: str = DEFAULT_MODEL_ALIAS,
+    image_size: str = "1024",
+    aspect_ratio: str = "16:9",
+    max_parallel_requests: int = 10,
+) -> dd.DataDesignerConfigBuilder:
+    """Build an autonomous-vehicle ego-camera image-generation pipeline."""
+    model_configs = build_model_configs(
+        model_provider=model_provider,
+        model_id=model_id,
+        model_alias=model_alias,
+        image_size=image_size,
+        aspect_ratio=aspect_ratio,
+        max_parallel_requests=max_parallel_requests,
+    )
+    config_builder = dd.DataDesignerConfigBuilder(model_configs=model_configs)
+
+    add_category(
+        config_builder,
+        "geographic_region",
+        [
+            "US - dense urban (NYC-style)",
+            "US - sprawling suburban (Los Angeles-style)",
+            "US - rural Midwest",
+            "Europe - narrow streets (Italian/French town)",
+            "Europe - orderly infrastructure (German autobahn)",
+            "Asia - mixed traffic (India/Thailand)",
+            "Asia - modern cityscape (Singapore/Tokyo)",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "road_type",
+        [
+            "urban city street with tall buildings",
+            "urban street with mixed retail/residential",
+            "suburban residential street with trees",
+            "suburban commercial strip with parking lots",
+            "highway - 3 lanes each direction",
+            "highway - 5 lanes each direction with HOV",
+            "rural two-lane country road",
+            "rural highway with sparse markings",
+            "mountain road with curves and guardrails",
+            "coastal road with scenic views",
+            "bridge - suspension or arch style",
+            "bridge - concrete overpass",
+            "tunnel - well-lit with ceiling lights",
+            "tunnel - dim lighting",
+            "parking lot - shopping center",
+            "parking garage - multi-level",
+            "intersection - 4-way with traffic lights",
+            "intersection - complex 6-way",
+            "roundabout - single lane",
+            "roundabout - multi-lane",
+            "construction zone with detour",
+            "school zone with crossing",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "weather",
+        [
+            "clear sunny day",
+            "partly cloudy",
+            "overcast with gray skies",
+            "light rain - misty windshield",
+            "moderate rain - active wipers",
+            "heavy rain - reduced visibility under 100ft",
+            "light fog - moderate visibility",
+            "dense fog - visibility under 50ft",
+            "light snow - flurries",
+            "moderate snow - accumulating on road",
+            "heavy snow - whiteout conditions",
+            "sleet/freezing rain",
+            "dust storm - desert conditions",
+            "high winds - debris visible",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "time_of_day",
+        [
+            "dawn - pre-sunrise twilight",
+            "early morning - golden hour light",
+            "mid-morning - bright sun, long shadows",
+            "midday - overhead sun, minimal shadows",
+            "afternoon - sun starting to lower",
+            "late afternoon - golden hour",
+            "dusk - post-sunset twilight",
+            "night - well-lit with streetlights",
+            "night - moderately lit urban",
+            "night - poorly lit rural",
+            "night - headlights only, no street lighting",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "traffic_density",
+        [
+            "empty - no other vehicles visible",
+            "sparse - 1-2 vehicles in distance",
+            "light - 3-5 vehicles visible",
+            "moderate - steady flow of traffic",
+            "heavy - congested, slow-moving",
+            "stop-and-go - bumper-to-bumper",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "vehicle_mix",
+        [
+            "sedans and compact cars",
+            "mix of cars and SUVs",
+            "includes large trucks/semi-trailers",
+            "includes buses",
+            "includes motorcycles and scooters",
+            "includes bicycles and e-bikes",
+            "includes delivery vans/box trucks",
+            "mixed vehicle types - diverse traffic",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "scenario_element",
+        [
+            "pedestrian crossing at marked crosswalk",
+            "pedestrian jaywalking mid-block",
+            "pedestrian with stroller/wheelchair",
+            "group of pedestrians crossing",
+            "child chasing ball toward street",
+            "jogger/runner on shoulder",
+            "pedestrian wearing dark clothing at night",
+            "pedestrian with umbrella obscuring face in rain",
+            "elderly pedestrian crossing slowly with walker",
+            "pedestrian distracted by phone while crossing",
+            "pedestrians exiting parked bus on roadside",
+            "crowd spilling onto road from sidewalk event",
+            "cyclist in dedicated bike lane",
+            "cyclist merging into traffic lane",
+            "cyclist making left turn",
+            "cyclist riding against traffic on wrong side",
+            "e-scooter rider weaving between cars",
+            "e-scooter rider on sidewalk entering crosswalk",
+            "group of cyclists in paceline on shoulder",
+            "cyclist with cargo trailer taking full lane",
+            "motorcycle lane splitting",
+            "motorcycle filtering through stopped traffic",
+            "motorcycle approaching from blind spot",
+            "school bus with stop sign extended and flashing",
+            "ambulance approaching with lights and sirens",
+            "police vehicle with lights activated",
+            "fire truck in oncoming lane",
+            "emergency vehicle approaching from behind in mirror",
+            "tow truck loading vehicle on roadside",
+            "construction zone - workers present with cones",
+            "construction zone - lane closure with signs",
+            "road crew filling potholes with equipment in lane",
+            "utility workers with cherry picker blocking lane",
+            "temporary steel plates covering road excavation",
+            "stopped vehicle - hazard lights on shoulder",
+            "vehicle broken down in lane",
+            "disabled vehicle in lane with warning triangle",
+            "vehicle stalled in intersection",
+            "vehicle with open hood - person inspecting engine",
+            "vehicle suddenly braking ahead",
+            "vehicle making unexpected lane change without signal",
+            "vehicle backing out of parking spot",
+            "vehicle running red light from cross street",
+            "vehicle driving wrong way on one-way street",
+            "vehicle making illegal U-turn",
+            "vehicle swerving to avoid pothole",
+            "vehicle drifting out of lane (distracted driver)",
+            "vehicle cutting in from merging lane aggressively",
+            "slow-moving vehicle (farm equipment/golf cart) on road",
+            "delivery truck double-parked with flashers",
+            "garbage truck with crew working",
+            "parked car door opening into traffic",
+            "semi-truck jackknifed across lanes",
+            "wide-load vehicle with escort car",
+            "ice cream truck stopped with children nearby",
+            "ride-share vehicle stopped abruptly for pickup",
+            "moving truck partially blocking lane while loading",
+            "food truck parked on street with customer queue",
+            "animal (deer) on roadside",
+            "animal (dog) loose on road",
+            "flock of birds on road surface",
+            "animal (coyote/fox) darting across road",
+            "fallen tree branch partially blocking lane",
+            "debris/cargo on road surface",
+            "large pothole in driving lane",
+            "standing water/flooded section of road",
+            "oil spill or fluid on road surface",
+            "tire tread/retread debris on highway",
+            "mattress or furniture fallen from truck on road",
+            "manhole cover missing or displaced",
+            "traffic cone or barrel knocked into lane",
+            "sun glare directly ahead through windshield",
+            "headlight glare from oncoming vehicle at night",
+            "spray/mist from vehicle ahead on wet road",
+            "shadow from overpass creating sudden darkness",
+            "smoke from nearby fire drifting across road",
+            "reflection of wet road creating mirror effect",
+            "traffic light malfunctioning - flashing red",
+            "traffic sign obscured by overgrown vegetation",
+            "contradictory road signs at intersection",
+            "pedestrian signal countdown with people still crossing",
+            "railroad crossing with gates descending and lights flashing",
+            "toll booth approach with lanes merging",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "road_surface",
+        [
+            "dry asphalt - good condition",
+            "dry asphalt - faded lane markings",
+            "wet reflective surface",
+            "wet with puddles",
+            "icy patches visible",
+            "black ice conditions",
+            "snow-covered - lane markings obscured",
+            "gravel surface",
+            "unpaved dirt road",
+            "potholes and road damage visible",
+            "recent patching - uneven surface",
+            "construction - temporary markings",
+        ],
+    )
+
+    add_category(
+        config_builder,
+        "traffic_control",
+        [
+            "traffic light - green",
+            "traffic light - yellow/amber",
+            "traffic light - red",
+            "stop sign clearly visible",
+            "yield sign",
+            "speed limit sign - 25 mph",
+            "speed limit sign - 55 mph",
+            "no traffic control - uncontrolled intersection",
+            "construction signage and cones",
+            "temporary traffic lights",
+        ],
+    )
+
+    config_builder.add_column(
+        dd.ImageColumnConfig(
+            name="traffic_scene",
+            prompt=TRAFFIC_SCENE_PROMPT,
+            model_alias=model_alias,
+        )
+    )
+
+    return config_builder
+
+
+def create_dataset(
+    config_builder: dd.DataDesignerConfigBuilder,
+    *,
+    num_records: int,
+    dataset_name: str,
+    artifact_path: Path | str | None = None,
+) -> DatasetCreationResults:
+    data_designer = DataDesigner(artifact_path=artifact_path)
+    data_designer.validate(config_builder)
+    return data_designer.create(config_builder, num_records=num_records, dataset_name=dataset_name)
+
+
+TRAFFIC_SCENE_PROMPT = """\
+Create a photorealistic autonomous-vehicle ego-camera perception scene.
+
+The image must look like it was captured by a camera mounted on the self-driving
+ego vehicle, not by a roadside camera, drone, or cinematic photographer. Keep
+the viewpoint physically plausible for an AV sensor. When appropriate, show a
+subtle hood edge, windshield edge, or bumper edge, but do not show a full car
+interior or dashboard UI.
+
+Scene requirements:
+- Geographic region: {{ geographic_region }}
+- Road type: {{ road_type }}
+- Weather: {{ weather }}
+- Time of day: {{ time_of_day }}
+- Traffic density: {{ traffic_density }}
+- Vehicle mix: {{ vehicle_mix }}
+- Key scenario element: {{ scenario_element }}
+- Road surface: {{ road_surface }}
+- Traffic control: {{ traffic_control }}
+
+The scene should clearly show road geometry, lane markings, traffic signs,
+traffic control devices, surrounding vehicles, vulnerable road users when
+requested, and the key scenario element from the ego vehicle's camera. Preserve
+regional driving characteristics such as road width, side of road, sign style,
+and lane markings. Use realistic lighting, lens geometry, motion perspective,
+weather effects, and visibility. Generate exactly one final ego-camera image
+for this row. Do not return alternate versions, a grid, a pair of examples,
+before/after panels, or multiple camera frames. Do not include text overlays,
+labels, watermarks, dashcam timestamps, bounding boxes, sensor UI, or navigation
+UI.
+"""
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description="Generate synthetic autonomous-vehicle traffic scenes.")
+    parser.add_argument("--num-records", type=int, default=10, help="Number of traffic scenes to generate.")
+    parser.add_argument("--dataset-name", default="synthetic-traffic-scenarios", help="Output dataset name.")
+    parser.add_argument("--artifact-path", type=Path, default=None, help="Optional Data Designer artifact directory.")
+    parser.add_argument("--model-provider", default=DEFAULT_MODEL_PROVIDER, help="Image model provider name.")
+    parser.add_argument("--model-id", default=DEFAULT_MODEL_ID, help="Provider model ID.")
+    parser.add_argument("--model-alias", default=DEFAULT_MODEL_ALIAS, help="Alias used by image columns.")
+    parser.add_argument("--image-size", default="1024", help="Provider-specific image size value.")
+    parser.add_argument("--aspect-ratio", default="16:9", help="Provider-specific aspect ratio value.")
+    parser.add_argument("--max-parallel-requests", type=int, default=10, help="Maximum parallel image requests.")
+    return parser.parse_args()
+
+
+def main() -> None:
+    args = parse_args()
+    config_builder = build_config(
+        model_provider=args.model_provider,
+        model_id=args.model_id,
+        model_alias=args.model_alias,
+        image_size=args.image_size,
+        aspect_ratio=args.aspect_ratio,
+        max_parallel_requests=args.max_parallel_requests,
+    )
+    results = create_dataset(
+        config_builder,
+        num_records=args.num_records,
+        dataset_name=args.dataset_name,
+        artifact_path=args.artifact_path,
+    )
+    dataset = results.load_dataset()
+    print(f"Generated {len(dataset)} synthetic traffic-scene rows.")
+    print(f"Dataset artifacts: {results.artifact_storage.base_dataset_path}")
+
+
+if __name__ == "__main__":
+    main()
+```
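
Reviewer note on how the recipe's sampled columns feed the image prompt: the sketch below is a standard-library illustration only and is not Data Designer's implementation. It assumes each category column is drawn independently and uniformly, and that `{{ name }}` placeholders are replaced by plain text substitution. `CATEGORIES`, `TEMPLATE`, `sample_row`, and `render` are hypothetical names, and the lists are shortened excerpts of the recipe's values.

```python
import random
import re

# Hypothetical, shortened stand-ins for three of the recipe's category lists.
CATEGORIES = {
    "weather": ["clear sunny day", "dense fog - visibility under 50ft"],
    "time_of_day": ["midday - overhead sun, minimal shadows", "night - poorly lit rural"],
    "scenario_element": ["cyclist in dedicated bike lane", "large pothole in driving lane"],
}

# Excerpt-style template using the same {{ name }} placeholder syntax as the recipe prompt.
TEMPLATE = """\
Scene requirements:
- Weather: {{ weather }}
- Time of day: {{ time_of_day }}
- Key scenario element: {{ scenario_element }}
"""

# Matches "{{ name }}" with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def sample_row(rng: random.Random) -> dict[str, str]:
    """Draw one value per category (assumption: independent, uniform draws)."""
    return {name: rng.choice(values) for name, values in CATEGORIES.items()}


def render(template: str, row: dict[str, str]) -> str:
    """Replace each {{ name }} placeholder with the row's value for that column."""
    return PLACEHOLDER.sub(lambda match: row[match.group(1)], template)


if __name__ == "__main__":
    row = sample_row(random.Random(0))
    print(render(TEMPLATE, row))
```

Because the columns are sampled independently in this sketch, a row can combine values that conflict, such as a sunny-weather value with a night-time value. Whether the recipe's real sampler behaves the same way is worth checking against generated rows before relying on them for review sets.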