Technical Report · 2026

Let Your Data Build and Improve Itself via Goal-Driven Loop Agents

DataEvolver is an autonomous synthetic data construction framework in which goal-driven loop agents orchestrate the full pipeline, from text-to-image generation through 3D reconstruction to scene-aware rendering, and iteratively refine output quality by reading VLM feedback, diagnosing visual failures, and adjusting rendering parameters until the data meets production standards.

350 Training Pairs · 50 Unique 3D Objects · 24 Atomic Actions · 6 Pipeline Stages

Why Goal-Driven Agents Are the Missing Piece in Synthetic Data

Naive automated rendering produces artifacts — flat lighting, color shifts, floating objects, missing shadows. Traditional pipelines rely on rigid scoring rules that lack semantic understanding. Manual tuning doesn't scale. What's needed are goal-driven agents that can perceive, diagnose, and act — closing the loop between data generation and quality control.

Naive Renders Have Artifacts

Auto-rendered 3D objects often exhibit flat lighting, implausible shadows, and color mismatches with the scene environment.

Rule-Based QC Lacks Semantics

Rigid numeric thresholds can't diagnose "this lighting feels flat" or "the object appears to float." Goal-driven agents with VLM perception can.

Manual Tuning Can't Scale

Human artists spend minutes per object adjusting Blender parameters. At 50+ objects with 8+ viewpoints, this becomes infeasible.

The Goal-Driven Data Construction Pipeline

From a natural language seed concept to quality-verified rendered pairs — fully automated by goal-driven loop agents, no human intervention required between stages.

1. Text Expansion: LLM expands the seed concept into a detailed T2I prompt
2. T2I Generation: Qwen-Image-2512 generates a 1024×1024 object image
2.5. Segmentation: SAM3 extracts the RGBA foreground and removes the background
3. 3D Reconstruction: Hunyuan3D-2.1 reconstructs a textured mesh from a single image
4. Scene Rendering: Blender 4.2.4 + Cycles at 512 spp, scene-aware insertion
5. VLM Review Loop: free-form review → agent action → re-render until the verdict is keep
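
In code, the six stages form a straight chain that ends in the review loop. The skeleton below is purely structural: every function is a hypothetical placeholder standing in for the corresponding stage script under pipeline/, not the project's actual API.

Python — Stage Chain Skeleton (all names hypothetical)
# Placeholder stubs; each stands in for one stage of the pipeline.
def expand_prompt(seed: str) -> str:    # Stage 1: LLM prompt expansion
    raise NotImplementedError

def generate_image(prompt: str):        # Stage 2: T2I at 1024×1024
    raise NotImplementedError

def segment_foreground(image):          # Stage 2.5: RGBA cut-out
    raise NotImplementedError

def reconstruct_3d(rgba):               # Stage 3: mesh from a single image
    raise NotImplementedError

def render_in_scene(mesh):              # Stage 4: scene-aware insertion
    raise NotImplementedError

def run_review_loop(render, params):    # Stage 5: VLM review until "keep"
    raise NotImplementedError

def build_sample(seed_concept: str):
    """One seed concept in, one quality-verified rendered pair out."""
    prompt = expand_prompt(seed_concept)
    image = generate_image(prompt)
    rgba = segment_foreground(image)
    mesh = reconstruct_3d(rgba)
    render, params = render_in_scene(mesh)
    return run_review_loop(render, params)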

Goal-Driven Loop Agents: Perceive, Diagnose, Act, Repeat

The heart of DataEvolver: not a scripted parameter scan, but a goal-driven loop agent that perceives rendered outputs via VLM review, diagnoses semantic issues like "flat lighting" or "weak grounding," selects targeted rendering adjustments from a structured action space, and repeats until quality goals are met.

1. Blender Render: Cycles at 512 spp, 1024×1024, with scene-aware lighting
2. VLM Review: Qwen3.5-35B-A3B free-form critique with thinking mode
3. AI Agent Decision: reads the reviewer text and selects an action from the 24-action space
4. Quality Gate: reviewer verdict of keep / revise / reject

The loop continues with revised parameters until the reviewer says keep.
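
The loop itself is simple control flow around three capabilities. Below is a minimal sketch, with render, review, and decide passed in as callables; the max_rounds bound and the exact verdict strings are assumptions for illustration, not the repository's implementation.

Python — Review Loop Sketch (illustrative)
from typing import Callable, Tuple

def run_loop(
    render: Callable[[dict], object],             # params -> rendered image
    review: Callable[[object], Tuple[str, str]],  # image -> (verdict, critique)
    decide: Callable[[str, dict], dict],          # critique, params -> revised params
    params: dict,
    max_rounds: int = 6,                          # assumed safety bound
):
    """Perceive, diagnose, act, repeat until the reviewer says keep."""
    for _ in range(max_rounds):
        image = render(params)                    # perceive: Blender render
        verdict, critique = review(image)         # diagnose: VLM free-form review
        if verdict == "keep":
            return image, params                  # quality gate passed
        if verdict == "reject":
            return None, params                   # reviewer deems it unrecoverable
        params = decide(critique, params)         # act: agent picks an atomic action
    return None, params                           # bail out rather than loop forever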

Anti-Oscillation Control

Sign-flip tracking (freeze after 3 flips), dead-zone detection, and step-scale scheduling (Round 0: 100%, Round 1: 70%, Round 2: 50%, Round 3+: 40%, with score-adaptive ×1.2 boost when hybrid_score < 0.65) prevent infinite loops and parameter thrashing.
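
Those numbers translate directly into code. A minimal sketch of the schedule and the sign-flip guard, using the constants quoted above; the function and variable names are ours, not the repository's:

Python — Anti-Oscillation Sketch
def step_scale(round_idx: int, hybrid_score: float) -> float:
    """Step-scale schedule: 100% / 70% / 50% / 40%, boosted when quality is low."""
    schedule = {0: 1.00, 1: 0.70, 2: 0.50}
    scale = schedule.get(round_idx, 0.40)  # Round 3+ settles at 40%
    if hybrid_score < 0.65:                # score-adaptive boost
        scale *= 1.2
    return scale

def frozen(step_history: list) -> bool:
    """Freeze a parameter once its step direction has flipped 3 times."""
    signs = [1 if s > 0 else -1 for s in step_history if s != 0]
    flips = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return flips >= 3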

Scene-Aware Rendering

Objects placed in real Blender scenes with HDRI environments. Raycast ground detection ensures physical plausibility. Original scene lighting preserved — no artificial studio setups.
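
Ground detection of this kind is a short bpy routine. A hedged sketch, assuming a Blender Python session; this is our illustration of the technique, not the project's rendering code:

Python — Raycast Ground Snap (Blender, illustrative)
import bpy
from mathutils import Vector

def snap_to_ground(obj, max_drop: float = 100.0) -> None:
    """Cast a ray straight down and rest the object on the first surface hit."""
    depsgraph = bpy.context.evaluated_depsgraph_get()
    # lowest point of the object's bounding box, in world space
    bottom = min((obj.matrix_world @ Vector(c)).z for c in obj.bound_box)
    origin = Vector((obj.location.x, obj.location.y, bottom - 1e-4))
    hit, location, normal, index, hit_obj, matrix = bpy.context.scene.ray_cast(
        depsgraph, origin, Vector((0.0, 0.0, -1.0)), distance=max_drop
    )
    if hit:
        obj.location.z += location.z - bottom  # rest on, not inside, the ground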

24 Atomic Actions

Structured action space across four groups: lighting (key intensity & yaw), object (elevation, yaw, scale), scene (env rotation, env intensity, contact shadow), and material (saturation, value, hue, roughness, specular/sheen).

DataEvolver-Rotate: View-Controlled Rotation Editing

A benchmark dataset for rotation-conditioned image editing. Each sample pairs a canonical front-view image with a target view specified in natural language.

50 Unique Objects · 8 Viewpoints per Object · 350 Training Pairs (front→7 views) · 3×A800 Training Infrastructure
Python — Load DataEvolver-Rotate Dataset
import json
from pathlib import Path
from PIL import Image

root = Path("dataset_scene_v7_full50_rotation8_trainready_front2others_splitobj_seed42_final_20260410")

# Each JSONL line is one front-view → target-view training pair
rows = []
with (root / "pairs/train_pairs.jsonl").open("r") as f:
    for line in f:
        rows.append(json.loads(line))

row = rows[0]
source = Image.open(root / row["source_image"]).convert("RGB")  # canonical front view
target = Image.open(root / row["target_image"]).convert("RGB")  # rotated target view
instruction = row["instruction"]  # natural-language view specification

24 Structured Atomic Actions

The AI agent selects from a discrete, structured action space to address VLM-identified issues. Each action targets a specific rendering parameter.

Key Light Intensity: ×1.2 / ×0.8 multiplicative, bounded [0.5, 2.0]
Key Light Rotation (Yaw): ±15° step, bounded [-90°, 90°]
Env Light Intensity: ×1.2 / ×0.8 multiplicative, bounded [0.5, 2.0]
Env Rotation (Z): ±30° step, bounded [-180°, 180°]
Object Elevation: ±0.02 step, bounded [-0.1, 0.1]
Material Roughness: ±0.08 step, bounded [-0.3, 0.6]

+ 18 more actions covering object yaw, object scale, contact shadow, color saturation, value/brightness, hue offset, specular/sheen, and more; see scene_action_space.json
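
Each entry reduces to a parameter, a step (additive or multiplicative), and a clamp. A minimal sketch of how such actions can be represented and applied; the schema below is our assumption, not the actual format of scene_action_space.json:

Python — Atomic Action Application Sketch (schema assumed)
# (parameter, mode, step, lower bound, upper bound); schema is illustrative
ACTIONS = {
    "key_light_intensity_up": ("key_light_intensity", "mul", 1.2, 0.5, 2.0),
    "key_light_yaw_left":     ("key_light_yaw", "add", -15.0, -90.0, 90.0),
    "object_elevation_up":    ("object_elevation", "add", 0.02, -0.1, 0.1),
}

def apply_action(params: dict, name: str, scale: float = 1.0) -> dict:
    """Apply one atomic action with step scaling and bound clamping."""
    target, mode, step, lo, hi = ACTIONS[name]
    value = params[target]
    if mode == "mul":
        value *= 1.0 + (step - 1.0) * scale  # scaled multiplicative step
    else:
        value += step * scale                # scaled additive step
    params[target] = min(max(value, lo), hi)
    return params

Combined with the scheduler above, a revise round amounts to apply_action(params, action, step_scale(round_idx, hybrid_score)).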

Key Differentiators

What sets DataEvolver apart from other synthetic data pipelines and why goal-driven loop agents produce better training data.

VLM-as-Feedback, Not Score

Free-form natural language feedback provides semantic diagnosis that numeric scores cannot. The reviewer identifies why a render fails, not just that it fails.

AI Agent Autonomy

The agent reads raw review text and reasons about which action to take — no scripted rule engine, no score-threshold mapping. Full decision autonomy with bounded safety constraints.

Proven Downstream Value

LoRA fine-tuning on DataEvolver-Rotate demonstrably improves Qwen-Image-Edit-2511 on PSNR, SSIM, and LPIPS over the base model, confirming real training utility.

Multi-Modal Data Support

Beyond RGB pairs: mask, depth, normal maps, and geometry metadata — enabling research on multi-modal conditioning for image editing models.
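
Extending the loader shown earlier to these modalities is mechanical. A hedged sketch; the mask_image, depth_image, and normal_image keys are assumed field names, not documented schema:

Python — Multi-Modal Loading Sketch (field names assumed)
import json
from pathlib import Path
from PIL import Image

root = Path("dataset_scene_v7_full50_rotation8_trainready_front2others_splitobj_seed42_final_20260410")
with (root / "pairs/train_pairs.jsonl").open("r") as f:
    row = json.loads(f.readline())

# RGB pair as before, plus whatever auxiliary maps the row references
target = Image.open(root / row["target_image"]).convert("RGB")
mask = Image.open(root / row["mask_image"]).convert("L") if "mask_image" in row else None
depth = Image.open(root / row["depth_image"]).convert("I") if "depth_image" in row else None
normal = Image.open(root / row["normal_image"]).convert("RGB") if "normal_image" in row else None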

Built With

The models, frameworks, and infrastructure powering the DataEvolver pipeline.

Python 3.10+ · Blender 4.2.4 · Cycles Path Tracing · PyTorch 2.8 · Qwen-Image-2512 · SAM3 · Hunyuan3D-2.1 · Qwen3.5-35B-A3B · Qwen-Image-Edit-2511 · DiffSynth-Studio · LoRA (PEFT) · 3×A800 80GB

Quick Start

DataEvolver is designed to run on a Linux server with GPU access and Blender. Clone the repo and start building self-improving data pipelines.

Shell — Setup
git clone https://github.com/Kamisato520/DataEvolver.git
cd DataEvolver

# Explore the pipeline
ls pipeline/     # 6-stage data synthesis (stage1 → stage5)
ls configs/      # Action space, scene templates, dataset profiles
ls scripts/      # Agent monitor, dataset builders, eval tools

# Read the full documentation
cat CLAUDE.md     # Comprehensive project guide


Citation

If you use DataEvolver or DataEvolver-Rotate in your research, please cite our work.

BibTeX
@misc{dataevolver2026,
  title        = {{DataEvolver}: Let Your Data Build and Improve
                  Itself via Goal-Driven Loop Agents},
  author       = {Zhang, Qisong and Wu, Wenzhuo},
  year         = {2026},
  howpublished = {\url{https://github.com/Kamisato520/DataEvolver}},
  note         = {Technical Report}
}