CARINOX
Category-Aware Reward-based Initial Noise Optimization and Exploration

Department of Computer Engineering, Sharif University of Technology

Unifying noise optimization and best-of-N exploration with systematic reward selection for reliable compositional alignment.



Abstract

Text-to-image diffusion models, such as Stable Diffusion, can produce high-quality and diverse images but often fail to achieve compositional alignment, particularly when prompts describe complex object relationships, attributes, or spatial arrangements. Recent inference-time approaches address this by optimizing or exploring the initial noise under the guidance of reward functions that score text–image alignment—without requiring model fine-tuning. While promising, each strategy has intrinsic limitations when used alone: optimization can stall due to poor initialization or unfavorable search trajectories, whereas exploration may require a prohibitively large number of samples to locate a satisfactory output. Our analysis further shows that neither single reward metrics nor ad-hoc combinations reliably capture all aspects of compositionality, leading to weak or inconsistent guidance. To overcome these challenges, we present Category-Aware Reward-based Initial Noise Optimization and EXploration (CARINOX), a unified framework that combines noise optimization and exploration with a principled reward selection procedure grounded in correlation with human judgments. Evaluations on two complementary benchmarks—covering diverse compositional challenges—show that CARINOX raises average alignment scores by +16% on T2I-CompBench++ and +11% on the HRS benchmark, consistently outperforming state-of-the-art optimization and exploration-based methods across all major categories, while preserving image quality and diversity.
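To make the interplay between exploration and optimization concrete, here is a minimal sketch on a toy problem. It is not the CARINOX implementation. The generator (`generate`), the reward (`reward`), the matrix `W`, the `target` vector, and all hyperparameters are hypothetical stand-ins chosen so the example runs with NumPy alone. In a real setting the generator would be a diffusion sampler and the reward would be a text-image alignment scorer. The sketch samples several initial noises (exploration), runs gradient ascent on each one (optimization), and keeps the highest-reward result.

```python
import numpy as np

def generate(z, W):
    """Toy stand-in for a diffusion sampler: maps initial noise to an 'image'."""
    return np.tanh(W @ z)

def reward(x, target):
    """Toy stand-in for an alignment reward (higher is better)."""
    return -float(np.sum((x - target) ** 2))

def reward_grad_wrt_noise(z, W, target):
    """Analytic gradient of reward(generate(z)) with respect to the noise z."""
    x = generate(z, W)
    d_reward_dx = -2.0 * (x - target)
    # Chain rule through tanh: d tanh(u)/du = 1 - tanh(u)^2.
    return W.T @ ((1.0 - x ** 2) * d_reward_dx)

def optimize_noise(z0, W, target, steps=50, lr=0.05):
    """Gradient ascent on the noise; returns the best iterate seen."""
    z = z0.copy()
    best_z, best_r = z.copy(), reward(generate(z, W), target)
    for _ in range(steps):
        z = z + lr * reward_grad_wrt_noise(z, W, target)
        r = reward(generate(z, W), target)
        if r > best_r:
            best_z, best_r = z.copy(), r
    return best_z, best_r

def explore_and_optimize(W, target, n_seeds=4, steps=50, lr=0.05, seed=0):
    """Sample several initial noises, optimize each, keep the best result."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_seeds):
        z0 = rng.standard_normal(W.shape[1])
        r0 = reward(generate(z0, W), target)   # reward before optimization
        z_opt, r_opt = optimize_noise(z0, W, target, steps, lr)
        results.append((r0, r_opt, z_opt))
    best = max(results, key=lambda item: item[1])
    return best, results

# Hypothetical toy problem: an 8-dimensional "image" and noise vector.
setup_rng = np.random.default_rng(42)
W = setup_rng.standard_normal((8, 8)) / np.sqrt(8)
target = np.tanh(setup_rng.standard_normal(8))

best, results = explore_and_optimize(W, target)
for r0, r_opt, _ in results:
    print(f"initial reward {r0:.3f} -> optimized reward {r_opt:.3f}")
print(f"selected reward: {best[1]:.3f}")
```

Because each optimization run keeps its best iterate, the selected result is never worse than plain best-of-N over the same initial noises. In this toy setting, that is the sense in which combining the two strategies can help when either one alone stalls or needs many samples.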

BibTeX

@article{kasaei2026carinox,
  title={{CARINOX}: Inference-time Scaling with Category-Aware Reward-based Initial Noise Optimization and Exploration},
  author={Seyed Amir Kasaei and Ali Aghayari and Arash Marioriyad and Niki Sepasian and Shayan Baghayi Nejad and MohammadAmin Fazli and Mahdieh Soleymani Baghshah and Mohammad Hossein Rohban},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2026},
  url={https://openreview.net/forum?id=XB1cwXHV0c},
  note={}
}