CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians
Recent breakthroughs in text-guided image generation have significantly advanced the field of 3D generation. While generating a single high-quality 3D object is now feasible, generating multiple objects with reasonable interactions in a shared 3D space, known as compositional 3D generation, remains substantially harder. This paper introduces CompGS, a novel generative framework that employs 3D Gaussian Splatting (GS) for efficient, compositional text-to-3D content generation. It rests on two core designs. (1) 3D Gaussian initialization with 2D compositionality: we transfer well-established 2D compositionality to initialize the Gaussian parameters on an entity-by-entity basis, which gives each entity a consistent 3D prior and yields reasonable interactions among multiple entities. (2) Dynamic optimization: we propose a dynamic strategy to optimize the 3D Gaussians with the Score Distillation Sampling (SDS) loss. CompGS first automatically decomposes the 3D Gaussians into distinct entity parts, enabling optimization at both the entity level and the composition level. CompGS then handles objects of varying scales by dynamically adjusting the spatial parameters of each entity, which improves the generation of fine-grained details, particularly in smaller entities. Qualitative comparisons and quantitative evaluations on T3Bench show that CompGS generates compositional 3D objects with better image quality and semantic alignment than existing methods. CompGS can also be easily extended to controllable 3D editing, facilitating scene generation. We hope CompGS will provide new insights into compositional 3D generation.
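The scale-handling idea in the abstract (decompose the Gaussians by entity, then adjust each entity's spatial parameters so that small entities are not starved of detail) can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' implementation: the `Gaussian` record, the `entity_id` label, and the helpers `split_by_entity`, `zoom_in`, and `zoom_out` are all hypothetical names introduced here, and the SDS loss itself is not implemented. The sketch only shows the geometric bookkeeping: a small entity is shifted and uniformly rescaled into a normalized box where an entity-level optimization step could run, then mapped back into the composition.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Gaussian:
    """Hypothetical minimal record: only the spatial parameters are modeled."""
    center: Vec3    # position in composition space
    scale: Vec3     # per-axis extent
    entity_id: int  # assumed label saying which entity this Gaussian belongs to


def split_by_entity(gaussians: List[Gaussian]) -> Dict[int, List[Gaussian]]:
    """Group Gaussians into per-entity parts using their labels."""
    groups: Dict[int, List[Gaussian]] = {}
    for g in gaussians:
        groups.setdefault(g.entity_id, []).append(g)
    return groups


def zoom_in(entity: List[Gaussian], half_extent: float = 1.0):
    """Shift and uniformly rescale an entity so its centers span
    [-half_extent, half_extent] along its longest axis.
    Returns the transformed Gaussians plus the offset and factor needed to undo it."""
    lo = [min(g.center[i] for g in entity) for i in range(3)]
    hi = [max(g.center[i] for g in entity) for i in range(3)]
    offset = tuple((lo[i] + hi[i]) / 2.0 for i in range(3))
    longest = max(hi[i] - lo[i] for i in range(3))
    factor = (2.0 * half_extent) / longest if longest > 0 else 1.0
    zoomed = [
        Gaussian(
            center=tuple((g.center[i] - offset[i]) * factor for i in range(3)),
            scale=tuple(s * factor for s in g.scale),
            entity_id=g.entity_id,
        )
        for g in entity
    ]
    return zoomed, offset, factor


def zoom_out(entity: List[Gaussian], offset: Vec3, factor: float) -> List[Gaussian]:
    """Inverse of zoom_in: map an entity back into composition space."""
    return [
        Gaussian(
            center=tuple(g.center[i] / factor + offset[i] for i in range(3)),
            scale=tuple(s / factor for s in g.scale),
            entity_id=g.entity_id,
        )
        for g in entity
    ]


# Toy scene: entity 0 is large, entity 1 is small and sits beside it.
scene = [
    Gaussian((-1.0, 0.0, 0.0), (0.2, 0.2, 0.2), entity_id=0),
    Gaussian((0.5, 0.4, 0.0), (0.2, 0.2, 0.2), entity_id=0),
    Gaussian((0.9, 0.0, 0.0), (0.01, 0.01, 0.01), entity_id=1),
    Gaussian((1.0, 0.1, 0.0), (0.01, 0.01, 0.01), entity_id=1),
]
groups = split_by_entity(scene)
zoomed, offset, factor = zoom_in(groups[1])
# An entity-level optimization step (e.g. with an SDS loss) would act on `zoomed` here.
restored = zoom_out(zoomed, offset, factor)
```

In this toy scene the small entity is magnified before the entity-level step and returns to its original placement afterwards, so composition-level optimization can still see the entities in their shared layout.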