CogSENet: An Eagle-Vision-Inspired Framework for Blind Image Deblurring
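Source: arxiv.org
CogSENet is a dynamic, semantically aligned reconstruction framework for blind image deblurring, inspired by the eagle's visual system. Its core modules are: the Semantic-Driven State Space Module (SDSSM), which uses differentiable routing for semantic-aware token regrouping and prompt-conditioned long-range dependency modeling; the BiFreqFusionBlock (BFFB), which decomposes features into high and low frequencies with wavelet transforms, mirroring the functional differentiation of the eagle's retina; and the Continuous Blur Field (CBF), which is estimated from the blurred image and fused with CLIP semantic priors to modulate the deepest latent features, adapting to spatially non-uniform blur. Experiments show that CogSENet surpasses existing deblurring methods in visual quality and structural fidelity with fewer parameters, and also performs well on dehazing, deraining, and denoising tasks.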
Blind image deblurring requires recovering high-fidelity details and coherent structures from complex, unknown degradations. Current blind image deblurring methods struggle with real-world, spatially varying degradations and lack the semantic awareness needed to reliably distinguish valid textures from artifacts. To bridge this gap, we propose CogSENet, a dynamic, semantically aligned reconstruction framework inspired by the eagle's visual system. Mimicking the eagle's active saccadic scanning, we devise a Semantic-Driven State Space Module (SDSSM) that regroups tokens in a semantic-aware manner via differentiable routing, enabling prompt-conditioned long-range dependency modeling. To ensure physically interpretable recovery of textures and structures, a BiFreqFusionBlock (BFFB) mirrors the functional differentiation of the eagle's retina by decomposing features into high and low frequencies using wavelet transforms. Finally, we estimate a Continuous Blur Field (CBF) from the blurred image and fuse it with CLIP semantic priors to modulate the deepest latent features, emulating focal adaptation and enabling adaptive restoration under spatially non-uniform blur. Extensive experiments demonstrate that CogSENet outperforms state-of-the-art deblurring methods in both visual quality and structural fidelity with fewer parameters, while also performing favorably on dehazing, deraining, and denoising tasks.
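To make the high/low frequency split mentioned for the BFFB concrete, the following is a minimal sketch of a single-level 2D wavelet decomposition in NumPy. It is only an illustration and not the authors' implementation: the abstract does not state which wavelet family is used, so the Haar wavelet is an assumption, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical helpers introduced here.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform of an (H, W) array with even H and W.

    Returns the low-frequency band and a tuple of three high-frequency bands.
    The Haar wavelet is an assumption; the abstract names no wavelet family.
    """
    # Split the array into the four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    ll = (a + b + c + d) / 2.0  # low frequency: scaled local average
    lh = (a - b + c - d) / 2.0  # left-minus-right differences
    hl = (a + b - c - d) / 2.0  # top-minus-bottom differences
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, (lh, hl, hh)


def haar_idwt2(ll, highs):
    """Invert haar_dwt2, rebuilding the original (H, W) array."""
    lh, hl, hh = highs
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.random((8, 8))  # stand-in for one feature channel
    low, high = haar_dwt2(feature_map)
    restored = haar_idwt2(low, high)
    print(low.shape, np.allclose(restored, feature_map))
```

The low band carries coarse structure at half resolution, while the three high bands carry edge and texture detail. Because the transform is exactly invertible, a network can process the two groups in separate branches and then merge them without losing information, which is the general idea behind a dual-frequency design.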