OSA: Echocardiography Video Segmentation via Orthogonalized State Update and Anatomical Prior-aware Feature Enhancement

 

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

 

Rui Wang1,  Huisi Wu1*,  Jing Qin2

1Shenzhen University

 2The Hong Kong Polytechnic University

 

 

 

Abstract

Accurate and temporally consistent segmentation of the left ventricle from echocardiography videos is essential for estimating the ejection fraction and assessing cardiac function. However, modeling spatiotemporal dynamics remains difficult due to severe speckle noise and rapid non-rigid deformations. Existing linear recurrent models offer efficient in-context associative recall for temporal tracking, but they rely on unconstrained state updates. These updates cause progressive singular value decay in the state matrix, a phenomenon known as rank collapse, so that anatomical details are overwhelmed by noise. To address this, we propose OSA, a framework that constrains the state evolution to the Stiefel manifold. We introduce the Orthogonalized State Update (OSU) mechanism, which formulates the memory evolution as Euclidean projected gradient descent on the Stiefel manifold to prevent rank collapse and maintain stable temporal transitions. Furthermore, an Anatomical Prior-aware Feature Enhancement module explicitly separates anatomical structures from speckle noise through a physics-driven process, providing the temporal tracker with noise-resilient structural cues. Comprehensive experiments on the CAMUS and EchoNet-Dynamic datasets show that OSA achieves state-of-the-art segmentation accuracy and temporal stability while maintaining real-time inference efficiency for clinical deployment. Code is available at https://github.com/wangrui2025/OSA.
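To make the idea of projected gradient descent on the Stiefel manifold concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a state matrix of shape d x r with orthonormal columns, a simple associative-recall loss 0.5 * ||S k - v||^2, a fixed step size, and projection by the polar factor of a thin SVD (the nearest matrix with orthonormal columns in Frobenius norm). All function names, dimensions, and the random keys and values are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_to_stiefel(m):
    """Return the nearest matrix with orthonormal columns (polar factor via thin SVD)."""
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt


def unconstrained_update(state, key, value, lr=0.5):
    """One Euclidean gradient step on the assumed loss 0.5 * ||state @ key - value||^2."""
    grad = np.outer(state @ key - value, key)  # shape (d, r)
    return state - lr * grad


def orthogonalized_update(state, key, value, lr=0.5):
    """Same gradient step, followed by projection back onto the Stiefel manifold."""
    return project_to_stiefel(unconstrained_update(state, key, value, lr))


rng = np.random.default_rng(0)
d, r = 16, 4  # hypothetical sizes; the state is d x r with r <= d

state0 = project_to_stiefel(rng.standard_normal((d, r)))
free, constrained = state0.copy(), state0.copy()

for _ in range(50):
    key = rng.standard_normal(r)
    key /= np.linalg.norm(key)              # unit-norm key (assumption)
    value = 0.1 * rng.standard_normal(d)    # small synthetic target (assumption)
    free = unconstrained_update(free, key, value)
    constrained = orthogonalized_update(constrained, key, value)

# Singular values of each state after 50 steps.
sv_free = np.linalg.svd(free, compute_uv=False)
sv_constrained = np.linalg.svd(constrained, compute_uv=False)
print("unconstrained singular values:", np.round(sv_free, 3))
print("projected singular values:    ", np.round(sv_constrained, 3))
```

In this sketch the projected state keeps all singular values at 1 by construction, whereas nothing prevents the singular values of the unconstrained state from drifting. How OSA computes the update and projection in practice is described in the paper and the linked repository.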

 

 

Figure 1: Illustration of different spatiotemporal memory update paradigms for echocardiography video segmentation. Top: Memory bank methods rely on sparse key-frame retrieval. Middle: Linear recurrent models perform unconstrained element-wise updates. Bottom: Our OSA enforces a Stiefel manifold constraint, yielding stable and drift-free tracking across the cardiac cycle.

 

 

Figure 2: Challenges in echocardiography video segmentation. (a–b) Red boxes indicate speckle noise and blue boxes indicate indistinct or blurred contours; (c–f) Large shape and scale variations across the cardiac cycle.

 

 

Figure 3: Visual comparison with state-of-the-art methods on CAMUS (top two rows) and EchoNet-Dynamic (bottom two rows). Green, red, and yellow regions represent the ground truth, prediction, and overlapping regions, respectively.

 

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (No. 62273241), the Natural Science Foundation of Guangdong Province, China (No. 2024A1515011946), the Shenzhen Research Foundation for Basic Research, China (No. JCYJ20250604181940054), and a grant under the Hong Kong RGC Collaborative Research Fund (project no. C5055-24G).

 
