NO.226 The Power of Geometric Algebra in Modern Computer Vision

May 19 - 22, 2025 (Check-in: May 18, 2025)

Organizers

Diego Thomas
- Kyushu University, Japan
Vincent Nozick
- Université Gustave Eiffel, France
Takuya Funatomi
- Nara Institute of Science and Technology, Japan

Overview

In the last decade, deep learning has become ubiquitous in Computer Vision, where it is used for tasks ranging from object recognition to 3D and 4D capture. The vast majority of recent papers published in top-level venues such as the Conference on Computer Vision and Pattern Recognition (CVPR), the International Conference on Computer Vision (ICCV) or the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) rely on deep neural models formulated in linear algebra. This is because the basic operations in the neurons are simple (scalar multiplication and addition) and because many tools already exist, such as automatic differentiation in PyTorch. However, for some tasks such as 3D transformation, other algebras such as Clifford Algebra (also called Geometric Algebra) have proven to be more efficient. Deep neural networks formulated with Geometric Algebra have been proposed recently, with promising applications in Computer Graphics. However, none has been applied yet to solve Computer Vision tasks.

The meeting will focus on new settings and applications of AI models defined with Geometric Algebra for tasks in Computer Vision such as 3D and 4D capture. The meeting will invite internationally leading and renowned researchers in the fields of Geometric Algebra, Computer Vision and Computer Graphics, whose contributions are likely to be essential to the field. We anticipate that the meeting will foster discussions and new ideas to open a new research area at the frontier between Geometric Algebra, Computer Vision and Computer Graphics. The meeting will also be a wonderful opportunity to strengthen existing collaborations and create new ones between top-level researchers in Japan and abroad.

The meeting will focus on the following two topics.

1. Deep learning with Geometric Algebra

Equipping deep neural networks with geometric priors has achieved several successes ([1], [2]) in modeling geometric transformations of 3D shapes. Some very recent Geometric Algebra based networks, such as Clifford Neural Layers, embed powerful properties like equivariance (i.e. the output transforms consistently when the input is transformed by an element of the 3D symmetry group) that strongly simplify some standard networks. Other networks, like Geometric Clifford Algebra Networks (GCANs), provide several interesting advantages over classical linear algebra that could open new possibilities in the field of Computer Vision: (1) GCANs naturally and efficiently encode the transformations and the invariant elements of classic geometries. (2) In Geometric Algebra, objects transform covariantly with transformations of space. This means that a single function can transform multiple types of objects, including vectors, points, lines and planes. (3) Geometric Algebra generalizes over dimensions in the sense that transformations and objects are constructed consistently regardless of the dimensionality of the space. The objective of this seminar is to give an overview of recent successes of deep learning with geometric priors and to revisit classical Computer Vision problems from a Geometric Algebra perspective. A first target will be to clarify the formulation of deep neural networks using Geometric Algebra and the existing tools to implement them. A second objective will be to find promising future research directions and concrete applications. For this purpose, the seminar will invite international experts in the domains of Geometric Algebra, Computer Vision and Computer Graphics.
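
Point (2) above, that a single function can transform several types of objects, can be illustrated with a minimal sketch. It assumes the 3D Euclidean algebra with basis blades indexed by bitmask (bit 0 = e1, bit 1 = e2, bit 2 = e3) and multivectors stored as plain lists of 8 coefficients. The helper names (`blade_sign`, `gp`, `reverse`, `rotor`, `sandwich`) are hypothetical and chosen for illustration; they are not taken from GCANs or from any particular Geometric Algebra library.

```python
import math

def blade_sign(a: int, b: int) -> int:
    """Sign obtained when reordering the product of basis blades a and b
    into canonical order (Euclidean metric, every e_i squares to +1)."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1

def gp(x, y):
    """Geometric product of two multivectors (lists of 8 coefficients)."""
    out = [0.0] * 8
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue
        for j, yj in enumerate(y):
            if yj != 0.0:
                out[i ^ j] += blade_sign(i, j) * xi * yj
    return out

def reverse(x):
    """Reverse of a multivector: grades 2 and 3 change sign."""
    return [-c if bin(i).count("1") in (2, 3) else c for i, c in enumerate(x)]

def rotor(plane_blade: int, angle: float):
    """Rotor cos(angle/2) - sin(angle/2) * B for a basis bivector B."""
    r = [0.0] * 8
    r[0] = math.cos(angle / 2)
    r[plane_blade] = -math.sin(angle / 2)
    return r

def sandwich(r, x):
    """Apply rotor r to any multivector x via the product r x reverse(r)."""
    return gp(gp(r, x), reverse(r))

# The same function rotates a vector (e1) and a bivector (e23),
# here by 90 degrees in the e1e2 plane.
R = rotor(0b011, math.pi / 2)
e1 = [0.0] * 8
e1[0b001] = 1.0
e23 = [0.0] * 8
e23[0b110] = 1.0
rotated_vector = sandwich(R, e1)
rotated_bivector = sandwich(R, e23)
```

Under these conventions `rotated_vector` is e2 and `rotated_bivector` is -e13, up to floating-point rounding. The point of the sketch is that `sandwich` never inspects the grade of its argument: the same code path handles both objects.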

2. 3D and 4D capture with the support of Geometric AI

Research on Neural Radiance Fields (NeRF) has recently gained significant attention in 3D Vision and Computer Graphics. Notably, the combination of NeRF with Signed Distance Fields (SDF) has demonstrated impressive results in multi-view 3D scene reconstruction. There is currently a surge of work on adapting these established concepts to the reconstruction of dynamic scenes. In this context, neural warp fields have been proposed to capture non-rigid 3D shape transformations across multiple images. While recent findings show promise, these methods are mostly effective for small motions and often require long training times. Furthermore, these models have limited interpolation and extrapolation capabilities. Because Geometric Algebra performs especially well on such interpolation tasks, it is a promising route to new solutions that improve on the current state of the art.
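
As a small illustration of why interpolation is convenient in this setting, the sketch below interpolates a 3D rotation by taking a fractional power of its rotor. It assumes a unit rotor stored as a tuple `(s, b12, b13, b23)`, so that the rotor equals cos(phi) + sin(phi) B for a unit bivector B and its t-th power is cos(t phi) + sin(t phi) B. The function name `rotor_power` and the storage layout are hypothetical, and this is not a method taken from the warp-field literature mentioned above.

```python
import math

def rotor_power(rotor, t):
    """Return rotor**t for a unit 3D rotor (s, b12, b13, b23).

    t = 0 gives the identity and t = 1 gives the rotor itself; values in
    between rotate by a fraction of the angle in the same plane.
    """
    s, *b = rotor
    norm_b = math.sqrt(sum(c * c for c in b))
    if norm_b < 1e-12:
        # Assumption: a rotor with no bivector part is treated as the identity.
        return (1.0, 0.0, 0.0, 0.0)
    half_angle = math.atan2(norm_b, s)
    k = math.sin(t * half_angle) / norm_b
    return (math.cos(t * half_angle), *(k * c for c in b))

# Halfway point of a 90-degree rotation in the e1e2 plane.
quarter_turn = (math.cos(math.pi / 4), -math.sin(math.pi / 4), 0.0, 0.0)
halfway = rotor_power(quarter_turn, 0.5)
```

The interpolated rotor stays on the unit sphere for every `t`, so no re-normalization or re-orthogonalization step is needed, unlike element-wise interpolation of rotation matrices.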

The primary objective of this seminar is to provide a comprehensive overview of existing alternatives for modeling 3D non-rigid deformations. The initial focus will involve exploring novel formulations of 3D neural warp fields capable of learning to deform 3D shapes and volumes based on input images and videos. A secondary aim is to identify exciting new applications of Geometric AI, specifically employing deep learning with Geometric Algebra, for various tasks in Computer Vision. This involves elucidating the advantages of geometric algebra over classical linear algebra and highlighting the available tools for implementing concrete solutions.

To achieve these goals, the seminar aims to bring together participants from diverse backgrounds. This includes experts in Geometric Algebra with a keen interest in its applications in Computer Vision, as well as specialists in Computer Vision intrigued by alternative approaches to modeling 3D shape transformations and manipulations.

References

[1] Brehmer, Johann, Pim De Haan, Sönke Behrends, and Taco Cohen. "Geometric Algebra Transformers." arXiv preprint arXiv:2305.18415 (2023).

[2] Brandstetter, Johannes, Rianne van den Berg, Max Welling, and Jayesh K. Gupta. "Clifford neural layers for PDE modeling." arXiv preprint arXiv:2209.04934 (2022).
