GLoW Reimplementation
Clean reimplementation of the Global Latent Workspace architecture in PyTorch, with modular domain encoders and a shared latent broadcast space.
Context
The Global Latent Workspace (GLoW) is an architecture developed at the CerCo lab (CNRS, Toulouse) under Rufin VanRullen’s ERC Advanced Grant. It is inspired by Global Workspace Theory, a cognitive neuroscience framework proposing that consciousness arises from a shared “workspace” through which specialized brain modules broadcast information to one another.
The original implementation is built on the Shimmer framework. This project is my clean-room reimplementation from the paper, written to understand every component in depth and to explore extensions.
Architecture
The system consists of:
- Domain-specific encoders — each modality (vision, language, proprioception) has its own encoder that maps raw data into a latent representation
- A global workspace — a shared latent space where domain representations are projected, fused, and broadcast back
- Contrastive alignment — domains are aligned in the workspace using contrastive losses to ensure cross-modal coherence
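
The three components above can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the code in this repository or in Shimmer: the class name `GlobalWorkspace`, the methods `project`, `fuse`, and `broadcast`, the use of single linear layers, mean fusion, the latent sizes, and the temperature of 0.07 are all hypothetical choices. The inputs stand in for the outputs of the domain-specific encoders, and the loss is a generic symmetric InfoNCE-style contrastive loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalWorkspace(nn.Module):
    """Toy workspace: per-domain projections into and out of a shared latent space."""

    def __init__(self, domain_dims, workspace_dim=12):
        super().__init__()
        # One projection per domain into the workspace, and one back out.
        # Single linear layers are an assumption made for brevity.
        self.to_workspace = nn.ModuleDict(
            {name: nn.Linear(dim, workspace_dim) for name, dim in domain_dims.items()}
        )
        self.from_workspace = nn.ModuleDict(
            {name: nn.Linear(workspace_dim, dim) for name, dim in domain_dims.items()}
        )

    def project(self, latents):
        # Map each domain latent (the output of a domain encoder) into the workspace.
        return {name: self.to_workspace[name](z) for name, z in latents.items()}

    def fuse(self, projected):
        # Assumption: fusion is a plain mean over the domains that are present.
        return torch.stack(list(projected.values()), dim=0).mean(dim=0)

    def broadcast(self, state):
        # Send the fused workspace state back to every domain's latent space.
        return {name: decoder(state) for name, decoder in self.from_workspace.items()}

    def forward(self, latents):
        return self.broadcast(self.fuse(self.project(latents)))


def contrastive_loss(a, b, temperature=0.07):
    """Symmetric InfoNCE-style loss: row i of `a` should match row i of `b`."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature  # (batch, batch) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


torch.manual_seed(0)
gw = GlobalWorkspace({"vision": 32, "language": 16})

# Random tensors stand in for paired encoder outputs (batch of 8).
batch = {"vision": torch.randn(8, 32), "language": torch.randn(8, 16)}

projected = gw.project(batch)
loss = contrastive_loss(projected["vision"], projected["language"])
reconstructions = gw(batch)
```

Projecting only one domain and then calling `broadcast` gives a simple form of cross-modal translation: the workspace state computed from vision alone is decoded into the language latent space.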
Current status
Working on the base architecture with two modalities (vision + language). Next steps include adding proprioceptive encoders and testing cross-modal generation.
Why this matters
If we want machines that understand the world the way brains do, through integrated, multimodal internal models, we need architectures that go beyond single-task learning. GLoW is one of the more promising frameworks for this, and reimplementing it is the most direct way I know to understand its strengths and limitations.