Lumai Unveils World’s First Optical Computing System for Real-Time, Billion-Parameter LLM Inference

Posted by GoPhotonics


Lumai, the optical compute company addressing scalable AI, has announced its Lumai Iris inference server, the world’s first optical computing system to successfully run billion-parameter large language models (LLMs) in real time. This marks a milestone in AI infrastructure, demonstrating for the first time the commercial viability of optical compute for large-scale AI inference workloads.

Lumai Iris servers accelerate inference workloads using light instead of silicon-based processing. Lumai’s optical compute system enables faster inference, higher execution efficiency, and up to 90% lower energy consumption than conventional architectures, while operating far more sustainably than traditional GPU-based systems. Lumai Iris consists of a family of servers: Nova, Aura, and Tetra. Lumai Iris Nova, the first server in the family, is available today for evaluation by hyperscalers, neo-clouds, enterprises, and research institutions.

Powering the Inference Era

AI has entered a new phase where deployment, not training, is defining real-world impact. However, as inference workloads surge, data centers are running up against hard power and scalability limits, and traditional silicon-based architectures are struggling to keep pace.

The Energy Wall

According to the International Energy Agency, global data center power demand will double by 2030. The finite power available to data centers is forcing the industry to seek more efficient approaches to compute. The Lumai Iris family of inference servers addresses this challenge with a new computing paradigm, delivering dramatically more performance per kilowatt to enable AI scaling without the prohibitive energy and cost burdens of existing systems.
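To put the "up to 90% lower energy consumption" figure quoted above in performance-per-energy terms, the short sketch below converts an energy reduction into the implied gain in work done per unit of energy. It assumes the same workload is completed at the same throughput on both systems, which the announcement does not state; the function name `efficiency_gain` is purely illustrative.

```python
def efficiency_gain(energy_reduction: float) -> float:
    """Return work-per-unit-energy relative to a baseline system.

    `energy_reduction` is the fractional energy saving for the same
    workload (0.9 means 90% less energy). Assumes equal throughput,
    which is an assumption of this sketch, not a published figure.
    """
    if not 0.0 <= energy_reduction < 1.0:
        raise ValueError("energy_reduction must be in [0, 1)")
    return 1.0 / (1.0 - energy_reduction)


for reduction in (0.5, 0.75, 0.9):
    gain = efficiency_gain(reduction)
    print(f"{reduction:.0%} less energy -> {gain:.1f}x work per unit energy")
```

Under these assumptions, a 90% energy reduction corresponds to roughly ten times as much work per unit of energy, which is the upper bound implied by the "up to" wording.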

The Silicon Ceiling

Traditional silicon architectures are hitting fundamental physical limits in scaling, power, and thermal efficiency. Each new generation of silicon delivers only small improvements while requiring significantly more power and cost.

“As the industry transitions into the inference era, we are simultaneously crossing the threshold into the post-silicon era,” said Dr. Xianxin Guo, CEO and Co-Founder of Lumai. “By shifting the computation paradigm from electrons to photons, Lumai can deliver an order-of-magnitude increase in performance with significant energy savings.”

A New Architecture for AI Compute

Optical computing enables radically more efficient execution of core AI operations. Lumai’s optical computing technology, born from years of research at the University of Oxford, uses light in a three-dimensional volume to overcome the two-dimensional constraints of conventional chips. Massive spatial parallelism lets the system execute millions of operations simultaneously, resulting in low-cost, high token throughput for compute-bound workloads.

Lumai’s technology also excels in the prefill stage of disaggregated inference architectures, processing tokens at maximum scale and efficiency. Iris Nova runs real-time inference on Llama 8B and 70B using a hybrid processor. Its sophisticated hybrid architecture combines digital processing for system control and software with an optical tensor engine that performs core mathematical operations. This approach ensures seamless integration into data centers.
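Why prefill counts as a compute-bound workload can be illustrated with a rough arithmetic-intensity estimate for a single dense layer. The sketch below is a generic back-of-the-envelope model, not a description of Lumai's hardware: the hidden size, prompt length, and 2-byte values are hypothetical, and it ignores attention, caching, and any on-chip reuse.

```python
def linear_layer_intensity(tokens: int, d_in: int, d_out: int,
                           bytes_per_value: int = 2) -> float:
    """Estimate FLOPs per byte moved for one dense layer.

    `tokens` is how many tokens are processed together: many during
    prefill (the whole prompt), one per step during decode.
    """
    # One multiply and one add per weight, per token.
    flops = 2 * tokens * d_in * d_out
    # Weights are read once; activations are read in and written out.
    weight_bytes = d_in * d_out * bytes_per_value
    activation_bytes = tokens * (d_in + d_out) * bytes_per_value
    return flops / (weight_bytes + activation_bytes)


HIDDEN = 4096        # hypothetical layer width
PROMPT_TOKENS = 2048  # hypothetical prompt length

prefill = linear_layer_intensity(PROMPT_TOKENS, HIDDEN, HIDDEN)
decode = linear_layer_intensity(1, HIDDEN, HIDDEN)

print(f"prefill: {prefill:.0f} FLOPs/byte")
print(f"decode:  {decode:.2f} FLOPs/byte")
```

In this simplified model, prefill performs on the order of a thousand times more arithmetic per byte of data moved than single-token decode, so its speed depends mostly on raw matrix-multiplication throughput. That is the kind of work a highly parallel tensor engine is suited to.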

The Advanced Research and Invention Agency (ARIA), a UK government-backed funder of advanced AI and other transformative innovation, also commented on the announcement. “The demands on existing AI processors necessitate an urgent search for alternative scaling pathways,” said Suraj Bramhavar, Program Director at ARIA. “Lumai is leading the charge in demonstrating that optical processors could provide one such pathway, and ARIA is excited to partner with them to explore the shift beyond our traditional digital computing paradigm.”

Availability

The Lumai Iris Nova inference server is available now for evaluation. Future systems in the Iris family will extend performance and efficiency further, supporting broader deployment across hyperscale and enterprise environments.


