Welcome to Spyre Inference
IBM Spyre is the first production-grade Artificial Intelligence Unit (AIU) accelerator to come out of the IBM Research AIU family. It is part of a long-term strategy to develop novel architectures and full-stack technology solutions for generative AI. Building on IBM's internal AIU research, Spyre delivers a scalable, efficient architecture for accelerating AI in enterprise environments.
spyre-inference is a vLLM platform plugin that integrates IBM Spyre accelerators with vLLM through the torch-spyre PyTorch backend. It is the successor to sendnn-inference and uses PyTorch's native Inductor compiler backend through vLLM's plugin architecture.
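To illustrate what "platform plugin" means in practice, here is a minimal sketch that lists the platform plugins visible in the current Python environment. It relies on vLLM discovering platform plugins through the `vllm.platform_plugins` entry-point group. The helper name `list_platform_plugins` is hypothetical, and the name under which spyre-inference registers itself is not stated here, so the sketch prints whatever it finds rather than assuming a specific entry. It needs only the standard library (Python 3.9+).

```python
from importlib.metadata import entry_points

# Entry-point group that vLLM scans for out-of-tree platform plugins.
PLATFORM_PLUGIN_GROUP = "vllm.platform_plugins"


def list_platform_plugins(group: str = PLATFORM_PLUGIN_GROUP) -> dict[str, str]:
    """Return {entry-point name: "module:attr" target} for the given group.

    Hypothetical helper for illustration; it is not part of vLLM or spyre-inference.
    """
    try:
        eps = entry_points(group=group)  # Python 3.10+ selection API
    except TypeError:
        eps = entry_points().get(group, [])  # Python 3.9 returns a dict of lists
    return {ep.name: ep.value for ep in eps}


if __name__ == "__main__":
    plugins = list_platform_plugins()
    if not plugins:
        print("No vLLM platform plugins found in this environment.")
    for name, target in sorted(plugins.items()):
        print(f"{name} -> {target}")
```

If the plugin is installed, it should appear in this listing; an empty result suggests the package is not installed in the active environment.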
For more information, check out the following: