Welcome to Spyre Inference

IBM Spyre is the first production-grade Artificial Intelligence Unit (AIU) accelerator born out of the IBM Research AIU family. It is part of a long-term strategy of developing novel architectures and full-stack technology solutions for the emerging space of generative AI. Spyre builds on the foundation of IBM's internal AIU research and delivers a scalable, efficient architecture for accelerating AI in enterprise environments.

spyre-inference is a vLLM platform plugin that integrates IBM Spyre accelerators with vLLM via the torch-spyre PyTorch backend. It is the successor to sendnn-inference and uses PyTorch's native Inductor compiler backend through vLLM's plugin architecture.
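
As a rough illustration of what "platform plugin" means in practice: vLLM discovers out-of-tree platforms through Python entry points, and to the best of our knowledge the entry-point group it scans is `vllm.platform_plugins`. The sketch below uses only the standard library to list whatever is registered under that group in the current environment. The helper name `list_platform_plugins` is hypothetical, and the name under which spyre-inference registers itself is not stated here, so the script prints whatever it finds rather than assuming a specific entry.

```python
from importlib.metadata import entry_points

# Assumption: vLLM scans this entry-point group for platform plugins.
PLATFORM_PLUGIN_GROUP = "vllm.platform_plugins"


def list_platform_plugins(group=PLATFORM_PLUGIN_GROUP):
    """Return the sorted names of entry points registered under `group`."""
    try:
        # Python 3.10+ accepts selection keywords such as `group`.
        eps = entry_points(group=group)
    except TypeError:
        # Older Pythons return a dict keyed by group name.
        eps = entry_points().get(group, [])
    return sorted(ep.name for ep in eps)


if __name__ == "__main__":
    names = list_platform_plugins()
    if names:
        print("Registered vLLM platform plugins:", ", ".join(names))
    else:
        print("No vLLM platform plugins found in this environment.")
```

If the plugin package is installed, its entry point should appear in this list; an empty result usually means the package is missing from the active environment.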

For more information, check out the following:

📚 Meet the IBM Artificial Intelligence Unit
📽️ AI Accelerators: Transforming Scalability & Model Efficiency
🚀 Spyre Accelerator for IBM Z
🚀 Spyre Accelerator for IBM POWER