Untether AI Releases Early Access to imAIgine Software Development Kit Supporting speedAI Inference Acceleration Solutions
July 18, 2024 - 1:22 AM
Business Wire (English)
Untether AI®, the leader in energy-centric AI inference
acceleration, today announced the availability of early access (EA)
of its imAIgine® Software Development Kit (SDK) supporting the
speedAI® inference acceleration solutions.
The imAIgine SDK provides a push-button flow, streamlining the
process of converting trained neural network models into optimized,
inference-ready models to be run on speedAI acceleration solutions.
This latest EA release supports the speedAI family of devices and
PCIe accelerator cards, which set a new industry benchmark for
energy efficiency and deliver 2,000 TFLOPS of AI inference
performance per device.
“Providing the early access version of the imAIgine SDK enables
users to prepare their neural networks for the upcoming shipment of
speedAI devices and cards,” said Philip Lewer, Sr. Director of
Product at Untether AI. “With an extensive model garden, broad
kernel support, automated compilation, and sophisticated
analysis tools, this EA release gives users everything they need to
easily deploy their models on the revolutionary speedAI family of
inference acceleration solutions.”
Push-button flow for simple model deployment
The imAIgine SDK provides an automated path to running neural
networks on Untether AI’s inference acceleration solutions, with
push-button quantization, optimization, physical allocation, and
multi-chip partitioning. Supporting either TensorFlow or PyTorch,
the SDK quantizes, lowers, physically allocates, and runs models on
speedAI hardware with a few simple Python commands, in a matter of
minutes. With a comprehensive model garden library and kernel
support, users can
quickly run classification, object detection, semantic
segmentation, or natural language processing (NLP) models on
speedAI hardware. Sophisticated, automated quantization techniques
convert the neural network to the preferred datatype. Where the
utmost accuracy is required, post-quantization training (PQT) and
knowledge distillation algorithms are available to recover any
accuracy lost during quantization. During compilation, the imAIgine
SDK performs layer-fusion optimizations, graph lowering, kernel
mapping, and physical allocation to produce an optimal
implementation.
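To illustrate the quantization step described above: the sketch below is not the imAIgine SDK's API (which this release does not show); it is a plain-Python illustration of the standard symmetric int8 post-training quantization that such flows typically apply to a weight tensor, with function names chosen here purely for illustration.

```python
# Conceptual sketch of symmetric int8 post-training quantization --
# the kind of datatype conversion an inference SDK applies to weights.
# NOT the imAIgine SDK API; all names here are illustrative only.

def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to nearest integer, then clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values, e.g. for accuracy checks."""
    return [v * scale for v in q]

weights = [0.4, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Per-weight error is bounded by half a quantization step (scale / 2),
# which is why accuracy-recovery techniques matter for small values.
assert all(abs(a - w) <= scale / 2 + 1e-12 for a, w in zip(approx, weights))
```

Real flows add per-channel scales, activation calibration, and the accuracy-recovery training mentioned above, but the core float-to-int8 mapping follows this pattern.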
Power-user flow for low-level optimizations
With the power-user flow, users can directly develop optimized
“bare metal” kernels for the over 1,400 RISC-V processors and over
350,000 at-memory compute processing elements in speedAI devices.
Analogous to CUDA kernels, but written in familiar C/C++, these
kernels are compiled with a modified version of LLVM, enhanced to
take advantage of the more than 30 custom instructions Untether AI
has added to the instruction set for its ultra-efficient at-memory
compute architecture. Users can then manually place the kernels in
any topology on the memory banks of the speedAI spatial
architecture.
Extensive suite of analysis tools, including virtual hardware
The imAIgine SDK includes several tools for analyzing how networks
run on speedAI devices, providing a virtual-hardware view before
physical devices arrive. The Model Explorer
shows the entire floorplan of how the neural network is mapped to
the silicon, enabling interactive inspection of connection
topology, socket depth, and performance estimates. This can be
enhanced by the Analysis Dashboard to provide information on
processor activity, packet exchanges, and utilization. All of these
tools provide a virtual hardware environment to help guide the user
for optimal efficiency and performance.
Untether AI invites prospective customers and partners to
explore the transformative potential of speedAI and the imAIgine
SDK. To gain early access and start developing with the imAIgine
SDK for the speedAI family, please visit
https://www.untether.ai/imaigine-sdk-early-access-program/ and
request download privileges.
About Untether AI
Untether AI® provides energy-centric AI inference acceleration
from the edge to the cloud, supporting any type of neural network
model. With its at-memory compute architecture, Untether AI has
solved the data movement bottleneck that costs energy and
performance in traditional CPUs and GPUs, resulting in
high-performance, low-latency neural network inference acceleration
without sacrificing accuracy. Untether AI embodies its technology
in runAI® and speedAI® devices, tsunAImi® acceleration cards, and
its imAIgine® Software Development Kit. More information can be
found at www.untether.ai.
All references to Untether AI trademarks are
the property of Untether AI. All other trademarks mentioned herein
are the property of their respective owners.
View source
version on businesswire.com: https://www.businesswire.com/news/home/20240717282026/en/
Michelle Clancy Fuller, Cayenne Global, LLC
Michelle.clancy@cayennecom.com 1-503-702-4732