TensorOpera and Aethir Team Up to Advance Massive-Scale LLM Training on Decentralized Cloud
June 20, 2024 - 11:28 PM
Business Wire (English)
TensorOpera, the company providing “Your Generative AI Platform
at Scale,” has partnered with Aethir, a distributed cloud
infrastructure provider, to accelerate training of its newest
foundation model, TensorOpera Fox-1, marking the first
massive-scale LLM training use case on a decentralized physical
infrastructure network.
This press release features multimedia. View
the full release here:
https://www.businesswire.com/news/home/20240620344957/en/
(Graphic: Business Wire)
Introduced last week, TensorOpera Fox-1 is a cutting-edge
open-source small language model (SLM) with 1.6 billion parameters,
outperforming other models in its class from tech giants like
Apple, Google, and Alibaba. This decoder-only transformer was
trained from scratch on three trillion tokens using a novel 3-stage
curriculum. It features an innovative architecture that is 78%
deeper than comparable models such as Google's Gemma 2B and
surpasses competitors in standard LLM benchmarks like GSM8k and
MMLU, even with significantly fewer parameters.
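The “78% deeper” claim can be sanity-checked with a quick calculation. The layer counts below are assumptions drawn from the models' public reports (32 decoder layers for Fox-1, 18 for Gemma 2B), not from this release:

```python
# Arithmetic behind the "78% deeper" claim.
# Layer counts are assumptions from the models' public reports,
# not stated in this press release.
fox1_layers = 32    # TensorOpera Fox-1 1.6B decoder layers (assumed)
gemma_layers = 18   # Google Gemma 2B decoder layers (assumed)

depth_increase = (fox1_layers - gemma_layers) / gemma_layers
print(f"Fox-1 is {depth_increase:.0%} deeper than Gemma 2B")
# prints "Fox-1 is 78% deeper than Gemma 2B"
```

Under these assumed layer counts, (32 - 18) / 18 ≈ 0.78, matching the figure quoted above.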
The partnership with Aethir equips TensorOpera with advanced GPU
resources necessary for training Fox-1. Aethir's collaboration with
NVIDIA Cloud Partners, Infrastructure Funds, and various
enterprise-grade hardware providers has established a global,
large-scale GPU cloud. This network ensures the delivery of
cost-effective, scalable GPU resources that deliver the high
throughput, large memory capacity, and efficient parallel
processing that LLM training demands. With the support of Aethir’s
decentralized cloud infrastructure, TensorOpera obtains the tools
needed for streamlined AI development requiring high network
bandwidth and ample GPU power.
Through this collaboration, TensorOpera is further integrating a
pool of GPU resources from Aethir that can be used seamlessly via
TensorOpera's AI platform for a variety of jobs, such as model
deployment and serving, fine-tuning, and full training. Aethir’s
distributed GPU cloud network lets GPU capacity be scaled up or
down on demand as workloads change. Together,
Aethir and TensorOpera aim to empower the next generation of large
language model (LLM) training and give AI developers the assets
they need to create powerful models and applications.
“I am thrilled about our partnership with Aethir,” said Salman
Avestimehr, Co-Founder and CEO of TensorOpera. “In the dynamic
landscape of generative AI, the ability to efficiently scale up and
down during various stages of model development and in-production
deployment is essential. Aethir’s decentralized infrastructure
offers this flexibility, combining cost-effectiveness with
high-quality performance. Having experienced these benefits
firsthand during the training of our Fox-1 model, we decided to
deepen our collaboration by integrating Aethir's GPU resources into
TensorOpera's AI platform to empower developers with the resources
necessary for pioneering the next generation of AI
technologies.”
Aethir’s operational model is based on a globally distributed
network of top-shelf GPUs capable of effectively servicing
enterprise clients in the AI and machine learning industry
regardless of their physical locations. To effectively provide
lag-free, highly scalable GPU power worldwide, Aethir’s GPU
resources are decentralized across a multitude of locations in
smaller clusters. Instead of pooling resources in a few massive
data centers, as traditional centralized cloud providers do,
Aethir distributes its infrastructure to cover
the network’s edge and cut the physical distance between GPU
resources and end-users.
"TensorOpera is the premier AI platform for LLMs and generative
AI applications, and we are excited to be their supplier of
enterprise GPU infrastructure," said Kyle Okamoto, CTO of
Aethir.
“Aethir is firmly dedicated to supporting the AI and machine
learning sector in developing and launching groundbreaking
solutions that can improve the everyday lives of people around the
world. TensorOpera provides developers with a comprehensive AI
platform, while Aethir will provide them with a steady supply of
GPU power that can handle even the most demanding LLM training and
AI inference. Thanks to our vast decentralized cloud
infrastructure, Aethir is capable of powering large-scale AI
development and deployment worldwide,” said Daniel Wang, Aethir’s
CEO.
We invite generative AI model builders and application
developers to the TensorOpera AI platform, where they can seamlessly
train, deploy, and serve their models using high-quality H100 GPUs
newly available through this partnership.
About TensorOpera, Inc.
TensorOpera, Inc. (formerly FedML, Inc.) is an innovative AI
company based in Silicon Valley, specifically Palo Alto,
California. TensorOpera specializes in developing scalable and
secure AI platforms, offering two flagship products tailored for
enterprises and developers. The TensorOpera® AI Platform, available
at TensorOpera.ai, is a comprehensive generative AI platform for
model deployment and serving, model training and fine-tuning, AI
agent creation, and more. It supports launching training and
inference jobs on a serverless/decentralized GPU cloud,
experiment tracking for distributed training, and enhanced
security and privacy measures. The TensorOpera® FedML Platform,
accessible at FedML.ai, leads in federated learning and analytics
with zero-code implementation. It includes a lightweight,
cross-platform Edge AI SDK suitable for edge GPUs, smartphones, and
IoT devices. Additionally, it offers a user-friendly MLOps platform
to streamline decentralized machine learning and deployment in
real-world applications. Founded in February 2022, TensorOpera has
quickly grown to support a large number of enterprises and
developers worldwide.
About Aethir
Aethir is the only enterprise-grade, AI-focused GPU-as-a-service
provider in the market. Its decentralized cloud computing
infrastructure connects GPU providers with enterprise clients who
need powerful GPU chips for professional AI/ML tasks. Thanks to a
constantly growing network of over 40,000 top-shelf GPUs, including
over 3,000 NVIDIA H100s, Aethir can provide enterprise-grade GPU
computing wherever it’s needed, at scale.
Backed by leading Web3 investors like Framework Ventures, Merit
Circle, Hashkey, Animoca Brands, Sanctor Capital, Infinity Ventures
Crypto (IVC), and others, with over $140M in funds raised for the
ecosystem, Aethir is paving the way for the future of decentralized
computing.
contact@tensoropera.com