Machine Learning At Your Service

by Hugging Face

Easily deploy Transformers, Diffusers or any model on dedicated, fully managed infrastructure. Keep your costs low with our secure, compliant and flexible production solution.

No Hugging Face account? Sign up!

One-click inference deployment

Import your favorite model from the Hugging Face Hub, or browse our catalog of hand-picked, ready-to-deploy models!

Llama-2-70B-chat-GPTQ

TGI
Text Generation
TheBloke

A 70-billion parameter model from Meta, optimized for dialogue. Generates helpful, safe responses and outperforms other open-source chat LLMs.

$8 / h

Falcon-180B-Chat-GPTQ

TGI
Text Generation
TheBloke

A 180-billion parameter conversational AI model optimized for fast inference through an efficient architecture. Freely available under TII LICENSE.

$8 / h

Text-to-Image
stabilityai

Latent Diffusion model from Stability AI for high-quality, diverse image generation based on short text prompts provided by the user.

$0.8 / h

Customer Stories

Learn how leading AI teams use Inference Endpoints to deploy their models

Endpoints for Music

Musixmatch is the world’s leading music data company.

Use Case

Custom text embeddings generation pipeline

Models Deployed
  • distilbert-base-uncased-finetuned-sst-2-english
  • facebook/wav2vec2-base-960h
  • Custom model based on sentence transformers
The coolest thing was how easy it was to define a complete custom interface from the model to the inference process. It just took us a couple of hours to adapt our code, and have a functioning and totally custom endpoint.
Andrea Boscarino
Data Scientist at Musixmatch

Pricing

Choose a plan that fits your needs

Self-Serve

Pay as you go when using Inference Endpoints

  • Pay for what you use, per minute
  • Starting as low as $0.06/hour
  • Billed monthly
  • Email support
See Pricing
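
To get a feel for per-minute billing, here is a rough sketch that estimates what an endpoint might cost for a given running time. It assumes the charge is simply the hourly rate prorated by the minute, with no minimum charge or rounding rule; the function name `estimate_endpoint_cost` is hypothetical, and the pricing page remains the reference for actual billing.

```python
def estimate_endpoint_cost(hourly_rate_usd: float, minutes_running: int) -> float:
    """Estimate the cost in USD of one endpoint billed per minute.

    Assumption: cost = (hourly rate / 60) * minutes, with no minimum
    charge and no rounding rule beyond display precision.
    """
    if hourly_rate_usd < 0 or minutes_running < 0:
        raise ValueError("rate and running time must be non-negative")
    per_minute_rate = hourly_rate_usd / 60
    # Round to 4 decimals so float noise does not leak into the estimate.
    return round(per_minute_rate * minutes_running, 4)


if __name__ == "__main__":
    # 90 minutes on the lowest listed rate ($0.06/hour).
    print(estimate_endpoint_cost(0.06, 90))
    # 90 minutes on a catalog model listed at $8/hour.
    print(estimate_endpoint_cost(8, 90))
```

Under these assumptions, an endpoint that runs for only part of an hour costs the matching fraction of its hourly rate.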

Enterprise

Get a custom quote and premium support

  • Lower marginal costs based on volume
  • Uptime guarantees
  • Custom annual contracts
  • Dedicated support, SLAs
Request a Quote