Machine Learning At Your Service
by Hugging Face
Easily deploy Transformers, Diffusers or any model on dedicated, fully managed infrastructure. Keep your costs low with our secure, compliant and flexible production solution.
No Hugging Face account? Sign up!
One-click inference deployment
Import your favorite model from the Hugging Face Hub or browse our catalog of hand-picked, ready-to-deploy models!
Customer Stories
Learn how leading AI teams use Inference Endpoints to deploy their models
Endpoints for Music
Musixmatch is the world’s leading music data company
Custom text embeddings generation pipeline
- distilbert-base-uncased-finetuned-sst-2-english
- facebook/wav2vec2-base-960h
- Custom model based on sentence transformers
The coolest thing was how easy it was to define a complete custom interface from the model to the inference process. It just took us a couple of hours to adapt our code, and have a functioning and totally custom endpoint.
Endpoints for Health
Phamily improves patient health with intelligent care management
HIPAA-compliant secure endpoints for text classification
- Custom model based on text-classification (MPNET)
- Custom model based on text-classification (BERT)
It took off a week's worth of developer time. Thanks to Inference Endpoints, we now basically spend all of our time on R&D, not fiddling with AWS. If you haven't already built a robust, performant, fault-tolerant system for inference, then it's pretty much a no-brainer.
Endpoints for Search
Pinecone is the vector database for intelligent search
Autoscaling endpoints for fast embeddings generation
- Different sentence transformers and embedding models
We were able to choose an off-the-shelf model that's very common for our customers to get started with and set it so that it can be configured to handle over 100 requests per second just with a few button clicks. With the release of the Hugging Face Inference Endpoints, we believe there's a new standard for how easy it can be to go build your first vector embedding based solution, whether it be semantic search or a question answering system.
Endpoints for Videos
Waymark is an AI-powered video creator
Multi-modal endpoints for embeddings, audio and image generation
- sentence-transformers/all-mpnet-base-v2
- google/vit-base-patch16-224-in21k
- Custom model based on florentgbelidji/blip_captioning
You're bringing the potential time delta between "I've never seen anything that could do this before" to "I could have it on infrastructure ready to support an existing product" down to potentially less than a day.
Pricing
Choose a plan that fits your needs
Self-Serve
Pay as you go when using Inference Endpoints
- Pay for what you use, per minute
- Starting as low as $0.06/hour
- Billed monthly
- Email support
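To get a rough feel for per-minute billing, the sketch below estimates a charge from running time. It assumes the per-minute price is simply the hourly rate divided by 60 and uses the $0.06/hour starting rate as a default; the function name `estimate_cost` and the four-decimal rounding are illustrative choices, not part of the actual billing system, and real rates depend on the instance you select.

```python
from decimal import Decimal


def estimate_cost(minutes_running: int, hourly_rate_usd: str = "0.06") -> Decimal:
    """Estimate the charge for an endpoint billed per minute.

    Assumption: the per-minute price is the hourly rate divided by 60.
    The default rate is the "starting as low as" price; other instance
    types cost more.
    """
    if minutes_running < 0:
        raise ValueError("minutes_running must be non-negative")
    per_minute = Decimal(hourly_rate_usd) / Decimal(60)
    # Round to four decimal places for display (illustrative choice).
    return (per_minute * minutes_running).quantize(Decimal("0.0001"))


# 90 minutes on the cheapest instance
ninety_minutes = estimate_cost(90)

# A full 30-day month of continuous uptime (30 * 24 * 60 minutes)
full_month = estimate_cost(30 * 24 * 60)
```

Under these assumptions, an endpoint left running around the clock on the cheapest instance comes to roughly $43 for a 30-day month, while one paused outside of working hours costs proportionally less.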
Enterprise
Get a custom quote and premium support
- Lower marginal costs based on volume
- Uptime guarantees
- Custom annual contracts
- Dedicated support, SLAs