DeepSeek-R1 models now available on AWS

February 3, 2025

During this past AWS re:Invent, Amazon CEO Andy Jassy shared valuable lessons learned from Amazon’s own experience developing nearly 1,000 generative AI applications across the company. Drawing from this extensive scale of AI deployment, Jassy offered three key observations that have shaped Amazon’s approach to enterprise AI implementation.

First is that as you get to scale in generative AI applications, the cost of compute really matters. People are very hungry for better price performance. The second is actually quite difficult to build a really good generative AI application. The third is the diversity of the models being used when we gave our builders freedom to pick what they want to do. It doesn’t surprise us, because we keep learning the same lesson over and over and over again, which is that there is never going to be one tool to rule the world.

As Andy emphasized, a broad and deep range of models provided by Amazon empowers customers to choose the precise capabilities that best serve their unique needs. By closely monitoring both customer needs and technological advancements, AWS regularly expands our curated selection of models to include promising new models alongside established industry favorites. This ongoing expansion of high-performing and differentiated model offerings helps customers stay at the forefront of AI innovation.

This leads us to Chinese AI startup DeepSeek. DeepSeek launched DeepSeek-V3 in December 2024 and subsequently released DeepSeek-R1, DeepSeek-R1-Zero with 671 billion parameters, and DeepSeek-R1-Distill models ranging from 1.5–70 billion parameters on January 20, 2025. They added their vision-based Janus-Pro-7B model on January 27, 2025. The models are publicly available and are reportedly 90-95% more affordable and cost-effective than comparable models. Per DeepSeek, their model stands out for its reasoning capabilities, achieved through innovative training techniques such as reinforcement learning.

Today, you can deploy DeepSeek-R1 models in Amazon Bedrock and Amazon SageMaker AI. Amazon Bedrock is best for teams seeking to quickly integrate pre-trained foundation models through APIs. Amazon SageMaker AI is ideal for organizations that want advanced customization, training, and deployment, with access to the underlying infrastructure. Additionally, you can use AWS Trainium and AWS Inferentia to deploy DeepSeek-R1-Distill models cost-effectively via Amazon Elastic Compute Cloud (Amazon EC2) or Amazon SageMaker AI.

With AWS, you can use DeepSeek-R1 models to build, experiment, and responsibly scale your generative AI ideas by using this powerful, cost-efficient model with minimal infrastructure investment. You can also confidently drive generative AI innovation by building on AWS services that are uniquely designed for security. We highly recommend integrating your deployments of the DeepSeek-R1 models with Amazon Bedrock Guardrails to add a layer of protection for your generative AI applications. Guardrails can be used by both Amazon Bedrock and Amazon SageMaker AI customers.

You can choose how to deploy DeepSeek-R1 models on AWS today in a few ways: 1/ Amazon Bedrock Marketplace for the DeepSeek-R1 model, 2/ Amazon SageMaker JumpStart for the DeepSeek-R1 model, 3/ Amazon Bedrock Custom Model Import for the DeepSeek-R1-Distill models, and 4/ Amazon EC2 Trn1 instances for the DeepSeek-R1-Distill models.

Let me walk you through the various paths for getting started with DeepSeek-R1 models on AWS. Whether you’re building your first AI application or scaling existing solutions, these methods provide flexible starting points based on your team’s expertise and requirements.

1. The DeepSeek-R1 model in Amazon Bedrock Marketplace
Amazon Bedrock Marketplace offers over 100 popular, emerging, and specialized FMs alongside the current selection of industry-leading models in Amazon Bedrock. You can easily discover models in a single catalog, subscribe to the model, and then deploy the model on managed endpoints.

To access the DeepSeek-R1 model in Amazon Bedrock Marketplace, go to the Amazon Bedrock console and select Model catalog under the Foundation models section. You can quickly find DeepSeek by searching or filtering by model providers.

After checking out the model detail page, including the model’s capabilities and implementation guidelines, you can directly deploy the model by providing an endpoint name, choosing the number of instances, and selecting an instance type.

You can also configure advanced options that let you customize the security and infrastructure settings for the DeepSeek-R1 model, including VPC networking, service role permissions, and encryption settings. For production deployments, you should review these settings to align with your organization’s security and compliance requirements.

With Amazon Bedrock Guardrails, you can independently evaluate user inputs and model outputs. You can control the interaction between users and DeepSeek-R1 with your defined set of policies by filtering undesirable and harmful content in generative AI applications. The DeepSeek-R1 model in Amazon Bedrock Marketplace can only be used with Bedrock’s ApplyGuardrail API, which evaluates user inputs and model responses for custom and third-party FMs available outside of Amazon Bedrock. To learn more, read Implement model-independent safety measures with Amazon Bedrock Guardrails.
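
As a rough illustration of evaluating text independently of the model, the following Python sketch wraps the ApplyGuardrail operation as exposed by boto3 (`apply_guardrail` on a `bedrock-runtime` client). The helper name `check_with_guardrail`, the guardrail ID and version, and the `FakeGuardrailClient` stand-in are hypothetical and exist only so the example runs without AWS access.

```python
from typing import Any, Tuple


def check_with_guardrail(client: Any, guardrail_id: str, guardrail_version: str,
                         text: str, source: str = "INPUT") -> Tuple[bool, str]:
    """Return (allowed, text_to_use) for one piece of text.

    `client` is expected to behave like boto3.client("bedrock-runtime").
    `source` is "INPUT" for user prompts and "OUTPUT" for model responses.
    """
    response = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source=source,
        content=[{"text": {"text": text}}],
    )
    if response.get("action") == "GUARDRAIL_INTERVENED":
        # On intervention the service returns replacement text,
        # for example a blocked message or a masked version.
        outputs = response.get("outputs", [])
        replacement = outputs[0]["text"] if outputs else ""
        return False, replacement
    return True, text


class FakeGuardrailClient:
    """Hypothetical stand-in for the real client, used only for local runs."""

    def apply_guardrail(self, **kwargs):
        text = kwargs["content"][0]["text"]["text"]
        if "forbidden" in text:
            return {"action": "GUARDRAIL_INTERVENED",
                    "outputs": [{"text": "Sorry, I cannot help with that."}]}
        return {"action": "NONE", "outputs": []}


if __name__ == "__main__":
    client = FakeGuardrailClient()  # swap for boto3.client("bedrock-runtime")
    print(check_with_guardrail(client, "my-guardrail-id", "1", "Explain RL briefly."))
```

In an application, the same helper would typically be called once on the user prompt before invoking DeepSeek-R1 and once more on the model response before returning it.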

Amazon Bedrock Guardrails can also be integrated with other Bedrock tools, including Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases, to build safer and more secure generative AI applications aligned with responsible AI policies. To learn more, visit the AWS Responsible AI page.

Updated on 1st February – You can use the Bedrock playground to understand how the model responds to various inputs and to fine-tune your prompts for optimal results.

When using the DeepSeek-R1 model with the Bedrock playground or the InvokeModel API, please use DeepSeek’s chat template for optimal results. For example, <｜begin_of_sentence｜><｜User｜>content for inference<｜Assistant｜>.
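
A minimal Python sketch of applying this template before calling InvokeModel might look like the following. The template tokens are copied from the example above. The request field names (`prompt`, `max_tokens`, `temperature`) and the helper names are assumptions for illustration, so check the model detail page for the schema your endpoint actually expects.

```python
import json

# Template tokens as written in the example above.
BOS = "<｜begin_of_sentence｜>"
USER = "<｜User｜>"
ASSISTANT = "<｜Assistant｜>"


def format_deepseek_prompt(user_message: str) -> str:
    """Wrap a single user message in the chat template."""
    return f"{BOS}{USER}{user_message}{ASSISTANT}"


def build_request_body(user_message: str, max_tokens: int = 512,
                       temperature: float = 0.6) -> str:
    """Serialize a request body; the field names here are assumed, not confirmed."""
    body = {
        "prompt": format_deepseek_prompt(user_message),
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return json.dumps(body, ensure_ascii=False)


if __name__ == "__main__":
    print(build_request_body("What is 17 * 24?"))
```

The resulting string would be passed as the `body` argument of `invoke_model` on a boto3 `bedrock-runtime` client, with `modelId` identifying your deployed endpoint.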

Refer to this step-by-step guide on how to deploy the DeepSeek-R1 model in Amazon Bedrock Marketplace. To learn more, visit Deploy models in Amazon Bedrock Marketplace.

2. The DeepSeek-R1 model in Amazon SageMaker JumpStart
Amazon SageMaker JumpStart is a machine learning (ML) hub with FMs, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. To deploy DeepSeek-R1 in SageMaker JumpStart, you can discover the DeepSeek-R1 model in SageMaker Unified Studio, SageMaker Studio, the SageMaker AI console, or programmatically through the SageMaker Python SDK.

In the Amazon SageMaker AI console, open SageMaker Unified Studio or SageMaker Studio. In SageMaker Studio, choose JumpStart and search for “DeepSeek-R1” on the All public models page.

You can select the model and choose Deploy to create an endpoint with default settings. When the endpoint status becomes InService, you can run inference by sending requests to the endpoint.
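
Sending a request to an InService endpoint can be sketched in Python with the `invoke_endpoint` operation of the boto3 `sagemaker-runtime` client. The payload shape (`inputs` and `parameters.max_new_tokens`), the endpoint name, and the `FakeRuntimeClient` stand-in are assumptions for illustration. The real request and response schema depends on the serving container behind your endpoint.

```python
import io
import json
from typing import Any


def invoke_deepseek_endpoint(runtime_client: Any, endpoint_name: str,
                             prompt: str, max_new_tokens: int = 256) -> dict:
    """Send one JSON request to a SageMaker endpoint and decode the JSON reply.

    `runtime_client` is expected to behave like boto3.client("sagemaker-runtime").
    """
    # Assumed payload shape; adjust to what your container expects.
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a stream of bytes containing JSON.
    return json.loads(response["Body"].read())


class FakeRuntimeClient:
    """Hypothetical stand-in that echoes the prompt, for running without AWS."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        request = json.loads(Body)
        reply = {"generated_text": "echo: " + request["inputs"]}
        return {"Body": io.BytesIO(json.dumps(reply).encode("utf-8"))}


if __name__ == "__main__":
    client = FakeRuntimeClient()  # swap for boto3.client("sagemaker-runtime")
    print(invoke_deepseek_endpoint(client, "my-deepseek-r1-endpoint", "Hello"))
```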

You can derive model performance and ML operations controls with Amazon SageMaker AI features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, or container logs. The model is deployed in an AWS secure environment and under your virtual private cloud (VPC) controls, helping to support data security.

As with Bedrock Marketplace, you can use the ApplyGuardrail API in SageMaker JumpStart to decouple safeguards for your generative AI applications from the DeepSeek-R1 model. You can now use guardrails without invoking FMs, which opens the door to more integration of standardized and thoroughly tested enterprise safeguards into your application flow regardless of the models used.

Refer to this step-by-step guide on how to deploy the DeepSeek-R1 model in Amazon SageMaker JumpStart. To learn more, visit Discover SageMaker JumpStart models in SageMaker Unified Studio or Deploy SageMaker JumpStart models in SageMaker Studio.

3. DeepSeek-R1-Distill models using Amazon Bedrock Custom Model Import
Amazon Bedrock Custom Model Import provides the ability to import and use your customized models alongside existing FMs through a single serverless, unified API without the need to manage underlying infrastructure. With Amazon Bedrock Custom Model Import, you can import DeepSeek-R1-Distill models ranging from 1.5–70 billion parameters. As I highlighted in my blog post about Amazon Bedrock Model Distillation, the distillation process involves training smaller, more efficient models to mimic the behavior and reasoning patterns of the larger DeepSeek-R1 model with 671 billion parameters by using it as a teacher model.

After storing these publicly available models in an Amazon Simple Storage Service (Amazon S3) bucket or an Amazon SageMaker Model Registry, go to Imported models under Foundation models in the Amazon Bedrock console to import and deploy them in a fully managed and serverless environment through Amazon Bedrock. This serverless approach eliminates the need for infrastructure management while providing enterprise-grade security and scalability.
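
The console flow above also has a programmatic counterpart. As a hedged sketch, the import can be started from Python with the `create_model_import_job` operation of the boto3 `bedrock` client. The job name, model name, role ARN, bucket path, and the `FakeBedrockClient` stand-in below are placeholders invented for illustration.

```python
from typing import Any


def start_model_import(bedrock_client: Any, job_name: str, model_name: str,
                       role_arn: str, s3_uri: str) -> str:
    """Start a Custom Model Import job from model files in S3 and return the job ARN.

    `bedrock_client` is expected to behave like boto3.client("bedrock").
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with 's3://'")
    response = bedrock_client.create_model_import_job(
        jobName=job_name,
        importedModelName=model_name,
        roleArn=role_arn,
        modelDataSource={"s3DataSource": {"s3Uri": s3_uri}},
    )
    return response["jobArn"]


class FakeBedrockClient:
    """Hypothetical stand-in that records the request, for running without AWS."""

    def __init__(self):
        self.last_request = None

    def create_model_import_job(self, **kwargs):
        self.last_request = kwargs
        return {"jobArn": "arn:aws:bedrock:example:job/" + kwargs["jobName"]}


if __name__ == "__main__":
    client = FakeBedrockClient()  # swap for boto3.client("bedrock")
    arn = start_model_import(
        client,
        job_name="import-r1-distill-llama-8b",
        model_name="r1-distill-llama-8b",
        role_arn="arn:aws:iam::111122223333:role/ExampleImportRole",  # placeholder
        s3_uri="s3://example-bucket/DeepSeek-R1-Distill-Llama-8B/",    # placeholder
    )
    print(arn)
```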

Updated on 1st February – After importing the distilled model, you can use the Bedrock playground to see how the distilled model responds to your inputs.

Watch a demo video made by my colleague Du’An Lightfoot on importing the model and running inference in the Bedrock playground.

Refer to this step-by-step guide on how to deploy DeepSeek-R1-Distill models using Amazon Bedrock Custom Model Import. To learn more, visit Import a customized model into Amazon Bedrock.

4. DeepSeek-R1-Distill models using AWS Trainium and AWS Inferentia
AWS Deep Learning AMIs (DLAMI) provide customized machine images that you can use for deep learning on a variety of Amazon EC2 instances, from a small CPU-only instance to the latest high-powered multi-GPU instances. You can deploy the DeepSeek-R1-Distill models on AWS Trainium1 or AWS Inferentia2 instances to get the best price-performance.

To get started, go to the Amazon EC2 console and launch a trn1.32xlarge EC2 instance with the Neuron Multi Framework DLAMI called Deep Learning AMI Neuron (Ubuntu 22.04).

Once you have connected to your launched EC2 instance, install vLLM, an open-source tool to serve Large Language Models (LLMs), and download the DeepSeek-R1-Distill model from Hugging Face. You can deploy the model using vLLM and invoke the model server.
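
Invoking the model server can be sketched in Python as follows, assuming vLLM was started with its OpenAI-compatible HTTP server listening on `localhost:8000` and serving `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`. The host, port, model name, and helper names are assumptions for this sketch, and the linked guide may serve the model differently.

```python
import json
import urllib.request


def build_completion_request(prompt: str, model: str,
                             base_url: str = "http://localhost:8000",
                             max_tokens: int = 128) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    req = build_completion_request(
        "Briefly explain reinforcement learning.",
        model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    )
    print(req.full_url)
    # Uncomment once the vLLM server is running on the instance:
    # print(send(req)["choices"][0]["text"])
```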

To learn more, refer to this step-by-step guide on how to deploy DeepSeek-R1-Distill Llama models on AWS Inferentia and Trainium.

You can also visit the DeepSeek-R1-Distill model cards on Hugging Face, such as DeepSeek-R1-Distill-Llama-8B or deepseek-ai/DeepSeek-R1-Distill-Llama-70B. Choose Deploy and then Amazon SageMaker. From the AWS Inferentia and Trainium tab, copy the example code to deploy DeepSeek-R1-Distill models.

Since the release of DeepSeek-R1, various guides to its deployment on Amazon EC2 and Amazon Elastic Kubernetes Service (Amazon EKS) have been posted. Here is some additional material for you to check out:

Things to know
Here are a few important things to know.

Pricing – For publicly available models like DeepSeek-R1, you are charged only the infrastructure price based on the inference instance hours you select for Amazon Bedrock Marketplace, Amazon SageMaker JumpStart, and Amazon EC2. For Bedrock Custom Model Import, you are only charged for model inference, based on the number of active copies of your custom model, billed in 5-minute windows. To learn more, check out the Amazon Bedrock Pricing, Amazon SageMaker AI Pricing, and Amazon EC2 Pricing pages.
Data security – You can use enterprise-grade security features in Amazon Bedrock and Amazon SageMaker to help you make your data and applications secure and private. This means your data is not shared with model providers and is not used to improve the models. This applies to all models, both proprietary and publicly available, including DeepSeek-R1 models on Amazon Bedrock and Amazon SageMaker. To learn more, visit Amazon Bedrock Security and Privacy and Security in Amazon SageMaker AI.
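
To make the 5-minute billing windows for Custom Model Import in the pricing note more concrete, here is a small Python sketch of the arithmetic. The per-copy, per-minute rate is a made-up number, and rounding partial windows up is an assumption. Refer to the Amazon Bedrock Pricing page for actual rates and rules.

```python
import math

WINDOW_MINUTES = 5  # billing window length stated in the pricing note


def estimate_import_inference_cost(active_minutes: float, model_copies: int,
                                   price_per_copy_minute: float) -> float:
    """Estimate cost when usage is billed in whole 5-minute windows.

    Assumes a partially used window is billed as a full window.
    """
    windows = math.ceil(active_minutes / WINDOW_MINUTES)
    billable_minutes = windows * WINDOW_MINUTES
    return billable_minutes * model_copies * price_per_copy_minute


if __name__ == "__main__":
    # Hypothetical rate: 0.25 per copy per minute.
    # 12 active minutes round up to 3 windows, so 15 billable minutes.
    print(estimate_import_inference_cost(12, model_copies=2, price_per_copy_minute=0.25))
```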

Now available
DeepSeek-R1 is generally available today in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart in the US East (N. Virginia), US East (Ohio), and US West (Oregon) AWS Regions. You can also use DeepSeek-R1-Distill models through Amazon Bedrock Custom Model Import and on Amazon EC2 instances with AWS Trainium and Inferentia chips.

Give DeepSeek-R1 models a try today in the Amazon Bedrock console, Amazon SageMaker AI console, and Amazon EC2 console, and send feedback to AWS re:Post for Amazon Bedrock and AWS re:Post for SageMaker AI or through your usual AWS Support contacts.

— Channy
