Deploying OpenThinker 7B on AWS allows for scalable, highly available hosting of the model. AWS provides several services, such as Amazon ECS (Elastic Container Service), Amazon EC2 (Elastic Compute Cloud), and AWS Lambda, that can be used to deploy LLMs.
In this guide, we will focus on deploying OpenThinker 7B on AWS using Amazon ECS (Fargate), which allows for serverless containerized deployment.
Scalability: Can handle high-demand traffic
Cost-effectiveness: Pay only for compute usage
Managed Infrastructure: No need to manually manage servers
Security: AWS IAM and VPC ensure secure access
Before starting, ensure you have:
Create an ECR Repository
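As a minimal sketch of this step with the AWS CLI, the function below creates the repository and prints its URI. The repository name openthinker-7b matches the one used later in this guide; the function name create_ecr_repo is hypothetical, and the sketch assumes the AWS CLI is installed and configured with credentials.

```shell
# Create an ECR repository and print its URI.
# $1 = repository name, $2 = AWS region
create_ecr_repo() {
  aws ecr create-repository \
    --repository-name "$1" \
    --region "$2" \
    --query 'repository.repositoryUri' \
    --output text
}

# Usage (replace <region> with your actual value):
# create_ecr_repo openthinker-7b <region>
```

The printed URI is the value to use when tagging and pushing the image in the following steps.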
Authenticate Docker with AWS ECR
Run the following command to log in to ECR (replace <aws_account_id> and <region> with your actual values):
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com
Tag the Docker Image
Retrieve your ECR repository URI from AWS ECR (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com/openthinker-7b) and tag your image:
docker tag openthinker-7b:latest <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest
Push the Image to ECR
docker push <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest
Once completed, the image will be stored in AWS ECR and can be used for ECS deployment.
We will use AWS Fargate to run the container without managing EC2 instances.
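The Fargate setup can be sketched with the AWS CLI as three steps: create the cluster, register a task definition, and create the service. The cluster name, service name, and port 11434 come from this guide. Everything else is an assumption for illustration: the task family openthinker-7b-task, the 4 vCPU / 16 GB task size, the placeholder role ARN, subnet, and security group, and the function name deploy_openthinker.

```shell
# Placeholders: replace with your actual values before running.
IMAGE_URI="<aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest"
EXECUTION_ROLE_ARN="<execution_role_arn>"   # task execution role, needed to pull from ECR
SUBNET_ID="<subnet_id>"                     # a public subnet in your VPC
SECURITY_GROUP_ID="<security_group_id>"     # must allow inbound traffic on port 11434

deploy_openthinker() {
  # Container definition as JSON; the container name is an assumption.
  container_defs=$(printf '[{"name":"openthinker-7b","image":"%s","essential":true,"portMappings":[{"containerPort":11434,"protocol":"tcp"}]}]' "$IMAGE_URI")

  # 1. Create the ECS cluster.
  aws ecs create-cluster --cluster-name openthinker-cluster &&

  # 2. Register a Fargate task definition (assumed size: 4 vCPU, 16 GB).
  aws ecs register-task-definition \
    --family openthinker-7b-task \
    --requires-compatibilities FARGATE \
    --network-mode awsvpc \
    --cpu 4096 \
    --memory 16384 \
    --execution-role-arn "$EXECUTION_ROLE_ARN" \
    --container-definitions "$container_defs" &&

  # 3. Create the service with one task and a public IP.
  aws ecs create-service \
    --cluster openthinker-cluster \
    --service-name openthinker-7b-service \
    --task-definition openthinker-7b-task \
    --desired-count 1 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[$SUBNET_ID],securityGroups=[$SECURITY_GROUP_ID],assignPublicIp=ENABLED}"
}

# Usage:
# deploy_openthinker
```

The CPU and memory values are only a starting point; adjust them to what the model actually needs in your environment.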
Check Running Tasks
Go to Amazon ECS → Select openthinker-cluster → Click Tasks
Make sure the task status is RUNNING.
Get the Public IP Address
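If the task runs in a public subnet with a public IP assigned, open the task's details page in the ECS console and look for the public IP in its networking information.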
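As an alternative to the console, the public IP can be looked up with the AWS CLI. This is a sketch under a few assumptions: the function name get_task_public_ip is hypothetical, the first task returned for the cluster is the one of interest, and that task has a single network interface attachment.

```shell
# Print the public IP of the first task in an ECS cluster.
# $1 = cluster name
get_task_public_ip() {
  cluster="$1"

  # ARN of the first task in the cluster.
  task_arn=$(aws ecs list-tasks --cluster "$cluster" \
    --query 'taskArns[0]' --output text)

  # ID of the elastic network interface attached to that task.
  eni_id=$(aws ecs describe-tasks --cluster "$cluster" --tasks "$task_arn" \
    --query "tasks[0].attachments[0].details[?name=='networkInterfaceId'].value" \
    --output text)

  # Public IP associated with that network interface.
  aws ec2 describe-network-interfaces --network-interface-ids "$eni_id" \
    --query 'NetworkInterfaces[0].Association.PublicIp' --output text
}

# Usage:
# get_task_public_ip openthinker-cluster
```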
Run the following command to test the model:
curl http://<public-ip>:11434
Expected output:
{"message": "Model is up and running"}
To handle high traffic, increase the number of tasks:
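A minimal sketch of manual scaling with the AWS CLI, using the cluster and service names from this guide. The function name scale_openthinker and the example count of 3 are assumptions.

```shell
# Set the number of running tasks for the service.
# $1 = desired task count
scale_openthinker() {
  aws ecs update-service \
    --cluster openthinker-cluster \
    --service openthinker-7b-service \
    --desired-count "$1"
}

# Usage: run three tasks
# scale_openthinker 3
```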
ECS Service Auto Scaling can also be enabled to adjust the number of Fargate tasks automatically.
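Automatic scaling for an ECS service is configured through Application Auto Scaling. The sketch below registers the service as a scalable target and attaches a target-tracking policy on average CPU utilization. The function name enable_autoscaling, the policy name openthinker-cpu-target, and the 70% CPU target are assumptions for illustration.

```shell
# Enable target-tracking auto scaling for the service.
# $1 = minimum task count, $2 = maximum task count
enable_autoscaling() {
  resource_id="service/openthinker-cluster/openthinker-7b-service"

  # Register the service's desired count as a scalable target.
  aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id "$resource_id" \
    --min-capacity "$1" \
    --max-capacity "$2" &&

  # Keep average CPU utilization near 70% by adding or removing tasks.
  aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id "$resource_id" \
    --policy-name openthinker-cpu-target \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration \
      '{"TargetValue":70.0,"PredefinedMetricSpecification":{"PredefinedMetricType":"ECSServiceAverageCPUUtilization"}}'
}

# Usage: scale between 1 and 4 tasks
# enable_autoscaling 1 4
```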
To avoid unnecessary charges, delete the ECS resources when not in use:
aws ecs delete-service --cluster openthinker-cluster --service openthinker-7b-service --force
aws ecs delete-cluster --cluster openthinker-cluster
aws ecr delete-repository --repository-name openthinker-7b --force
Deploying OpenThinker 7B on AWS using ECS Fargate provides a fully managed, serverless environment with minimal setup and maintenance. By combining Amazon ECR, Amazon ECS, and AWS Fargate, you can run large language models without managing the underlying infrastructure.