AI/ML

AWS ECS OpenThinker 7B Deployment - A Step by Step Guide


Introduction

Deploying OpenThinker 7B on AWS allows for scalable, highly available hosting of the model. AWS provides several services, such as Amazon ECS (Elastic Container Service), Amazon EC2 (Elastic Compute Cloud) and AWS Lambda, that can be used to deploy LLMs.

In this guide, we will focus on deploying OpenThinker 7B on AWS using Amazon ECS (Fargate), which allows for serverless containerized deployment.

Key Benefits of Deploying OpenThinker 7B on AWS

Scalability: Can handle high-demand traffic

Cost-effectiveness: Pay only for compute usage

Managed Infrastructure: No need to manually manage servers

Security: AWS IAM and VPC ensure secure access

Step 1: Prerequisites

Before starting, ensure you have:

  • An AWS account
  • AWS Management Console access
  • Docker installed on your local machine
  • AWS CLI installed and configured (aws configure)
  • A pre-built Docker image of OpenThinker 7B (from previous steps)

Step 2: Push the Docker Image to Amazon ECR (Elastic Container Registry)

Create an ECR Repository

  1. Open the AWS Management Console
  2. Go to Amazon ECR → Click Create repository
  3. Enter Repository name (e.g., openthinker-7b)
  4. Select Private repository
  5. Click Create repository
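
The same repository can also be created from the command line. The following is a minimal sketch, assuming the region us-east-1 (a placeholder) and the repository name used above; the function name create_repo is only illustrative.

```shell
# Assumed values; replace the region with your own.
REGION="us-east-1"
REPO_NAME="openthinker-7b"

# Create the repository and print its URI.
# Repositories created with `aws ecr create-repository` are private.
create_repo() {
  aws ecr create-repository \
    --repository-name "$REPO_NAME" \
    --region "$REGION" \
    --query 'repository.repositoryUri' \
    --output text
}
```

Calling create_repo should print a URI of the form <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b, which is the value needed when tagging the image.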

Authenticate Docker with AWS ECR

Run the following command to log in to ECR (replace <aws_account_id> and <region> with your actual values):

aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com

Tag the Docker Image

Retrieve your ECR repository URI from AWS ECR (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com/openthinker-7b) and tag your image:

docker tag openthinker-7b:latest <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest

Push the Image to ECR

docker push <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest

Once completed, the image will be stored in AWS ECR and can be used for ECS deployment.
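
The login, tag, and push commands can be combined into one script so the account ID and region are typed only once. This is a sketch under a few assumptions: the region is a placeholder, the local image is named openthinker-7b:latest as above, and the helper names ecr_registry and push_image are illustrative. The account ID is looked up with aws sts get-caller-identity rather than hard-coded.

```shell
# Assumed values; replace the region with your own.
REGION="us-east-1"
REPO_NAME="openthinker-7b"
TAG="latest"

# Build the registry host name from an account ID and a region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com' "$1" "$2"
}

# Log in, tag, and push in one go. Stops at the first failing step.
push_image() {
  account_id="$(aws sts get-caller-identity --query Account --output text)" || return 1
  registry="$(ecr_registry "$account_id" "$REGION")"
  aws ecr get-login-password --region "$REGION" |
    docker login --username AWS --password-stdin "$registry" || return 1
  docker tag "$REPO_NAME:$TAG" "$registry/$REPO_NAME:$TAG" || return 1
  docker push "$registry/$REPO_NAME:$TAG"
}
```

Run push_image after the repository exists and the local image has been built.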

Step 3: Create an ECS Cluster

We will use AWS Fargate to run the container without managing EC2 instances.

  1. Go to Amazon ECS → Click Create cluster
  2. Choose Networking only (AWS Fargate)
  3. Enter Cluster name (e.g., openthinker-cluster)
  4. Click Create
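
For readers who prefer the CLI, the cluster can be created and checked with two commands. A minimal sketch, assuming the cluster name above; the function names are illustrative.

```shell
CLUSTER="openthinker-cluster"

# Create the cluster. With Fargate there are no EC2 instances to register.
create_cluster() {
  aws ecs create-cluster --cluster-name "$CLUSTER"
}

# Print the cluster status; ACTIVE means it is ready for services.
cluster_status() {
  aws ecs describe-clusters --clusters "$CLUSTER" \
    --query 'clusters[0].status' --output text
}
```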

Step 4: Create a Task Definition for OpenThinker 7B

  1. Go to Amazon ECS → Click Task Definitions
  2. Click Create new task definition
  3. Select Fargate as the launch type
  4. Enter Task definition name (e.g., openthinker-7b-task)
  5. Set Task size:
    1. vCPU: 2 vCPUs
    2. Memory: 8GB RAM
  6. Click Add container and configure:
    1. Container name: openthinker-7b-container
    2. Image: Paste the ECR image URL from Step 2
    3. Port mappings: 11434 (same as the Docker container)
  7. Click Create
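
The console settings above correspond roughly to the task definition below, which can be registered from the CLI. Treat it as a sketch: the account ID and region are placeholders, the file name is arbitrary, and it assumes a task execution role named ecsTaskExecutionRole already exists in the account (Fargate needs an execution role to pull images from ECR).

```shell
# Assumed values; replace with your own account ID and region.
ACCOUNT_ID="123456789012"
REGION="us-east-1"
IMAGE="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/openthinker-7b:latest"
# Assumption: a task execution role with this name already exists.
EXEC_ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/ecsTaskExecutionRole"

# Write the task definition. Fargate takes cpu in CPU units
# (1024 per vCPU) and memory in MiB, both as strings.
cat > openthinker-7b-task.json <<EOF
{
  "family": "openthinker-7b-task",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "8192",
  "executionRoleArn": "${EXEC_ROLE_ARN}",
  "containerDefinitions": [
    {
      "name": "openthinker-7b-container",
      "image": "${IMAGE}",
      "essential": true,
      "portMappings": [
        { "containerPort": 11434, "protocol": "tcp" }
      ]
    }
  ]
}
EOF

# Register the task definition from the file written above.
register_task() {
  aws ecs register-task-definition \
    --cli-input-json file://openthinker-7b-task.json
}
```

Keeping the definition in a file makes it easy to review and to version alongside the Dockerfile.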

Step 5: Create an ECS Service

  1. Go to Amazon ECS → Click Services → Create
  2. Select Launch Type: Fargate
  3. Choose Cluster: openthinker-cluster
  4. Select Task definition: openthinker-7b-task
  5. Choose Service Name: openthinker-7b-service
  6. Set the Number of tasks to 1 (or more for scaling)
  7. Select Networking:
    • VPC: Choose an existing or new VPC
    • Subnets: Select public subnets
    • Security Group: Allow port 11434 inbound
  8. Click Deploy
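
The service and its security group rule can also be set up from the CLI. In this sketch the subnet ID, security group ID, and allowed source range are hypothetical placeholders, and the function names are illustrative. Restricting the source range to known addresses is a suggestion here, not something the console steps require.

```shell
CLUSTER="openthinker-cluster"
SERVICE="openthinker-7b-service"
TASK_DEF="openthinker-7b-task"
# Hypothetical IDs; replace with a public subnet and security group in your VPC.
SUBNET_ID="subnet-0123456789abcdef0"
SG_ID="sg-0123456789abcdef0"
# Hypothetical source range; narrow it to the addresses that should reach the model.
ALLOWED_CIDR="203.0.113.0/24"

# Allow inbound TCP 11434 from the chosen range.
open_port() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol tcp --port 11434 --cidr "$ALLOWED_CIDR"
}

# Create the service with one Fargate task and a public IP.
create_service() {
  aws ecs create-service \
    --cluster "$CLUSTER" \
    --service-name "$SERVICE" \
    --task-definition "$TASK_DEF" \
    --desired-count 1 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[$SUBNET_ID],securityGroups=[$SG_ID],assignPublicIp=ENABLED}"
}
```

Setting assignPublicIp=ENABLED is what gives the task a public address in a public subnet.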

Step 6: Verify Deployment

Check Running Tasks

Go to Amazon ECS → Select openthinker-cluster → Click Tasks

Make sure the task status is RUNNING.

Get the Public IP Address

If the task runs in a public subnet with a public IP assigned, navigate to:

  • ECS Service → Select Running Task
  • Look for Public IP

Run the following command to test the model:

curl http://<public-ip>:11434

Expected output (the exact text depends on the server in your image):

{"message": "Model is up and running"}
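
The public IP can also be looked up without the console by following the task to its network interface. A sketch, assuming the cluster and service names used in this guide and a single running task; task_public_ip and check_model are illustrative names.

```shell
CLUSTER="openthinker-cluster"
SERVICE="openthinker-7b-service"

# Resolve the public IP of the first task in the service:
# task ARN -> network interface ID -> public IP.
task_public_ip() {
  task_arn="$(aws ecs list-tasks --cluster "$CLUSTER" --service-name "$SERVICE" \
    --query 'taskArns[0]' --output text)" || return 1
  eni_id="$(aws ecs describe-tasks --cluster "$CLUSTER" --tasks "$task_arn" \
    --query "tasks[0].attachments[0].details[?name=='networkInterfaceId'].value" \
    --output text)" || return 1
  aws ec2 describe-network-interfaces --network-interface-ids "$eni_id" \
    --query 'NetworkInterfaces[0].Association.PublicIp' --output text
}

# Send a test request; fails on HTTP errors or after 10 seconds.
check_model() {
  curl --fail --max-time 10 "http://$(task_public_ip):11434"
}
```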

Step 7: Scaling the Model (Optional)

To handle high traffic, increase the number of tasks:

  1. Go to ECS Service
  2. Select openthinker-7b-service
  3. Click Update → Increase Desired Task Count
  4. Save and Deploy

Amazon ECS Service Auto Scaling can also be enabled to adjust the number of tasks automatically.
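
Both manual and automatic scaling are available from the CLI. The sketch below assumes the cluster and service names from this guide; the 1 to 4 task range, the 70% CPU target, and the policy name are illustrative choices, not recommendations from the original steps.

```shell
CLUSTER="openthinker-cluster"
SERVICE="openthinker-7b-service"
RESOURCE_ID="service/${CLUSTER}/${SERVICE}"

# Manual scaling: set the desired task count directly.
scale_to() {
  aws ecs update-service --cluster "$CLUSTER" --service "$SERVICE" \
    --desired-count "$1"
}

# Automatic scaling: register the service as a scalable target
# (1 to 4 tasks, an assumed range) and track 70% average CPU.
enable_autoscaling() {
  aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --resource-id "$RESOURCE_ID" \
    --scalable-dimension ecs:service:DesiredCount \
    --min-capacity 1 --max-capacity 4 || return 1
  aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --resource-id "$RESOURCE_ID" \
    --scalable-dimension ecs:service:DesiredCount \
    --policy-name openthinker-cpu-target \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration \
    '{"TargetValue": 70.0, "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"}}'
}
```

For example, scale_to 3 sets the desired count to three tasks, while enable_autoscaling lets ECS add or remove tasks as average CPU moves around the target.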

 

Step 8: Cleaning Up Resources (If Needed)

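To avoid unnecessary charges, delete the ECS resources when not in use:

aws ecs delete-service --cluster openthinker-cluster --service openthinker-7b-service --force

aws ecs wait services-inactive --cluster openthinker-cluster --services openthinker-7b-service

aws ecs delete-cluster --cluster openthinker-cluster

aws ecr delete-repository --repository-name openthinker-7b --force

The wait command pauses until the service has been removed, since a cluster cannot be deleted while it still contains active services.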

Conclusion

Deploying OpenThinker 7B on AWS using ECS Fargate provides a fully managed, serverless environment with minimal setup and maintenance. By leveraging AWS ECR, ECS and Fargate, you can run large language models efficiently without managing underlying infrastructure.
