Cost Efficiency (Open Source)
Lower long-term costs
Customised data control
Pre-trained model
Get Your DeepSeek AI Model Running in a Day
DeepSeek has developed multiple iterations of its large language models (LLMs), each with its own purpose and improvements. Below, we describe each version, its purpose, and its GitHub repository.
DeepSeek Coder
Release Date: November 2023
Purpose: DeepSeek's first open-source model, focused on programming tasks.
Description: DeepSeek Coder is an open-source series of code language models built to improve code comprehension in software engineering. The models are trained from scratch on a dataset of 2 trillion tokens that is 87% code and 13% natural language, in both English and Chinese.
GitHub Repository: https://github.com/deepseek-ai/DeepSeek-Coder
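For a sense of how these models are used in practice, here is a minimal sketch that loads one of the smaller Coder checkpoints from Hugging Face and completes a Python function. The checkpoint name (deepseek-ai/deepseek-coder-6.7b-base) and the generation settings are assumptions made for illustration; see the repository above for the official usage instructions.

```python
# Minimal code-completion sketch for a DeepSeek Coder base checkpoint.
# The checkpoint name below is an assumption for this example; adjust the model
# name, dtype, and device placement to your own hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-coder-6.7b-base"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",          # requires the accelerate package
    trust_remote_code=True,
)

# Base (non-chat) models are plain completers: give them a code prefix to finish.
prompt = "# Python function that checks whether a number is prime\ndef is_prime(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```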
DeepSeek LLM
Release Date: December 2023
Purpose: DeepSeek's first general-purpose model.
Description: DeepSeek LLM is a language model with 67 billion parameters, trained from scratch on a dataset of 2 trillion tokens in English and Chinese. DeepSeek has also open-sourced DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat to encourage research.
GitHub Repository: https://github.com/deepseek-ai/DeepSeek-LLM
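Because the 7B/67B Chat variants are instruction-tuned, they are normally prompted through a chat template rather than with raw text. The sketch below assumes the deepseek-ai/deepseek-llm-7b-chat checkpoint on Hugging Face and that its tokenizer ships a chat template; treat both as assumptions and consult the repository for the official example.

```python
# Minimal chat sketch for a DeepSeek LLM chat checkpoint.
# The checkpoint name and chat-template availability are assumptions for this example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-llm-7b-chat"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain what a context window is in one sentence."}]
# apply_chat_template formats the conversation the way the chat model was fine-tuned on.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```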
DeepSeek V2
Release Date: May 2024
Purpose: Stronger performance with more economical training and more efficient inference than its predecessor.
Description: DeepSeek V2 is an open-source Mixture of Experts (MoE) model developed by DeepSeek AI, designed for strong performance with economical training and efficient inference. The model has 236 billion parameters in total, of which 21 billion are activated for each token during processing.
GitHub Repository: https://github.com/deepseek-ai/DeepSeek-V2
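The "activated per token" figure is the defining property of a Mixture of Experts model: a small router scores the experts for each token, and only the top few experts actually run, so only a fraction of the total parameters does any work on that token. The toy layer below is a simplified illustration of that routing idea, not DeepSeek V2's actual architecture.

```python
# Simplified illustration of Mixture-of-Experts routing (not DeepSeek V2's exact
# architecture): the router scores all experts per token, but only the top-k
# experts run, so each token touches only a fraction of the layer's parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        scores = F.softmax(self.router(x), dim=-1)        # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)    # keep only the top-k experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                     # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = TinyMoELayer()
tokens = torch.randn(4, 64)
print(layer(tokens).shape)  # torch.Size([4, 64]); each token used only 2 of the 8 experts
```

Applied to DeepSeek V2's numbers, roughly 21 billion of the 236 billion parameters (about 9%) take part in processing any single token, which is what keeps training and inference costs down.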
DeepSeek Coder V2
Release Date: July 2024
Parameters: 236 billion
Context Window: 128,000 tokens
Purpose: For difficult programming challenges.
Description: DeepSeek Coder V2 is an open-source Mixture of Experts (MoE) code language model developed by DeepSeek AI with the goal of reaching GPT-4 Turbo-level performance on code-oriented tasks.
GitHub Repository: https://github.com/deepseek-ai/DeepSeek-Coder-V2
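A practical question with a 128,000-token context window is whether a given file or code dump actually fits in one prompt. The sketch below counts tokens with the model's tokenizer; the checkpoint name (deepseek-ai/DeepSeek-Coder-V2-Instruct) and the file path are assumptions for illustration, and only the tokenizer files are downloaded, so it runs without the 236-billion-parameter model itself.

```python
# Rough check of whether a source file fits in a 128K-token context window.
# The tokenizer name and file path below are assumptions for this sketch.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 128_000
tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-Coder-V2-Instruct", trust_remote_code=True
)

with open("my_large_module.py", "r", encoding="utf-8") as f:  # hypothetical file
    source = f.read()

n_tokens = len(tokenizer.encode(source))
print(f"{n_tokens} tokens ({n_tokens / CONTEXT_WINDOW:.1%} of the context window)")
if n_tokens > CONTEXT_WINDOW:
    print("Too large for a single prompt: split the file or summarize parts of it.")
```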
DeepSeek V3
Release Date: December 2024
Parameters: 671 billion
Context Window: 128,000 tokens
Purpose: Mixture of experts, allowing for versatile task handling.
Description: DeepSeek V3 is an open-source Mixture of Experts (MoE) language model released by DeepSeek AI, built for high performance while keeping training and inference efficient. DeepSeek V3 has 671 billion parameters in total, of which only 37 billion are activated for each token during processing.
GitHub Repository: https://github.com/deepseek-ai/DeepSeek-V3
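At 671 billion parameters, self-hosting DeepSeek V3 is impractical for most teams, so a common way to try it is DeepSeek's hosted, OpenAI-compatible API. The endpoint and model name below ("https://api.deepseek.com", "deepseek-chat") reflect DeepSeek's public API documentation at the time of writing; verify them against the current docs before relying on them.

```python
# Calling DeepSeek V3 through DeepSeek's hosted, OpenAI-compatible API.
# Endpoint and model name are taken from DeepSeek's public API docs at the time
# of writing; confirm them in the current documentation before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder key
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # routes to the current DeepSeek V3 chat model
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize what a Mixture of Experts model is."},
    ],
)
print(response.choices[0].message.content)
```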
DeepSeek R1
Release Date: January 2025
Parameters: 671 billion
Context Window: 128,000 tokens
Purpose: Advanced reasoning tasks, competing directly with OpenAI's models while being more cost-effective.
Description: DeepSeek R1 is an open-source reasoning model created by the Chinese AI company DeepSeek. It targets text tasks that benefit from step-by-step reasoning, such as logical inference, mathematical problem solving, and decision making, at a lower cost than comparable proprietary models. R1 also powers the DeepThink mode of DeepSeek's chatbot, positioning the company as a direct competitor to ChatGPT.
GitHub Repository: https://github.com/deepseek-ai/DeepSeek-R1
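R1 is exposed through the same OpenAI-compatible API under a separate model name, and the API can return the model's reasoning trace separately from the final answer. The "deepseek-reasoner" model name and the reasoning_content field below are based on DeepSeek's API documentation at the time of writing and should be treated as assumptions to verify.

```python
# Calling DeepSeek R1 through the OpenAI-compatible API.
# The "deepseek-reasoner" model name and the reasoning_content field are based on
# DeepSeek's API docs at the time of writing; check the current docs before use.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"}
    ],
)

message = response.choices[0].message
# The reasoning trace is returned in a separate field from the final answer;
# getattr keeps the sketch from failing if the field is absent.
print("Reasoning trace:", getattr(message, "reasoning_content", None))
print("Final answer:", message.content)
```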
Ready to transform your business with our technology solutions? Contact us today to leverage our AI/ML expertise.