AI/ML

DeepSeek AI Versions Breakdown : Everything You Need to Know

 


Overview

DeepSeek has released multiple iterations of its large language models (LLMs), each with its own purpose and improvements. Below, we describe each version, what it does, and where to find its GitHub repository.

DeepSeek Coder

  • Release Date: November 2023

  • Purpose: The first open-source model focused on programming tasks.

  • Description: DeepSeek Coder is an open-source series of code language models built to improve code comprehension in software engineering. The models are trained from scratch on a 2 trillion token dataset composed of 87% code and 13% natural language, in English and Chinese.

  • GitHub Repository: https://github.com/deepseek-ai/DeepSeek-Coder

DeepSeek LLM

  • Release Date: December 2023

  • Purpose: DeepSeek's first general-purpose model.

  • Description: DeepSeek LLM is a language model with 67 billion parameters, trained from scratch on a 2 trillion token dataset in English and Chinese. DeepSeek has open-sourced the DeepSeek LLM 7B/67B Base and 7B/67B Chat variants to encourage research.

  • GitHub Repository: https://github.com/deepseek-ai/DeepSeek-LLM

DeepSeek V2

  • Release Date: May 2024

  • Purpose: A general-purpose model designed to outperform its predecessor while costing less to train.

  • Description: DeepSeek V2 is an open-source Mixture of Experts (MoE) model developed by DeepSeek AI, designed for economical training and efficient inference. The model has 236 billion parameters in total, of which 21 billion are activated per token during processing.

  • GitHub Repository: https://github.com/deepseek-ai/DeepSeek-V2
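To put the MoE numbers in perspective, the short sketch below (illustrative only; the parameter counts are the published DeepSeek V2 figures quoted above) computes the fraction of the model's parameters that are activated for each token:

```python
# Illustrative sketch: share of parameters a Mixture of Experts (MoE)
# model activates per token. A MoE model routes each token to a small
# subset of "expert" sub-networks, so only a fraction of the total
# parameters does work for any given token.
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters activated per token."""
    return active_params_b / total_params_b

# DeepSeek V2: 236B parameters in total, 21B activated per token.
v2 = active_fraction(236, 21)
print(f"DeepSeek V2 activates {v2:.1%} of its parameters per token")
# → DeepSeek V2 activates 8.9% of its parameters per token
```

This sparse activation is why a 236B-parameter MoE model can offer inference costs closer to a much smaller dense model.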

DeepSeek Coder V2

  • Release Date: July 2024

  • Parameters: 236 billion

  • Context Window: 128,000 tokens

  • Purpose: Solving difficult programming challenges.

  • Description: DeepSeek Coder V2 is an open-source Mixture of Experts (MoE) code language model developed by DeepSeek AI, aiming to match GPT-4 Turbo performance on code-specific tasks.

  • GitHub Repository: https://github.com/deepseek-ai/DeepSeek-Coder-V2
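A 128,000-token context window is large enough to hold a sizable codebase or document in a single prompt. As a rough sanity check before sending text to such a model, the sketch below estimates token count with the common ~4-characters-per-token heuristic. Note this ratio is an assumption, not a tokenizer-accurate value; real counts depend on the model's tokenizer and the language of the text:

```python
# Rough sketch: estimate whether a piece of text fits in a 128K-token
# context window. CHARS_PER_TOKEN is a common English-text heuristic,
# not an exact figure, so treat the result as an approximation only.
CONTEXT_WINDOW = 128_000
CHARS_PER_TOKEN = 4  # assumption; actual ratio varies by tokenizer

def estimated_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, window: int = CONTEXT_WINDOW) -> bool:
    """True if the text is estimated to fit within the context window."""
    return estimated_tokens(text) <= window

# Example: a small function repeated 1000 times (32,000 characters).
sample = "def add(a, b):\n    return a + b\n" * 1000
print(estimated_tokens(sample), fits_in_context(sample))
# → 8000 True
```

For precise counts, the model's own tokenizer should be used instead of a character heuristic.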

DeepSeek V3

  • Release Date: December 2024

  • Parameters: 671 billion

  • Context Window: 128,000 tokens

  • Purpose: A general-purpose Mixture of Experts model, allowing for versatile task handling.

  • Description: DeepSeek V3 is an open-source Mixture of Experts (MoE) language model developed by DeepSeek AI, built for high performance with efficient training and inference. DeepSeek V3 has 671 billion parameters in total, of which only 37 billion are activated per token during processing.

  • GitHub Repository: https://github.com/deepseek-ai/DeepSeek-V3

DeepSeek R1

  • Release Date: January 2025

  • Parameters: 671 billion

  • Context Window: 128,000 tokens

  • Purpose: Advanced reasoning tasks, competing directly with OpenAI's models while being more cost-effective.

  • Description: DeepSeek R1 is an open-source reasoning model created by the Chinese AI company DeepSeek. It targets reasoning-heavy text tasks such as logical inference, mathematical problem solving, and decision making at a lower cost than comparable proprietary models. With this model, DeepSeek powers its DeepThink chat feature, positioning the company as a competitor to ChatGPT.

  • GitHub Repository: https://github.com/deepseek-ai/DeepSeek-R1

Janus Pro 7B

  • Release Date: January 2025

  • Purpose: A vision model capable of understanding and generating images.

  • Description: Janus Pro 7B is an open-source multimodal AI model created by DeepSeek that performs well in both image comprehension and text-to-image generation. In benchmarks, it has outperformed models such as OpenAI's DALL-E 3 and Stability AI's Stable Diffusion.

 

Ready to transform your business with our technology solutions? Contact Us today to leverage our AI/ML expertise.
