• Irdial@lemmy.sdf.org
      English
      arrow-up
      27
      ·
      2 days ago

      the Chinese AI lab also released a smaller, “distilled” version of its new R1, DeepSeek-R1-0528-Qwen3-8B, that DeepSeek claims beats comparably sized models on certain benchmarks

      Most models come in 1B, 7-8B, 12-14B, and 27B+ parameter variants. According to the docs, they benchmarked the 8B model on an NVIDIA H20 (96 GB VRAM) and got between 144 and 1198 tokens/sec. Most consumer GPUs probably aren’t going to be able to keep up with that.

      • brucethemoose@lemmy.world
        English
        arrow-up
        2
        ·
        edit-2
        2 days ago

        Depends on the quantization.

        7B is small enough to run in FP8 or as a Marlin quant with SGLang/vLLM/TensorRT, so you can probably get very close to the H20 on a 3090 or 4090 (or even a 3060) if you know a little Docker.
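To put rough numbers on the quantization point, here is a minimal sketch that estimates weight-only memory for an 8B-parameter model at a few bit widths and checks it against common consumer cards. The VRAM figures (12 GB for the 3060, 24 GB for the 3090 and 4090) and the 2 GiB headroom for KV cache and runtime overhead are assumptions for illustration, not measurements; real usage depends on context length and the serving engine.

```python
# Rough weight-only estimate. Ignores KV cache growth, activations,
# and engine-specific overhead, so treat the result as a lower bound.
BYTES_PER_GIB = 1024 ** 3


def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / BYTES_PER_GIB


def fits(params_billions: float, bits_per_weight: float,
         vram_gib: float, headroom_gib: float = 2.0) -> bool:
    """True if weights plus an assumed headroom fit in the given VRAM."""
    return weight_gib(params_billions, bits_per_weight) + headroom_gib <= vram_gib


# Assumed VRAM sizes (the 3060 also ships in a smaller variant).
GPUS = {"RTX 3060": 12, "RTX 3090": 24, "RTX 4090": 24}
FORMATS = {"FP16": 16, "FP8": 8, "4-bit": 4}

if __name__ == "__main__":
    for fmt, bits in FORMATS.items():
        size = weight_gib(8, bits)
        ok = [name for name, vram in GPUS.items() if fits(8, bits, vram)]
        print(f"{fmt}: {size:.1f} GiB of weights, fits on: {', '.join(ok) or 'none'}")
```

Under these assumptions the FP8 weights come to about 7.5 GiB, which is why a 12 GB card is plausible for an 8B model, while FP16 needs one of the 24 GB cards.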

      • Avid Amoeba@lemmy.ca
        English
        arrow-up
        7
        ·
        2 days ago

        It proved sqrt(2) irrational at 40 tps on a 3090 here. The 32B R1 did it at 32 tps, but it thought a lot longer.

        • Irdial@lemmy.sdf.org
          English
          arrow-up
          3
          arrow-down
          1
          ·
          edit-2
          2 days ago

          On my Mac mini running LM Studio, it managed 1702 tokens at 17.19 tok/sec and thought for 1 minute. If accurate, high-performance models were better able to run on consumer hardware, I would use my 3060 as a dedicated inference device.
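As a quick sanity check on those figures, the sketch below converts a token count and a throughput into wall-clock time. The comparison against 40 tok/sec reuses the 3090 number reported earlier in the thread and assumes, hypothetically, the same 1702-token output; a different run would likely produce a different token count.

```python
def generation_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock time to emit `tokens` at a steady decode rate."""
    return tokens / tokens_per_sec


if __name__ == "__main__":
    # Figures quoted in the thread; the 3090 case is a hypothetical rerun.
    mac_mini = generation_seconds(1702, 17.19)
    rtx_3090 = generation_seconds(1702, 40)
    print(f"Mac mini: {mac_mini:.0f} s total")
    print(f"3090:     {rtx_3090:.0f} s total")
    print(f"Speedup:  {mac_mini / rtx_3090:.1f}x")
```

At 17.19 tok/sec, 1702 tokens take about 99 seconds, which would be consistent with roughly a minute of reasoning followed by the visible answer, assuming the token count covers both. At 40 tok/sec the same output would take about 43 seconds.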

    • LainTrain@lemmy.dbzer0.com
      English
      arrow-up
      7
      arrow-down
      1
      ·
      2 days ago

      I’m genuinely curious: what do you do that makes a 7B model “trash” to you? Like yeah, sure, a gippity now tends to beat out a Mistral 7B, but I’m pretty happy with my Mistral most of the time, if I ever even need AI at all.

    • TropicalDingdong@lemmy.world
      English
      arrow-up
      6
      arrow-down
      2
      ·
      2 days ago

      Yeah, idk. I did some work with DeepSeek early on. I wasn’t impressed.

      HOWEVER…

      Some other things they’ve developed, like DeepSite, holy shit, impressive.