
The path to open-sourcing the DeepSeek inference engine
ozgune 6 days ago
In March, vLLM picked up some of the improvements from the DeepSeek paper. With these, vLLM v0.7.3's DeepSeek performance jumped to roughly 3x what it was before [1].

What's exciting is that there's still so much room for improvement. We benchmark around 5K total tokens/s with the sharegpt dataset and 12K total tokens/s with random 2000/100, using vLLM under high concurrency.

The DeepSeek-V3/R1 Inference System Overview [2] states: "Each H800 node delivers an average throughput of 73.7k tokens/s input (including cache hits) during prefilling or 14.8k tokens/s output during decoding."

Yes, DeepSeek deploys a different inference architecture. But this goes to show just how much room there is for improvement. Looking forward to more open source!
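To put the gap in rough terms, here is a back-of-the-envelope comparison of the decode numbers quoted above. The variable names are mine, and the comparison is only indicative: the vLLM benchmark and DeepSeek's per-node figure come from different setups, hardware counts, and workloads, so this is not apples-to-apples.

```python
# Decode-throughput figures quoted in the comment above.
vllm_decode_tps = 5_000       # vLLM, sharegpt dataset, high concurrency
deepseek_decode_tps = 14_800  # DeepSeek, output tokens/s per H800 node

# Ratio between the two reported numbers (indicative only; setups differ).
gap = deepseek_decode_tps / vllm_decode_tps
print(f"DeepSeek's reported decode throughput is ~{gap:.1f}x the vLLM benchmark")
```

Even read loosely, a ~3x gap on decode suggests substantial headroom left in the open-source stack.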

[1] https://developers.redhat.com/articles/2025/03/19/how-we-opt...

[2] https://github.com/deepseek-ai/open-infra-index/blob/main/20...


vintagedave 6 days ago
I really empathised with this part:

> Codebase Divergence: Our engine is based on an early fork of vLLM from over a year ago. Although structurally similar, we’ve heavily customized it for DeepSeek models, making it difficult to extend for broader use cases.

I've been there. Probably a few of us have.

Their approach of splitting out maintainable sublibraries and sharing information directly, even when it can't be integrated upstream, seems a really nice way of working with the community -- i.e., they have obstacles, but they're not letting those obstacles push them toward the easy route of not contributing at all. And while someone wanting to use their techniques might prefer working code over descriptions of the techniques, it's still knowledge sharing, and it would be easier for them not to do it at all. So kudos to them.


avodonosov 6 days ago
What motivates the commercial AI companies to share their research results and know-how?

Why did Google publish the Transformer architecture instead of keeping it to itself?

I understand that people may want to do good for humanity, facilitate progress, etc. But if an action goes against commercial interest, how can company management take it without drawing objections from shareholders?

Or is there a commercial logic that motivates sharing information and intellectual property? If so, what is it?


londons_explore 5 days ago
"We have something that would be of interest to the open source community, but it needs a lot of tidying to even run outside our company, and we don't have the manpower to properly maintain it when released".

Plenty of companies are in this position.

Please just open source anyway with a note saying "we won't be maintaining this, but feel free to fork!"


oldgun 6 days ago
Nice. We've seen some good engineering work from DeepSeek. Keep it coming.