Beijing · 11:50 (UTC +08:00)
Pinned
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
- bytedance/lightseq: A high-performance library for sequence processing and generation
- microsoft/Megatron-DeepSpeed (forked from NVIDIA/Megatron-LM): Ongoing research training transformer language models at scale, including BERT & GPT-2
- AniZpZ/AutoSmoothQuant: An easy-to-use package for implementing SmoothQuant for LLMs
- IST-DASLab/marlin: FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens