LM

Large Model

CV Large Models

facebookresearch/segment-anything
UX-Decoder/Segment-Everything-Everywhere-All-At-Once
- SEEM: Segment Everything Everywhere All at Once
baaivision/Painter
- Painter & SegGPT Series: Vision Foundation Models from BAAI

Large Model Training Frameworks

Large Model Inference Frameworks

sgl-project/sglang
vllm-project/vllm
huggingface/text-generation-inference
- TGI
InternLM/lmdeploy
ggerganov/llama.cpp
ollama/ollama
gpustack/gpustack
NVIDIA/TensorRT-LLM
- NVIDIA/FasterTransformer
  - Deprecated; use TensorRT-LLM instead
deepspeedai/DeepSpeed-MII
microsoft/onnxruntime
ModelTC/lightllm

Tutorials

rasbt/LLMs-from-scratch

Scheduling and Orchestration

ray-project/ray
kubeflow/kubeflow
mlflow/mlflow
sql-machine-learning/sqlflow
- Brings SQL and AI together.

Accelerating Pandas

Deploying Model Inference

Large Model Training

There are currently two main technical routes for training very large language models:

TPU + XLA + TensorFlow/JAX
- tensorflow/tensorflow
- openxla/xla
- google/jax
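- Led by Google. Because TPUs are tightly bound to Google's own cloud platform (GCP), anyone outside Google can mostly admire this stack from a distance rather than tinker with it.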
GPU + PyTorch + Megatron-LM + DeepSpeed
- pytorch/pytorch
- NVIDIA/Megatron-LM
- microsoft/DeepSpeed
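- Backed by major companies such as NVIDIA, Meta, and Microsoft. The community is active, and this route is more popular with the wider public.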
horovod/horovod
bytedance/byteps