[Exception] vLLM distributed cluster times out loading a HuggingFace model: 'timed out' thrown while requesting HEAD https://huggingface.co/Qwe

Published: 2026/6/26 20:18:04

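Troubleshooting and Resolving HuggingFace Model Load Timeouts in a vLLM Distributed Cluster

1. Error Details

When a vLLM distributed inference service is deployed on Ray, the service fails to finish loading the model after the launch command is run. The log repeatedly shows requests for the model's configuration file timing out and being retried.

Example launch command:

    user@workstation:~/vllm-cluster$ ./launch-cluster.sh
    exec vllm serve \
      Qwen/Qwen2-7B-Instruct \
      --host 0.0.0.0 \
      --port 8000 \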