
How do I deploy multi-LoRA inference on Kylin OS with the 910B? #311

Open
Jayc-Z opened this issue Oct 29, 2024 · 3 comments

Comments


Jayc-Z commented Oct 29, 2024

Environment

Hardware Environment (Ascend/GPU/CPU):

Uncomment only one /device <> line, press Enter to put it on a new line, and remove the leading whitespace from that line:

/device ascend

/device gpu

/device cpu

Software Environment:

  • MindSpore version (source or binary):
  • Python version (e.g., Python 3.7.5):
  • OS platform and distribution (e.g., Linux Ubuntu 16.04):
  • GCC/Compiler version (if compiled from source):

Describe the current behavior

Describe the expected behavior

Steps to reproduce the issue

Related log / screenshot

Special notes for this issue

@longvoyage (Contributor)

This issue is a duplicate.
Also, there is no need to name a specific compute card in the title; stating the Ascend platform is enough.
Multi-LoRA inference is a mindformers feature. A related commit on the dev branch appears to have been closed without being merged, so it is presumably still under development.


Jayc-Z commented Oct 30, 2024

> This issue is a duplicate. Also, there is no need to name a specific compute card in the title; stating the Ascend platform is enough. Multi-LoRA inference is a mindformers feature. A related commit on the dev branch appears to have been closed without being merged, so it is presumably still under development.

Thank you for the reply. If I want to do what vLLM does, deploying a single base model (Qwen1.5_14B) that loads multiple LoRA adapters, do I need to call load_checkpoint multiple times? Or is the only option to merge each LoRA into the base model?
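
For reference, multi-LoRA serving in vLLM looks roughly like the sketch below: the base model is loaded once with enable_lora=True, and each request selects an adapter through a LoRARequest. The model name and adapter paths here are placeholders, not paths from this thread.

```python
# Sketch of vLLM-style multi-LoRA serving (placeholder model and adapter paths).
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# One base model with LoRA support enabled; adapters are attached per request.
llm = LLM(model="Qwen/Qwen1.5-14B-Chat", enable_lora=True)
params = SamplingParams(temperature=0.0, max_tokens=64)

# Each generate() call can point at a different adapter on the same base model.
out_a = llm.generate(["prompt for adapter A"], params,
                     lora_request=LoRARequest("adapter_a", 1, "/path/to/lora_a"))
out_b = llm.generate(["prompt for adapter B"], params,
                     lora_request=LoRARequest("adapter_b", 2, "/path/to/lora_b"))
```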

@longvoyage (Contributor)

You can take a look at this implementation; for the multi-LoRA case the idea should be to merge the multiple LoRA weights into a single ckpt.
https://gitee.com/mindspore/mindformers/pulls/3541/files
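
For illustration only, here is a minimal sketch of what merging several LoRA checkpoints into one ckpt could look like with the standard MindSpore checkpoint APIs. The adapter-prefix naming scheme and all file paths below are assumptions; the authoritative parameter layout is the one in the linked pull request.

```python
# Hypothetical sketch: combine one base checkpoint and several LoRA checkpoints
# into a single ckpt file using mindspore.load_checkpoint / save_checkpoint.
import mindspore as ms

base_ckpt = "qwen1_5_14b_base.ckpt"            # assumed path
lora_ckpts = {                                  # adapter name -> assumed path
    "adapter_a": "lora_a.ckpt",
    "adapter_b": "lora_b.ckpt",
}

merged = []

# Keep every base-model parameter as-is.
for name, param in ms.load_checkpoint(base_ckpt).items():
    merged.append({"name": name, "data": param})

# Add each adapter's LoRA weights under an adapter-specific prefix so that
# all adapters can coexist in one parameter dict (naming scheme is assumed).
for adapter, path in lora_ckpts.items():
    for name, param in ms.load_checkpoint(path).items():
        merged.append({"name": f"{adapter}.{name}", "data": param})

ms.save_checkpoint(merged, "qwen1_5_14b_multi_lora.ckpt")
```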
