
How do I deploy multi-LoRA inference on Kylin OS with the 910B? #311

Open
Jayc-Z opened this issue Oct 29, 2024 · 3 comments

Comments


Jayc-Z commented Oct 29, 2024

Environment

Hardware Environment (Ascend/GPU/CPU):

Uncomment only one /device <> line, press Enter to put it on a new line, and remove the leading whitespace from that line:

/device ascend

/device gpu

/device cpu

Software Environment:

  • MindSpore version (source or binary):
  • Python version (e.g., Python 3.7.5):
  • OS platform and distribution (e.g., Linux Ubuntu 16.04):
  • GCC/Compiler version (if compiled from source):

Describe the current behavior

Describe the expected behavior

Steps to reproduce the issue

Related log / screenshot

Special notes for this issue

@longvoyage (Contributor)

This issue is a duplicate.
Also, there is no need to name a specific compute card in the title; stating the Ascend platform is enough.
Multi-LoRA inference is a mindformers feature. A related commit on the dev branch appears to have been closed without being merged, so it is presumably still under development.


Jayc-Z commented Oct 30, 2024

> This issue is a duplicate. Also, there is no need to name a specific compute card in the title; stating the Ascend platform is enough. Multi-LoRA inference is a mindformers feature. A related commit on the dev branch appears to have been closed without being merged, so it is presumably still under development.

Thank you for the reply. If I want to do what vLLM does, deploying a single base model (Qwen1.5_14B) that loads multiple LoRA adapters, do I need to call load_checkpoint multiple times? Or is the only option to merge each LoRA into the base model?
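
For reference, multi-LoRA serving in vLLM looks roughly like the sketch below: the base model is loaded once with enable_lora=True, and each request selects an adapter through a LoRARequest. The model name and adapter paths here are placeholders, not paths from this thread.

```python
# Sketch of vLLM-style multi-LoRA serving (placeholder model and adapter paths).
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# One base model with LoRA support enabled; adapters are attached per request.
llm = LLM(model="Qwen/Qwen1.5-14B-Chat", enable_lora=True)
params = SamplingParams(temperature=0.0, max_tokens=64)

# Each generate() call can point at a different adapter on the same base model.
out_a = llm.generate(["prompt for adapter A"], params,
                     lora_request=LoRARequest("adapter_a", 1, "/path/to/lora_a"))
out_b = llm.generate(["prompt for adapter B"], params,
                     lora_request=LoRARequest("adapter_b", 2, "/path/to/lora_b"))
```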

@longvoyage (Contributor)

You can take a look at this implementation; for the multi-LoRA case the idea should be to merge the multiple LoRA weights into a single ckpt.
https://gitee.com/mindspore/mindformers/pulls/3541/files
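
For illustration only, here is a minimal sketch of what merging several LoRA checkpoints into one ckpt could look like with the standard MindSpore checkpoint APIs. The adapter-prefix naming scheme and all file paths below are assumptions; the authoritative parameter layout is the one in the linked pull request.

```python
# Hypothetical sketch: combine one base checkpoint and several LoRA checkpoints
# into a single ckpt file using mindspore.load_checkpoint / save_checkpoint.
import mindspore as ms

base_ckpt = "qwen1_5_14b_base.ckpt"            # assumed path
lora_ckpts = {                                  # adapter name -> assumed path
    "adapter_a": "lora_a.ckpt",
    "adapter_b": "lora_b.ckpt",
}

merged = []

# Keep every base-model parameter as-is.
for name, param in ms.load_checkpoint(base_ckpt).items():
    merged.append({"name": name, "data": param})

# Add each adapter's LoRA weights under an adapter-specific prefix so that
# all adapters can coexist in one parameter dict (naming scheme is assumed).
for adapter, path in lora_ckpts.items():
    for name, param in ms.load_checkpoint(path).items():
        merged.append({"name": f"{adapter}.{name}", "data": param})

ms.save_checkpoint(merged, "qwen1_5_14b_multi_lora.ckpt")
```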
