
[Bug]: Incorrect attribute reference in vllm/lora/layers.py #11707

Closed
1 task done
zinccat opened this issue Jan 3, 2025 · 0 comments · Fixed by #11708
Labels: bug (Something isn't working)


zinccat (Contributor) commented Jan 3, 2025

Your current environment

The output of `python collect_env.py` was not provided (the template placeholder was left unfilled).

Model Input Dumps

No response

🐛 Describe the bug

`self.output_dim` should be changed to `self.output_size` in `ColumnParallelLinearWithLoRA.slice_lora_b` and `slice_bias`: `output_dim` is never set on this class, while `__init__` assigns `self.output_size = self.base_layer.output_size_per_partition`.

class ColumnParallelLinearWithLoRA(BaseLinearLayerWithLoRA):
    """
    LoRA on top of ColumnParallelLinear layer.
    LoRA B is sliced for tensor parallelism.
    There are two types for the `base_layer`:
    1. ColumnParallelLinear, e.g.`dense_h_to_4h` in `FalconForCausalLM`.
    2. MergedColumnParallelLinear, e.g.`gate_up_proj` in `Phi3ForCausalLM`.
    """

    def __init__(self, base_layer: ColumnParallelLinear) -> None:
        super().__init__(base_layer)
        # The base_layer type is ColumnParallelLinear or
        # MergedColumnParallelLinear, their weight sharding logic is
        # inconsistent when TP is greater than 1.
        self.is_merged_col_linear = type(
            base_layer) is MergedColumnParallelLinear
        self.tp_size = get_tensor_model_parallel_world_size()
        self.output_size = self.base_layer.output_size_per_partition
        # There is only one LoRA layer
        self.n_slices = 1

    def slice_lora_b(self, lora_b: torch.Tensor) -> torch.Tensor:
        # Applicable to cases where the base_layer is
        # MergedColumnParallelLinear.
        if self.is_merged_col_linear:
            tp_rank = get_tensor_model_parallel_rank()
            shard_size = self.output_size // 2
            offset = lora_b.shape[-1] // 2

            left_weight = lora_b[:, tp_rank * shard_size:(tp_rank + 1) *
                                 shard_size]
            right_weight = lora_b[:, offset + tp_rank * shard_size:offset +
                                  (tp_rank + 1) * shard_size]
            lora_b = torch.cat([left_weight, right_weight], dim=1)
        # Applicable to cases where the base_layer is
        # ColumnParallelLinear.
        else:
            tensor_model_parallel_rank = get_tensor_model_parallel_rank()
            shard_size = self.output_dim  # BUG: output_dim is not defined; should be self.output_size
            start_idx = tensor_model_parallel_rank * shard_size
            end_idx = (tensor_model_parallel_rank + 1) * shard_size
            lora_b = lora_b[:, start_idx:end_idx]
        return lora_b

    def slice_bias(self, bias: torch.Tensor) -> torch.Tensor:
        # TODO: Fix the slicing logic of bias.
        if bias is None:
            return bias
        tensor_model_parallel_rank = get_tensor_model_parallel_rank()
        shard_size = self.output_dim  # BUG: output_dim is not defined; should be self.output_size
        start_idx = tensor_model_parallel_rank * shard_size
        end_idx = (tensor_model_parallel_rank + 1) * shard_size
        bias = bias[start_idx:end_idx]
        return bias

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.