Comparison with LookAhead #2
Thank you for the compliment!
TBH I don't yet understand lookahead decoding completely, so I can't comment on that here.
Are you suggesting this for the draft model or the main model? This might help make draft tokens faster, but I feel it won't give good results, since the previous token is probably very important when predicting the next one. Medusa requires some training to be able to do this: https://github.com/FasterDecoding/Medusa
Yeah, I'm not sure it would work, but it may be worth a try. I think guessing the previous token randomly is pretty bad, because token prediction depends so much on the previous one. However, if a null embedding (and/or attention mask) is placed on token i, there may be some way of getting a reasonable estimate of token i+1. But yeah, the prediction may still be too bad. Medusa is a cool concept, but it's really annoying to have to train the in-built draft model.
If someone can figure out a 'training-free Medusa', that's probably a million-dollar idea 😸
AttributeError: 'MistralForCausalLM' object has no attribute '_extend_attention_mask'
This is a cool project.
I guess you're using the prompt for lookahead, but you could also pull some future guess tokens into the ngram lookup table, maybe as LookaheadDecoding does?
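As a rough illustration of that ngram-lookup idea, here is a minimal sketch: build a table of continuations from the prompt (and optionally from previously guessed or accepted tokens), then use it to propose a short draft for the main model to verify. The helper names and parameters are made up for illustration and are not this repo's actual implementation.

```python
# Hypothetical sketch of a prompt ngram lookup table, optionally extended
# with guessed tokens -- illustrative only, not this repo's implementation.
from collections import defaultdict

def build_ngram_table(token_ids, n=3):
    # Map each (n-1)-token prefix to the tokens that followed it in the text.
    table = defaultdict(list)
    for i in range(len(token_ids) - n + 1):
        prefix = tuple(token_ids[i : i + n - 1])
        table[prefix].append(token_ids[i + n - 1])
    return table

def propose_draft(token_ids, table, n=3, max_draft=5):
    # Greedily extend the sequence using the table; the main model would then
    # verify these draft tokens in a single forward pass.
    draft, context = [], list(token_ids)
    for _ in range(max_draft):
        prefix = tuple(context[-(n - 1):])
        if prefix not in table:
            break
        nxt = table[prefix][-1]  # take the most recent continuation seen
        draft.append(nxt)
        context.append(nxt)
    return draft

prompt_ids = [5, 9, 2, 7, 4, 9, 2]          # toy token ids
table = build_ngram_table(prompt_ids, n=3)
print(propose_draft(prompt_ids, table))      # -> [7, 4, 9, 2, 7]
```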
I was also thinking that it should be possible to use an LLM to predict forward tokens just by passing blank (zero) embedding vectors for a few positions ahead. See more here
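A rough, untested sketch of that "blank embedding" probe with Hugging Face transformers (the model name is just an example): append a zero vector after the real token embeddings and read the logits at that placeholder position as a guess for the token two steps ahead.

```python
# Untested sketch of the zero-embedding probe suggested above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-v0.1"   # placeholder; any causal LM would do
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_ids = tok("The capital of France is", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(input_ids)   # shape [1, T, d]
blank = torch.zeros_like(embeds[:, :1, :])         # one zero "future" slot
probed = torch.cat([embeds, blank], dim=1)         # shape [1, T+1, d]

with torch.no_grad():
    logits = model(inputs_embeds=probed).logits    # shape [1, T+1, vocab]

next_tok = logits[0, -2].argmax().item()  # ordinary prediction for position T
skip_tok = logits[0, -1].argmax().item()  # guess for position T+1, with a blank at T
print(tok.decode([next_tok]), "|", tok.decode([skip_tok]))
```

Whether the guess at the blank position is accurate enough to be useful as a draft token is exactly the open question discussed in this thread.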