@KnowingNothing

In some scenarios, such as speculative decoding, a sequence may need to pop every token in its last block. This PR allows PopN to pop all the tokens in the last block of the KV cache.
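
To illustrate the boundary case, here is a minimal Python sketch of a block-structured sequence whose pop accepts `n` equal to the length of the last block. This is a toy model, not the TVM implementation: `ToySequence`, `block_lengths`, and `pop_n` are hypothetical names, and the sketch assumes a pop never reaches past the last block.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySequence:
    """Hypothetical model of one sequence: the token count held by each block."""

    block_lengths: List[int] = field(default_factory=list)

    def pop_n(self, n: int) -> None:
        """Remove the last n tokens, all of which must live in the last block."""
        if n < 0:
            raise ValueError("n must be non-negative")
        if not self.block_lengths:
            raise ValueError("sequence has no blocks")
        last = self.block_lengths[-1]
        # n == last is allowed, so the last block may be emptied entirely.
        # Only popping more tokens than the last block holds is rejected.
        if n > last:
            raise ValueError(f"cannot pop {n} tokens from a block of length {last}")
        self.block_lengths[-1] = last - n

    @property
    def total_tokens(self) -> int:
        return sum(self.block_lengths)


# Speculative decoding example: 5 draft tokens sit in the last block
# and every one of them is rejected, so all 5 are popped.
seq = ToySequence(block_lengths=[16, 5])
seq.pop_n(5)
```

In this sketch the emptied block stays in place with length 0. Whether a real cache keeps or releases such a block is a design choice the sketch does not address.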

@tqchen tqchen merged commit 0aae97d into apache:main Apr 11, 2024
