DeepSeek Engram: Conditional Memory via Scalable Lookup [pdf]

(github.com)

4 points | by natrys 9 hours ago

1 comment

alyxya 4 hours ago
Unlike most LLM improvements, which modify the architecture, the optimizer, or some other part of the model itself, this paper describes a novel technique that adds an external lookup table to the forward pass, with the lookup running in parallel with some of the compute. It's a really interesting idea with a lot of cool engineering work behind it, but it looks too convoluted, without improvements large enough to justify the complexity.
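
To make the "lookup in parallel with compute" part concrete, here is a toy sketch of the general pattern only. It is not the paper's method: the table contents, the bigram hash, the `dense_compute` stand-in, and every name below are hypothetical and chosen purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy table: each slot holds a small fixed "memory" vector.
TABLE_SIZE = 8
TABLE = [[float(i), float(i) * 0.5] for i in range(TABLE_SIZE)]


def bucket(prev_tok: int, tok: int) -> int:
    """Deterministic toy hash of a token pair into a table slot (an assumption)."""
    return (prev_tok * 31 + tok) % TABLE_SIZE


def lookup(tokens):
    """Fetch one memory vector per position; position 0 is paired with token 0."""
    prevs = [0] + tokens[:-1]
    return [TABLE[bucket(p, t)] for p, t in zip(prevs, tokens)]


def dense_compute(hidden):
    """Stand-in for the model's ordinary layer compute: doubles every value."""
    return [[2.0 * x for x in row] for row in hidden]


def forward(tokens, hidden):
    """Overlap the table lookup with the dense compute, then add the results."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(lookup, tokens)  # lookup starts in the background
        computed = dense_compute(hidden)       # main compute proceeds meanwhile
        memory = pending.result()              # wait for the lookup before combining
    return [
        [c + m for c, m in zip(c_row, m_row)]
        for c_row, m_row in zip(computed, memory)
    ]
```

The lookup depends only on the input tokens and not on the hidden state, which is what lets it be issued early and overlapped with the other work. Calling `forward([1, 2], [[1.0, 1.0], [0.0, 0.0]])` returns `[[3.0, 2.5], [1.0, 0.5]]`.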