# LLM
## Pretrained Model
### Decoding
Title | Year | Author | Link | Memo |
---|---|---|---|---|
Confident Adaptive Language Modeling | 2022 NeurIPS | Schuster, Tal, et al. | pdf, blog | early exit at an earlier decoder layer once the token prediction is confident (sketch below) |
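
A minimal sketch of the early-exit idea above, assuming per-layer hidden states and a shared output head; the softmax-confidence criterion and all names here are illustrative simplifications (the paper also studies other confidence measures and calibrates the threshold).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_token(hidden_states, output_head, threshold=0.9):
    """Exit at the first decoder layer whose top-1 probability clears
    the threshold, skipping the remaining (more expensive) layers."""
    for layer_idx, h in enumerate(hidden_states):      # shallow -> deep
        probs = softmax(output_head @ h)
        if probs.max() >= threshold:                   # confident: exit early
            return int(probs.argmax()), layer_idx
    return int(probs.argmax()), layer_idx              # fell through: use last layer

# toy usage: 6 layers, hidden size 8, vocab size 20
rng = np.random.default_rng(0)
hidden = [rng.normal(size=8) for _ in range(6)]
head = rng.normal(size=(20, 8))
token, exit_layer = decode_token(hidden, head)
print(token, exit_layer)
```
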
## Instruction Model
### Evaluation
Title | Year | Author | Link | Memo |
---|---|---|---|---|
The Turking Test: Can Language Models Understand Instructions? | 2020 | Efrat, Avia, and Omer Levy | | |
### Dataset
Check this repo.
Title | Year | Author | Link | Memo |
---|---|---|---|---|
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI | 2023 | Zhang, Jianguo, et al. | pdf, dataset | unified collection of dialogue datasets (loading example below) |
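
A hypothetical loading snippet via Hugging Face `datasets`; the hub id `Salesforce/dialogstudio` and the config name are assumptions based on the project's release and may need checking against the repo.

```python
from datasets import load_dataset

# one config per constituent dataset; "MULTIWOZ2_2" is one example name
ds = load_dataset("Salesforce/dialogstudio", "MULTIWOZ2_2")
print(ds["train"][0])
```
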
## Decoding
Title | Year | Author | Link | Memo |
---|---|---|---|---|
Fast Inference from Transformers via Speculative Decoding | 2023 | Leviathan, Yaniv, et al. | | speculative decoding: a cheap draft model proposes tokens that the large target model verifies in parallel (sketch below) |
Accelerating Transformer Inference for Translation via Parallel Decoding | 2023 | Santilli, Andrea, et al. | | parallel (Jacobi-style) decoding: all positions are refined in parallel until a fixed point (sketch below) |
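
A minimal, self-contained sketch of speculative decoding; `draft_prob` and `target_prob` are toy stand-ins for real model calls, while the accept/resample rule follows the paper's rejection-sampling scheme, which leaves the target distribution unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 10

def draft_prob(prefix):    # stand-in for the cheap draft model
    p = rng.random(VOCAB)
    return p / p.sum()

def target_prob(prefix):   # stand-in for the expensive target model
    p = rng.random(VOCAB)
    return p / p.sum()

def speculative_step(prefix, k=4):
    # 1) the draft model proposes k tokens autoregressively
    drafted, qs = [], []
    for _ in range(k):
        q = draft_prob(prefix + drafted)
        drafted.append(int(rng.choice(VOCAB, p=q)))
        qs.append(q)
    # 2) the target model verifies them (one batched pass in practice)
    accepted = []
    for i, t in enumerate(drafted):
        p = target_prob(prefix + accepted)
        if rng.random() < min(1.0, p[t] / qs[i][t]):
            accepted.append(t)                       # draft token accepted
        else:
            # rejected: resample from the residual max(p - q, 0)
            r = np.maximum(p - qs[i], 0)
            r = r / r.sum() if r.sum() > 0 else p
            accepted.append(int(rng.choice(VOCAB, p=r)))
            break
    # (the paper samples one extra token when all k drafts are accepted)
    return accepted

print(speculative_step([1, 2, 3]))
```
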
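
For the parallel-decoding row, a toy sketch of the Jacobi-style view: greedy autoregressive decoding is the fixed point of updating every position from the current guesses, so all positions can be refined in parallel until the sequence stops changing. `greedy_next` is a deterministic stand-in for a real model.

```python
def greedy_next(prefix):
    # toy deterministic "model": hashes the prefix into a next token
    return (sum(prefix) * 2654435761 + len(prefix)) % 10

def jacobi_decode(prefix, m=5, max_iters=20):
    guess = [0] * m                            # arbitrary initialization
    for _ in range(max_iters):
        # one parallel pass: every position conditions on the current guesses
        new = [greedy_next(prefix + guess[:i]) for i in range(m)]
        if new == guess:                       # fixed point = greedy AR output
            return guess
        guess = new
    return guess

print(jacobi_decode([3, 1, 4]))
```

Since position i depends only on positions before it, the fixed point is reached in at most m passes, and often far fewer, which is where the speedup comes from.
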
## Others
Title | Year | Author | Link | Memo |
---|---|---|---|---|
A Watermark for Large Language Models | 2023 ICML | Kirchenbauer, John, et al. | | hashes the previous token to split the vocabulary into a "green" and a "red" list and biases generation toward the green list (sketch below) |
Text Embeddings Reveal (Almost) As Much As Text | 2023 | Morris, John X., et al. | | trains an inversion model that maps an embedding (plus the previous text hypothesis) back to text; iterative refinement almost recovers the original text (sketch below) |
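
A minimal sketch of the green-list watermark: the previous token seeds a split of the vocabulary, and a bias `DELTA` is added to the "green" logits before sampling. The constants are illustrative, and the seeded RNG stands in for the paper's keyed hash.

```python
import numpy as np

VOCAB, GAMMA, DELTA = 50, 0.5, 2.0   # vocab size, green fraction, logit bias

def green_list(prev_token):
    # the paper uses a keyed hash of the previous token; a seeded RNG
    # stands in for it here
    perm = np.random.default_rng(prev_token).permutation(VOCAB)
    return set(perm[: int(GAMMA * VOCAB)])

def watermarked_sample(logits, prev_token, rng):
    g = green_list(prev_token)
    biased = logits + np.array([DELTA if t in g else 0.0 for t in range(VOCAB)])
    p = np.exp(biased - biased.max())
    p /= p.sum()
    return int(rng.choice(VOCAB, p=p))

rng = np.random.default_rng(1)
print(watermarked_sample(rng.normal(size=VOCAB), prev_token=7, rng=rng))
```

A detector that knows the hash can recompute the green lists and flag text whose green-token fraction is improbably high.
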
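
For the embedding-inversion row, a toy, runnable illustration of the iterative-refinement loop only: the real system uses a trained encoder and a learned corrector model, replaced here by a character-histogram "embedding" and a brute-force corrector, so it recovers the character multiset rather than the actual text.

```python
import numpy as np
import string

ALPHABET = string.ascii_lowercase + " "

def embed(text):                         # stand-in for a trained text encoder
    v = np.zeros(len(ALPHABET))
    for c in text:
        v[ALPHABET.index(c)] += 1
    return v

def correct(target_emb, hyp):
    """Stand-in corrector: try appending each character and keep the one
    that moves the hypothesis embedding closest to the target embedding."""
    best, best_d = hyp, np.linalg.norm(embed(hyp) - target_emb)
    for c in ALPHABET:
        d = np.linalg.norm(embed(hyp + c) - target_emb)
        if d < best_d:
            best, best_d = hyp + c, d
    return best

def invert(target_emb, steps=30):
    hyp = ""                             # start from an empty hypothesis
    for _ in range(steps):
        new = correct(target_emb, hyp)
        if new == hyp:                   # converged to a fixed point
            return hyp
        hyp = new
    return hyp

print(invert(embed("hello world")))      # recovers the characters, not their order
```
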