Popular Articles
Are Protein Language Models Compute Optimal?
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Sakuga-42M Dataset: Scaling Up Cartoon Research
Pretraining on the Test Set Is All You Need
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Scaling MLPs: A Tale of Inductive Bias
Observational Scaling Laws and the Predictability of Language Model Performance
Scaling Laws for Transfer
Grandmaster-Level Chess Without Search
The Quantization Model of Neural Scaling