Popular Articles

Characterizing the Accuracy-Efficiency Trade-off of Low-rank Decomposition in Language Models

1 month ago

Compute Better Spent: Replacing Dense Layers with Structured Matrices

2 weeks ago

MABViT -- Modified Attention Block Enhances Vision Transformers

2 months ago

Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity

7 months ago