Popular Articles
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation Models with Mobile Edge Computing
FP8-LM: Training FP8 Large Language Models
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
8-bit Optimizers via Block-wise Quantization