Efficient Paper

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Key idea:

Load-as-Sparse and Compute-as-Dense
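
The note above gives only the slogan, so the following is a minimal CPU sketch of one way to read it, not the Flash-LLM kernel itself. The idea illustrated: keep the weight matrix in a compressed form so that less data has to be moved from memory, expand it into a zero-filled dense buffer just before use, and then run an ordinary dense matrix multiply that does not skip the zeros. The storage layout (a list of `(flat_index, value)` pairs) and all function names here are assumptions made for illustration; they are not the format or API used by the paper.

```python
def compress(dense):
    """Keep only the nonzeros as (flat_index, value) pairs.

    Hypothetical storage layout: this is the 'sparse' form that would be
    loaded from memory instead of the full matrix.
    """
    cols = len(dense[0])
    return [
        (r * cols + c, v)
        for r, row in enumerate(dense)
        for c, v in enumerate(row)
        if v != 0
    ]


def expand(nonzeros, rows, cols):
    """Scatter the nonzeros into a zero-filled dense buffer.

    The buffer stands in for fast on-chip memory in this sketch.
    """
    buf = [[0.0] * cols for _ in range(rows)]
    for idx, v in nonzeros:
        buf[idx // cols][idx % cols] = v
    return buf


def dense_matmul(a, b):
    """Plain dense matrix multiply; zeros are multiplied like any other value."""
    m, inner, n = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(n)]
        for i in range(m)
    ]


def load_sparse_compute_dense(nonzeros, rows, cols, b):
    """Load the compressed weights, expand them, then compute densely."""
    return dense_matmul(expand(nonzeros, rows, cols), b)


# Usage: a 2x3 weight matrix with two nonzeros, multiplied by a 3x2 input.
weights = [[0, 2, 0], [1, 0, 0]]
inputs = [[1, 2], [3, 4], [5, 6]]
packed = compress(weights)
result = load_sparse_compute_dense(packed, 2, 3, inputs)
```

The dense multiply performs redundant work on the zeros, but the amount of weight data that has to be loaded shrinks with the sparsity. That trade is attractive when the computation is limited by memory bandwidth rather than by arithmetic.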