Posts
LLaMA3 suffers highly from quantization
Read more on Substack or Medium
Thu May 16 2024
Sparse Llama by Neural Magic, Cerebras and IST Austria!
Read more on Substack or Medium
Thu May 16 2024
Understanding OpenAI's CLIP architecture
Read more on Substack
Tue May 14 2024
GPTQ: LLM Quantization
Read more on Substack
Wed May 07 2024
SmoothQuant: Smoothing systematic outliers in LLMs for efficient quantization
Read more on Substack
Wed May 08 2024
DeepMind's Speculative Sampling
Read more on Substack
Mon Feb 12 2024
Attention is all you need to understand!
Read more on Substack
Mon Feb 12 2024
The Best Things
A collection of all the best things
Read More →