HLLM Wiki

Welcome!

HLLM (Harden LLM) is a project to harden large language models for extreme inference speed. This wiki collects notes and resources for the HLLM project. It is a work in progress and will be updated regularly.

Why do we need a wiki?

We need a wiki to document the project and to share what we learn with the community. It helps us stay organized and makes the project more accessible to newcomers and contributors.

Main topics

The motivation for HLLM
The design of HLLM
The implementation of HLLM
Development notes and tips