Welcome!
HLLM stands for Harden LLM, a project to harden large language models for extreme inference speed. This wiki collects notes and resources for the HLLM project. It is a work in progress and will be updated regularly.
Why do we need a wiki?
We need a wiki to document the project and to share what we learn with the community. It helps us stay organized and makes the project more accessible to newcomers.
Main topics
- The motivation for HLLM
- The design of HLLM
- The implementation of HLLM
- Development notes and tips