So, what exactly is a MoE? In the context of transformer models, a MoE consists of two main elements:
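- Sparse MoE layers, which are used in place of dense feed-forward network (FFN) layers and contain a number of "experts", each of which is itself a neural network (in practice, usually an FFN)
- A gate network or router, which determines which tokens are sent to which expert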
Source: Mixture of Experts Explained
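
To make the two elements concrete, here is a minimal sketch in plain Python of how a router and a set of experts might fit together for a single token. It is an illustration under stated assumptions, not the implementation of any particular model: the names `Expert`, `SparseMoELayer`, `route`, `d_hidden` and `top_k` are hypothetical, the weights are random, and normalising the gate weights with a softmax over only the selected experts is one common choice among several.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Expert:
    """One expert: a tiny feed-forward network (linear -> ReLU -> linear)."""

    def __init__(self, d_model, d_hidden, rng):
        self.w_in = random_matrix(d_hidden, d_model, rng)
        self.w_out = random_matrix(d_model, d_hidden, rng)

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.w_in, x)]
        return matvec(self.w_out, hidden)


class SparseMoELayer:
    """Hypothetical sparse MoE layer: a router plus several experts."""

    def __init__(self, d_model, d_hidden, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.experts = [Expert(d_model, d_hidden, rng) for _ in range(num_experts)]
        # The router is a single linear map: one score (logit) per expert.
        self.router = random_matrix(num_experts, d_model, rng)
        self.top_k = top_k

    def route(self, x):
        """Return (expert_index, gate_weight) pairs for the top_k experts."""
        logits = matvec(self.router, x)
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        chosen = ranked[: self.top_k]
        # Assumption: gate weights are a softmax over the selected logits only.
        weights = softmax([logits[i] for i in chosen])
        return list(zip(chosen, weights))

    def __call__(self, x):
        # Only the selected experts run; this is what makes the layer sparse.
        out = [0.0] * len(x)
        for idx, weight in self.route(x):
            expert_out = self.experts[idx](x)
            out = [o + weight * e for o, e in zip(out, expert_out)]
        return out


layer = SparseMoELayer(d_model=4, d_hidden=8, num_experts=4, top_k=2)
token = [0.1, -0.2, 0.3, 0.4]
routing = layer.route(token)   # which experts this token goes to, and their weights
output = layer(token)          # weighted sum of the selected experts' outputs
```

With `num_experts=4` and `top_k=2`, each token only pays the compute cost of two experts even though the layer holds the parameters of four, which is the trade-off that sparse MoE layers are built around.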