Apple recently released OpenELM, a family of highly efficient language models, publishing the models on the Hugging Face Hub ahead of WWDC24.
OpenELM is distinguished by its open-source training and inference framework, aimed at promoting open research and ensuring the credibility of results.
The model uses a layer-wise scaling strategy, which allocates parameters non-uniformly across the layers of the Transformer rather than giving every layer the same width, and this improves accuracy for a given parameter budget.
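To make the idea concrete, here is a minimal Python sketch of layer-wise scaling: the number of attention heads and the feed-forward width are interpolated linearly from the first layer to the last. The function name, parameter names, and constants below are illustrative, not OpenELM's actual configuration.

```python
# Sketch of layer-wise scaling: per-layer widths are interpolated
# linearly from the first transformer layer to the last.
# All constants here are illustrative, not OpenELM's real config.

def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha_min=0.5, alpha_max=1.0,
                      beta_min=0.5, beta_max=4.0):
    """Return (num_heads, ffn_dim) for each layer index."""
    configs = []
    for i in range(num_layers):
        t = i / max(1, num_layers - 1)  # 0.0 at first layer, 1.0 at last
        alpha = alpha_min + (alpha_max - alpha_min) * t  # attention scaler
        beta = beta_min + (beta_max - beta_min) * t      # FFN scaler
        num_heads = max(1, int(alpha * d_model / head_dim))
        ffn_dim = int(beta * d_model)
        configs.append((num_heads, ffn_dim))
    return configs

# Example: a 4-layer toy model; early layers end up narrower than later ones.
for i, (heads, ffn) in enumerate(layerwise_scaling(4, d_model=512, head_dim=64)):
    print(f"layer {i}: heads={heads}, ffn_dim={ffn}")
```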
At a budget of approximately one billion parameters, OpenELM achieves 2.36% higher accuracy than the comparable OLMo model while requiring only half the pre-training tokens.
Unlike previous releases that provided only model weights and inference code, with pre-training done on private datasets, OpenELM ships with the complete framework for training and evaluating the model on publicly available datasets.
This includes training logs, multiple checkpoints, and pre-training configurations.
Apple has also released code to convert the models to its MLX library, enabling inference and fine-tuning directly on Apple devices.
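If you want to try an MLX-converted checkpoint on an Apple Silicon Mac, a minimal sketch using the community mlx-lm package (installed via pip install mlx-lm) looks like the following. The checkpoint id is an assumption for illustration; substitute whichever converted OpenELM model you actually use, and note that Apple's own conversion scripts may differ.

```python
# Hedged sketch: generating text from an MLX-converted OpenELM
# checkpoint on Apple Silicon using the mlx-lm package.
from mlx_lm import load, generate

# Hypothetical checkpoint id; replace with a real converted model.
model, tokenizer = load("mlx-community/OpenELM-270M-Instruct")

response = generate(model, tokenizer,
                    prompt="Once upon a time there was",
                    max_tokens=50)
print(response)
```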
This comprehensive release is designed to strengthen the open research community and pave the way for future open research initiatives.
Source: OpenELM: An Efficient Language Model Family with Open Training and Inference Framework