German AI startup Aleph Alpha has introduced its latest foundation model family, Pharia-1-LLM, which includes the Pharia-1-LLM-7B-control and Pharia-1-LLM-7B-control-aligned models. These new AI models are available under the Open Aleph License, permitting non-commercial research and educational use.
Key Features of Pharia-1-LLM Models
- Pharia-1-LLM-7B-control: Designed for concise, length-controlled responses, this model is optimized for German, French, and Spanish. It was trained on a multilingual corpus curated to comply with EU and national regulations, including copyright and data privacy laws, making it suitable for domain-specific applications in industries such as automotive and engineering.
- Pharia-1-LLM-7B-control-aligned: This variant adds safety guardrails through alignment methods, making it well suited to conversational settings such as chatbots and virtual assistants, where safety and clarity are essential (see the loading sketch after this list).
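For readers who want to try the models, a minimal sketch of loading and prompting the control model with Hugging Face transformers might look like the following. The repository ID and the trust_remote_code flag are assumptions about how the checkpoint is published, not details from the announcement.

```python
# Minimal sketch: prompting Pharia-1-LLM-7B-control via Hugging Face transformers.
# Assumptions: the repo id "Aleph-Alpha/Pharia-1-LLM-7B-control" and the need for
# trust_remote_code=True are illustrative guesses, not confirmed by the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Aleph-Alpha/Pharia-1-LLM-7B-control"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision so a 7B model fits on one GPU
    device_map="auto",
    trust_remote_code=True,       # in case the repo ships custom architecture code
)

# German prompt; the model is tuned for concise, length-controlled answers.
prompt = "Erkläre in zwei Sätzen, was ein Sprachmodell ist."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```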
Training and Evaluation
The underlying Pharia-1-LLM-7B model was trained in two phases: first on a 4.7-trillion-token dataset using 256 A100 GPUs, and then on an additional 3 trillion tokens with a different data mix using 256 H100 GPUs. Training relied on mixed-precision arithmetic and related optimizations to improve hardware efficiency.
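The announcement does not spell out the training recipe at code level. As a generic illustration of one common mixed-precision technique, a PyTorch loop with bfloat16 autocast looks roughly like this; it is a sketch of the technique only, not Aleph Alpha's actual pipeline.

```python
# Generic mixed-precision training sketch in PyTorch (bfloat16 autocast).
# Illustrative only: Aleph Alpha's real model, data, and hyperparameters
# are not described at this level of detail in the article.
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(1024, 1024).to(device)             # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(10):                               # stand-in for the token budget
    x = torch.randn(8, 1024, device=device)
    target = torch.randn(8, 1024, device=device)

    # Forward pass under bf16 autocast: weights stay fp32, activations run in bf16.
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        loss = nn.functional.mse_loss(model(x), target)

    optimizer.zero_grad(set_to_none=True)
    loss.backward()                                  # gradients accumulate in fp32
    optimizer.step()
```

Unlike fp16, bfloat16 keeps fp32's exponent range, so this style of training typically needs no gradient loss scaling.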
These models were evaluated against similarly sized multilingual counterparts, including Mistral AI's Mistral-7B-Instruct-v0.3 and Meta's Llama-3.1-8B-Instruct. The evaluations, detailed in the model card, show how Pharia-1-LLM-7B-control and its aligned variant perform across multiple languages, highlighting areas where they match or exceed their peers.
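For a rough, qualitative version of such a comparison (not the benchmark protocol from the model card; the repository IDs are assumptions), one could run the same multilingual prompts through two models and inspect the outputs side by side:

```python
# Rough side-by-side sketch: identical multilingual prompts through two instruct
# models, loaded one at a time (each 7B model needs substantial GPU memory).
# This is NOT the evaluation protocol from the model card; repo ids are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load(model_id):
    tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    mdl = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16,
        device_map="auto", trust_remote_code=True,
    )
    return tok, mdl

prompts = {
    "de": "Fasse die Vorteile erneuerbarer Energien in einem Satz zusammen.",
    "fr": "Résume les avantages des énergies renouvelables en une phrase.",
    "es": "Resume las ventajas de las energías renovables en una frase.",
}

for model_id in ("Aleph-Alpha/Pharia-1-LLM-7B-control-aligned",
                 "mistralai/Mistral-7B-Instruct-v0.3"):
    tok, mdl = load(model_id)
    for lang, prompt in prompts.items():
        inputs = tok(prompt, return_tensors="pt").to(mdl.device)
        out = mdl.generate(**inputs, max_new_tokens=60, do_sample=False)
        answer = tok.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
        print(f"{model_id} [{lang}]: {answer}")
```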
Aleph Alpha has provided a detailed account of the model architecture, hyperparameters, and training processes in a blog post, offering insights into their latest advancements in AI technology.