Poland has launched a new open-source language model called Bielik, which was trained using Polish text data on the Helios and Athena supercomputers at AGH University in Krakow.
Polish Context and Expertise
Bielik is designed to outperform foreign language models in handling the Polish language and Polish cultural context. Developed by the SpeakLeash Foundation and Cyfronet AGH, this large language model (LLM) has 11 billion parameters, making it one of the most capable tools for generating Polish-language text.
Training and Capabilities
The model's training drew on the supercomputers' capacity for optimization, scaling, and synthetic data generation. The result is a robust model that ranks highly on the Polish OpenLLM Leaderboard.
Applications and Future Potential
Bielik's open-source nature allows its use in specialized fields such as law and medicine: because the model can be deployed locally, sensitive Polish data never has to leave the organization that processes it. This innovation strengthens Poland's position in AI and provides a domestic alternative that does not depend on international models.