Silo’s Poro Multilingual Model Challenges Big Tech In European AI

Generative AI is increasingly dominated by Silicon Valley, with tech giants like Microsoft, Google, Amazon, and Meta leading the charge, but a challenger is emerging from Europe. Today, Finnish company Silo introduces Poro, a multilingual language model that hints at a promising future for European AI, especially beyond the English language. In a field where Big Tech’s advantage rests on computational power, Silo’s approach, which leverages multilingual data and EU-funded initiatives, suggests that Europe may have a competitive edge of its own in generative AI.

Data, the European Advantage: While Big Tech’s AI dominance often hinges on massive hardware resources, Silo emphasizes the importance of abundant language data, an area where Europe could hold a distinct advantage. Poro, a proof-of-concept model trained on Finnish and English text, challenges the status quo by demonstrating competitiveness with Meta’s open-source Llama models. Silo built it in collaboration with the University of Turku and with access to the High Performance Language Technologies (HPLT) project, which holds 7 petabytes of language data across 80 languages.

Cross-Training for Multilingual Mastery: To address the limited data available for languages like Finnish, Silo employs a cross-training approach. By exposing Poro to both English and Finnish data, the model learns the relationships between the two languages. As a result, Poro can generate responses in Finnish while drawing on what it learned from English training data. Silo has committed to open-sourcing its cross-training techniques, with the aim of enabling models for all European languages, even those for which data is scarce.
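To make the idea of training on two languages at once more concrete, here is a minimal sketch of one common way to interleave a large corpus with a small one: sampling each training example from one language or the other with a fixed probability. This is an illustration only, not Silo’s published method. The function name `mixed_stream`, the `finnish_weight` parameter, and the sample sentences are all hypothetical.

```python
import random
from typing import Iterator, Sequence


def mixed_stream(
    english: Sequence[str],
    finnish: Sequence[str],
    finnish_weight: float,
    n_samples: int,
    seed: int = 0,
) -> Iterator[tuple[str, str]]:
    """Yield (language, text) pairs, choosing Finnish with probability finnish_weight.

    Hypothetical helper: the mixing ratio is an assumption for illustration,
    not a value reported for Poro.
    """
    if not 0.0 <= finnish_weight <= 1.0:
        raise ValueError("finnish_weight must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the stream is reproducible
    for _ in range(n_samples):
        if rng.random() < finnish_weight:
            yield "fi", rng.choice(finnish)
        else:
            yield "en", rng.choice(english)


# Toy corpora standing in for the real training data.
english_docs = ["The weather is nice today.", "Supercomputers train large models."]
finnish_docs = ["Sää on tänään kaunis.", "Supertietokoneet kouluttavat suuria malleja."]

batch = list(mixed_stream(english_docs, finnish_docs, finnish_weight=0.3, n_samples=8))
for lang, text in batch:
    print(lang, text)
```

Raising `finnish_weight` above the smaller language’s natural share of the data is one way a training pipeline could keep that language from being drowned out by English. Whether and how Poro’s training does this is not stated here.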

Sovereignty and Supercomputing: Peter Sarlin, Silo’s co-founder and CEO, emphasizes the need for Europe to close the gap in the market for large language models (LLMs) in non-English languages. He argues that European businesses should avoid heavy reliance on Big Tech-owned technology so that value creation stays within the continent. Poro’s training on LUMI, an EU-funded supercomputer equipped with AMD chips, signals a shift towards using European resources. Sarlin also highlights Silo’s intention to open-source the software it developed for AI training on LUMI, so that other companies in the region can build on it.

Looking Ahead: Silo’s Poro stands as a testament to the potential for European AI to make significant strides, particularly in languages where data scarcity poses a challenge for Big Tech. As Silo plans to extend its model training across all European languages, leveraging collaborative efforts and supercomputing resources, the landscape of generative AI in Europe may see a transformative shift. This development not only challenges Big Tech dominance but also emphasizes the importance of sovereignty and collaboration in shaping the future of AI on the continent.

Conclusion: With Poro’s early success and Silo’s commitment to advancing European AI capabilities, the continent holds a promising position in the evolving AI landscape. As data accessibility and multilingual approaches become key factors, Silo’s journey with Poro underscores the potential for a European renaissance in generative AI. The collaboration between industry and academia, coupled with strategic use of supercomputing resources, hints at a future where European innovation plays a significant role in shaping the trajectory of AI development.