Introduction
This guide is for developers who want to build with our models and datasets. It covers practical resources for integrating our models into applications, from quick prototyping to production deployment: setup instructions, hosting options, integration examples, and optimization techniques.
Our technology
This guide covers all of our flagship models, including OLMo 2, the best fully open language model to date; Molmo, our family of open, state-of-the-art multimodal models; Tülu, a leading instruction-following model family that showcases state-of-the-art open post-training techniques; and others.
All of our models are built on our fully open datasets. These include Dolma, our large-scale, open-source English corpus of 3 trillion tokens across more than 4 billion documents, and PixMo, a comprehensive multimodal dataset designed to enhance vision-language understanding. This guide will help you find the right dataset for your project.
Truly open
Ai2's mission is to advance AI through true openness. We go beyond releasing model weights: we also provide our training code, training data, and recipes. We hope that by sharing our full model pipelines freely and openly, we’re empowering the community to build the next generation of AI tools and applications. Learn more about our approach.