LlamaIndex is an open-source data framework for connecting custom data sources to large language models. It provides tools for data ingestion, indexing, retrieval, and query engines, enabling developers to build retrieval-augmented generation (RAG) applications, agents, and LLM workflows over private data.
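To make the ingestion → indexing → retrieval → query-engine pipeline concrete, here is a minimal, self-contained sketch of those stages using only the standard library. This is an illustration of what the framework automates, not LlamaIndex code: the documents, the bag-of-words "embedding," and the function names (`embed`, `retrieve`, `build_prompt`) are all invented for the example.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A real pipeline would call an embedding model here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingestion: load raw documents (inline strings stand in for data connectors).
documents = [
    "LlamaIndex connects custom data sources to large language models.",
    "Vector stores hold embeddings for similarity search.",
    "Agents orchestrate multi-step LLM workflows.",
]

# Indexing: precompute one vector per document.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, top_k: int = 1) -> list[str]:
    # Retrieval: rank documents by similarity to the query vector.
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def build_prompt(query: str) -> str:
    # Query engine: stuff the retrieved context into an LLM prompt.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In a real application the final prompt would be sent to an LLM; the framework's value is wiring these stages together over production data sources and vector stores.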
- Best for: RAG applications. Data connectors, indexing strategies, query engines, and retrieval primitives are purpose-built for LLM data access.
- Learning curve: Python-first and well documented, but effective use assumes prior experience building LLM applications.
- Integrations: 160+ data connectors, 30+ LLM providers, and 40+ vector stores give broad ecosystem coverage.
- Pricing: fully open source and free; the LlamaCloud managed service offers hosted indexing and retrieval.
- Standout features: purpose-built for LLM applications, with advanced retrieval strategies (HyDE, reranking, FLARE), multi-modal support, and agentic workflows.
- Community: large and active, with excellent documentation, regular office hours, and a thriving Discord.
- Scalability: self-hosted deployments scale with your infrastructure; LlamaCloud provides managed scalability for production.
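Of the advanced retrieval strategies named above, HyDE (Hypothetical Document Embeddings) is the easiest to sketch: instead of embedding the user's question, an LLM first writes a hypothetical answer, and that answer is embedded and used for retrieval, so the search vector lives in "answer space" rather than "question space." The sketch below uses the same toy bag-of-words similarity idea and a stubbed LLM; `fake_llm`, the documents, and all names are invented for illustration and are not LlamaIndex APIs.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy term-frequency "embedding"; a real system uses an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "FLARE interleaves retrieval with generation.",
    "HyDE retrieves using a hypothetical answer document.",
]
index = [(doc, embed(doc)) for doc in documents]

def fake_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call (assumption: any chat model works here).
    return "A hypothetical answer document is generated and embedded for retrieval."

def hyde_retrieve(query: str, top_k: int = 1) -> list[str]:
    # HyDE: embed an LLM-written hypothetical answer instead of the raw query.
    hypothetical = fake_llm(f"Write a short passage answering: {query}")
    hv = embed(hypothetical)
    ranked = sorted(index, key=lambda item: cosine(hv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

The payoff is that a generated answer usually shares more vocabulary (and, with real embeddings, more semantics) with the target passage than the terse question does.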