Redis for AI and search
An overview of Redis for AI and search documentation, including vector search, AI agents, and the Context Engine (Redis Iris) managed services.
Redis stores and indexes vector embeddings that semantically represent unstructured data such as text passages, images, video, or audio. Store vectors and their associated metadata in hashes or JSON documents for indexing and querying.
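As a rough illustration of the two storage shapes mentioned above, the sketch below prepares the same embedding for a hash (as a binary FLOAT32 blob) and for a JSON document (as an array of numbers). The field names `content` and `embedding` are hypothetical, and little-endian byte order is an assumption that matches most platforms.

```python
import struct


def to_float32_blob(vector):
    """Pack a sequence of floats into little-endian FLOAT32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def hash_fields(text, vector):
    # Hash fields hold flat values, so the vector is stored as raw bytes.
    # "content" and "embedding" are hypothetical field names.
    return {"content": text, "embedding": to_float32_blob(vector)}


def json_document(text, vector):
    # JSON documents can hold the vector as a plain array of numbers.
    return {"content": text, "embedding": list(vector)}


doc = hash_fields("Redis is fast", [0.1, 0.2, 0.3, 0.4])
print(len(doc["embedding"]))  # 16 (4 dimensions x 4 bytes each)
```

With a client library, the first dictionary would typically be written with a hash command and the second with a JSON command; those calls are left out here so the sketch runs without a server.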
Redis Feature Form
Use Redis Feature Form to define, manage, and serve machine learning features on top of your existing data systems. The Feature Form docs cover the Python SDK workflow from provider registration through feature serving.
AI agents
AI agents are autonomous systems that combine LLMs with memory, tools, and planning to accomplish complex, multi-step tasks. Redis powers the core capabilities agents need: fast vector search, persistent memory, real-time data streaming, and structured access to business data.
- AI agent builder — Use the interactive code generator to create a working agent in your preferred language with your choice of LLM.
- How agents work — Learn the agent processing cycle, memory architecture, and why Redis is the foundation for production agents.
- Context Engine — The managed service suite that gives agents what they need: semantic caching, persistent memory, structured data access, and live data integration.
Context Engine services
The Context Engine (Redis Iris) includes four fully managed services available on Redis Cloud:
- LangCache — Semantic caching that reduces LLM API costs and improves response times by reusing cached responses for similar queries.
- Agent Memory — Two-tier persistent memory (session and long-term) for agents, available as a REST API and Python SDK.
- Context Retriever — Turns your business data into structured, governed tools that agents can reliably use, defined once and reused across all agents.
- Data Integration — Keeps your Redis Cloud database in sync with relational databases in near real time using Change Data Capture.
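To make the idea behind semantic caching concrete, here is a toy in-memory sketch. It is not the LangCache API: the class name `ToySemanticCache`, the 0.9 similarity threshold, and the hand-written embeddings are all assumptions chosen for illustration. It only shows the principle of reusing a stored response when a new query's embedding is close enough to a cached one.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToySemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cutoff; tune for real workloads
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    def get(self, query_embedding):
        # Linear scan for the most similar cached query.
        best_score, best_response = 0.0, None
        for embedding, response in self.entries:
            score = cosine_similarity(query_embedding, embedding)
            if score > best_score:
                best_score, best_response = score, response
        # Return the cached response only if it is similar enough.
        return best_response if best_score >= self.threshold else None


cache = ToySemanticCache()
cache.put([1.0, 0.0], "Paris is the capital of France.")
hit = cache.get([0.99, 0.05])   # near-duplicate query: served from cache
miss = cache.get([0.0, 1.0])    # unrelated query: falls through to the LLM
```

A managed service replaces the linear scan with an indexed vector search and handles embedding generation, but the hit-or-miss decision follows the same pattern.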
How-tos
- Create a vector index: Redis maintains a secondary index over your data with a defined schema (including vector fields and metadata). Redis supports FLAT and HNSW vector index types.
- Store and update vectors: Redis stores vectors and metadata in hashes or JSON objects.
- Search with vectors: Redis supports several advanced querying strategies with vector fields including k-nearest neighbor (KNN), vector range queries, and metadata filters.
- Configure vector queries at runtime: Select the best filter mode to optimize query execution.
- Build an AI agent: Use the interactive agent builder to generate complete working code for conversational assistants and recommendation engines.
- Add semantic caching: Reduce LLM API calls by caching and reusing responses for semantically similar queries.
- Add agent memory: Give your agent persistent session and long-term memory using the Agent Memory REST API.
- Access structured business data: Use Context Retriever to define your business data as governed tools that any agent can query reliably.
- Sync live data to Redis: Use Data Integration to keep your Redis Cloud database in sync with your primary relational database using Change Data Capture.
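The first three how-tos (create an index, store vectors, search) can be sketched as raw command argument lists. This is a minimal sketch under stated assumptions: the index name `idx:docs`, the key prefix `doc:`, the field names `genre` and `embedding`, and the 4-dimensional vectors are hypothetical, and `run` assumes a client object that exposes `execute_command`, as redis-py does.

```python
import struct

INDEX = "idx:docs"   # hypothetical index name
PREFIX = "doc:"      # hypothetical key prefix
DIM = 4              # hypothetical embedding size


def to_blob(vector):
    """Pack floats into little-endian FLOAT32 bytes for a hash field."""
    return struct.pack(f"<{len(vector)}f", *vector)


def create_index_args():
    # HNSW index over hashes; "6" is the number of attribute arguments that follow.
    return ["FT.CREATE", INDEX, "ON", "HASH", "PREFIX", "1", PREFIX,
            "SCHEMA", "genre", "TAG",
            "embedding", "VECTOR", "HNSW", "6",
            "TYPE", "FLOAT32", "DIM", str(DIM), "DISTANCE_METRIC", "COSINE"]


def store_args(doc_id, genre, vector):
    # Vector and metadata live together in one hash.
    return ["HSET", f"{PREFIX}{doc_id}", "genre", genre, "embedding", to_blob(vector)]


def knn_args(vector, k=3, genre=None):
    # Optional metadata filter, then k-nearest-neighbor search on the vector field.
    prefilter = f"@genre:{{{genre}}}" if genre else "*"
    query = f"({prefilter})=>[KNN {k} @embedding $vec AS score]"
    return ["FT.SEARCH", INDEX, query, "PARAMS", "2", "vec", to_blob(vector),
            "SORTBY", "score", "DIALECT", "2"]


def range_args(vector, radius):
    # Vector range query: everything within `radius` of the query vector.
    query = f"@embedding:[VECTOR_RANGE {radius} $vec]"
    return ["FT.SEARCH", INDEX, query, "PARAMS", "2", "vec", to_blob(vector),
            "DIALECT", "2"]


def run(client, args):
    """Send one command through a client that exposes execute_command."""
    return client.execute_command(*args)
```

Building the argument lists separately from sending them keeps the sketch runnable without a server; in practice a higher-level client API or RedisVL would usually construct these commands for you.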
Learn how to index and query vector embeddings
Concepts
Learn to perform vector search, build AI agents, and use semantic caching and memory in your AI/ML projects.
Quickstarts
Quickstarts or recipes are useful when you are trying to build specific functionality. For example, you might want to do RAG with LangChain or set up LLM memory for your AI agent.
Get started with these foundational guides:
RAG
Retrieval-augmented generation (RAG) is a technique that improves an LLM's ability to respond to user queries. The retrieval step is backed by a vector database, which returns results that are semantically relevant to the user's query. These results serve as contextual information that augments the generative capabilities of the LLM.
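The retrieve-then-augment flow can be sketched in a few lines. This is an illustrative toy, not a reference implementation: the in-memory `corpus`, the hand-written two-dimensional embeddings, and the prompt wording are all assumptions, and a real application would use a vector index for retrieval and an embedding model to produce the vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, corpus, k=2):
    """Return the text of the k passages most similar to the query vector."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item["embedding"]),
                    reverse=True)
    return [item["text"] for item in ranked[:k]]


def build_prompt(question, passages):
    """Augment the user's question with the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")


# Hypothetical corpus with toy 2-dimensional embeddings.
corpus = [
    {"text": "Redis supports HNSW vector indexes.", "embedding": [1.0, 0.0]},
    {"text": "Bananas are rich in potassium.", "embedding": [0.0, 1.0]},
    {"text": "Redis supports FLAT vector indexes.", "embedding": [0.9, 0.1]},
]
passages = retrieve([1.0, 0.0], corpus, k=2)
prompt = build_prompt("Which vector index types does Redis support?", passages)
```

The resulting `prompt` string is what would be sent to the LLM; the model call itself is omitted because it depends on the provider.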
Explore our AI notebooks collection for comprehensive RAG examples including:
- RAG implementations with RedisVL, LangChain, and LlamaIndex
- Advanced RAG techniques and optimizations
- RAG evaluation with the RAGAS framework
- Integration with cloud platforms like Azure and Vertex AI
Agents
AI agents can act autonomously to plan and execute tasks for the user.
- Build your first AI agent — Use the interactive agent builder to generate production-ready agent code.
- How agents work — Learn the agent processing cycle, memory architecture, and Redis data structures for agents.
- Redis Notebooks for LangGraph — End-to-end agent examples using LangGraph and Redis.
Context Engine
The Context Engine provides managed services for agent memory and data access.
- Get started with LangCache — Add semantic caching to reduce LLM costs in minutes.
- Get started with Agent Memory — Add persistent two-tier memory to any agent using the REST API.
- Get started with Context Retriever — Expose your business data as governed tools that agents can reliably query.
- Get started with Data Integration — Keep Redis in sync with your primary database so agents always have fresh data.
Tutorials
Need a deeper dive into different use cases and topics?
Agents
- Agentic RAG - A tutorial focused on agentic RAG with LlamaIndex and Amazon Bedrock
- Redis Notebooks for LangGraph - Working with LangGraph agents and Redis memory
- Build a LangGraph travel agent with Redis Agent Memory - Build a LangGraph agent with short-term session memory and long-term persistent memory using Redis Agent Memory
- Build a real-time AI agent with Redis Iris - Combine Redis Agent Memory and Context Retriever to build a wealth advisor agent with persistent memory and structured data access
- Build a car dealership agent with Google ADK and Redis Agent Memory - Build a persistent AI agent using Google ADK and Redis Agent Memory Server with working and long-term memory
- Build Google ADK agents with persistent, real-time memory on Redis - Use the adk-redis package to integrate Google ADK with Redis for persistent memory, sessions, and semantic caching in production agents
Context Engine
- Semantic caching with Redis LangCache - Build a FastAPI app with semantic caching using LangCache to reduce LLM costs and improve response times
RAG
- RAG on Vertex AI - A RAG tutorial featuring Redis with Vertex AI
- RAG workbench - A development playground for exploring RAG techniques with Redis
- ArXiv Chat - Streamlit demo of RAG over ArXiv documents with Redis & OpenAI
Recommendations and search
- Recommendation systems w/ NVIDIA Merlin & Redis - Three examples of increasing complexity that show how to build a real-time recommendation system with NVIDIA and Redis
- Redis product search - Build a real-time product search engine using features like full-text search, vector similarity, and real-time data updates
- ArXiv Search - Full-stack implementation of Redis with a React front end
Vector sets
- Getting started with vector sets - Learn the fundamentals of Redis vector sets for similarity search using the VADD and VSIM commands
- Face similarity search with Redis vector sets - Build a celebrity lookalike app using Redis vector sets and a Vision Transformer model for face embedding and similarity search
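For a feel of the two vector set commands named above, the sketch below builds `VADD` and `VSIM` argument lists using the `VALUES` form, which passes the vector as plain numbers. The key name `faces`, the element names, and the toy vectors are hypothetical, and `run` assumes a client that exposes `execute_command` (as redis-py does) connected to a Redis version that includes vector sets.

```python
def vadd_args(key, element, vector):
    """Arguments to add `element` with its vector to the vector set at `key`."""
    return ["VADD", key, "VALUES", str(len(vector)),
            *[str(x) for x in vector], element]


def vsim_args(key, vector, count=3):
    """Arguments to fetch the `count` elements most similar to `vector`, with scores."""
    return ["VSIM", key, "VALUES", str(len(vector)),
            *[str(x) for x in vector], "WITHSCORES", "COUNT", str(count)]


def run(client, args):
    """Send one command through a client that exposes execute_command."""
    return client.execute_command(*args)


# Hypothetical key and element names with toy 3-dimensional vectors.
add_cmd = vadd_args("faces", "alice", [0.1, 0.2, 0.3])
query_cmd = vsim_args("faces", [0.1, 0.2, 0.25], count=2)
```

Both commands also accept a binary `FP32` form for the vector, which is more compact for high-dimensional embeddings; the `VALUES` form is used here only because it is easier to read.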
Ecosystem integrations
Explore our comprehensive ecosystem integrations page to discover how Redis works with popular AI frameworks, platforms, and tools including:
- LangGraph, LangChain, and LlamaIndex for building advanced AI applications
- Amazon Bedrock and NVIDIA NIM for enhanced AI infrastructure
- Microsoft Semantic Kernel and Kernel Memory for LLM applications
- And many more integrations to power your AI solutions
Video tutorials
Watch our AI video collection featuring practical tutorials and demonstrations on:
- Building RAG applications and implementing vector search
- Working with LangGraph for AI agents with memory
- Semantic caching and search techniques
- Redis integrations with popular AI frameworks
- Real-world AI application examples and best practices
Benchmarks
See how we stack up against the competition.
Best practices
See how leaders in the industry are building their AI apps.
Agents and architecture
- AI Agent vs Chatbot: Key Differences Explained — Understand the architectural differences between chatbots and agents and when to use each based on task complexity, cost, and latency.
- Agentic AI Architecture: 5 Patterns Explained — Learn five production agentic patterns and the data layer requirements needed to support them.
- AI Agents vs Workflows: When to Use Each — Understand the distinction between deterministic workflows and autonomous agents and how to combine them in production.
- How agents work — Agent memory patterns, data structure selection, and production deployment considerations.
Memory and context
- Context Engineering for AI: What It Is & How to Build It — Learn the discipline of designing what an LLM receives at inference time, including the four core operations and how Redis provides the infrastructure.
- Long-Term Memory Architectures for AI Agents — Design persistent memory systems that retain information across sessions, with guidance on memory types and design tradeoffs.
- Context Pruning: Cut LLM Tokens Without Losing Quality — Selectively remove low-value tokens from LLM input to reduce costs and improve quality, with benchmarks and failure modes.
Performance
- What is semantic caching — When and how to apply semantic caching in your AI applications.
- Streaming LLM Responses: Make Your AI App Feel Fast — Deliver tokens incrementally via Server-Sent Events and combine streaming with caching and context optimization in production.
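Incremental delivery over Server-Sent Events comes down to framing each token as a `data:` line followed by a blank line. The sketch below shows only that framing; the `done` event name and the `[DONE]` sentinel are assumptions borrowed from common practice, not something the linked article specifies, and the token list stands in for a real model stream.

```python
def sse_event(data, event=None):
    """Format one Server-Sent Events frame; multi-line data gets one data: line each."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def stream_tokens(tokens):
    """Yield one SSE frame per token, then a final frame that signals completion."""
    for token in tokens:
        yield sse_event(token)
    # Hypothetical end-of-stream marker so the client knows when to stop reading.
    yield sse_event("[DONE]", event="done")


frames = list(stream_tokens(["Hello", " world"]))
```

A web framework would return this generator as a response with the `text/event-stream` content type, so the browser can render each token as it arrives.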